• Title/Summary/Keyword: stream partitioning (스트림 분할)

Search Result 97, Processing Time 0.031 seconds

Multi-DNN Acceleration Techniques for Embedded Systems with Tucker Decomposition and Hidden-layer-based Parallel Processing (터커 분해 및 은닉층 병렬처리를 통한 임베디드 시스템의 다중 DNN 가속화 기법)

  • Kim, Ji-Min;Kim, In-Mo;Kim, Myung-Sun
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.6 / pp.842-849 / 2022
  • With advances in deep learning, DNNs are increasingly used in embedded systems such as unmanned vehicles, drones, and robots. An autonomous driving system in particular must run several highly accurate, computationally heavy DNNs at the same time. However, running multiple DNNs simultaneously on an embedded system with relatively low performance increases inference time. The system may then malfunction because the action that depends on an inference result is not performed in time. To solve this problem, the proposed solution first reduces computation by applying Tucker decomposition to the DNN models with a large amount of computation, and then runs the DNN models in parallel as much as possible on the GPU, at the granularity of hidden layers. Experimental results show that DNN inference time decreases by up to 75.6% compared with the case before applying the proposed technique.
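The computation saving from Tucker decomposition can be illustrated with a rough operation count. The sketch below assumes the common Tucker-2 form for a convolution layer (a 1x1 convolution, a smaller d x d core convolution, and another 1x1 convolution). The abstract does not state which variant or ranks the authors use, so the function names, layer sizes, and ranks here are hypothetical.

```python
def conv_macs(in_ch, out_ch, kernel):
    """Multiply-accumulates per output pixel for a plain kernel x kernel convolution."""
    return kernel * kernel * in_ch * out_ch


def tucker2_macs(in_ch, out_ch, kernel, rank_in, rank_out):
    """Multiply-accumulates per output pixel after an assumed Tucker-2 split:
    1x1 conv (in_ch -> rank_in), kernel x kernel conv (rank_in -> rank_out),
    1x1 conv (rank_out -> out_ch)."""
    return (in_ch * rank_in
            + kernel * kernel * rank_in * rank_out
            + rank_out * out_ch)


# Hypothetical layer: 256 -> 256 channels, 3x3 kernel, both ranks set to 64.
original = conv_macs(256, 256, 3)
decomposed = tucker2_macs(256, 256, 3, 64, 64)
reduction = 1 - decomposed / original  # fraction of multiply-accumulates removed
```

With these assumed numbers the decomposed layer needs 69,632 multiply-accumulates per output pixel instead of 589,824. The actual saving depends on the ranks chosen, which also affect accuracy.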

Multi-query Indexing Technique for Efficient Query Processing on Stream Data in Sensor Networks (센서 네트워크에서 스트림 데이터 질의의 효율적인 처리를 위한 다중 질의 색인 기법)

  • Lee, Min-Soo;Kim, Yearn-Jeong;Yoon, Hye-Jung
    • Journal of Korea Multimedia Society / v.10 no.11 / pp.1367-1383 / 2007
  • A sensor network consists of sensors that can perform computation and communicate with each other wirelessly. Two important characteristics of sensor networks are that the network should be self-administered and that power efficiency must be carefully considered because the sensors run on battery power. When large amounts of various stream data are produced and multiple queries need to be processed simultaneously, power efficiency should be maximized. This work proposes a technique that builds an index on multiple monitoring queries so that multi-query processing performance increases and memory and power are used efficiently. The proposed SMILE tree modifies and combines ideas from spatial indexing techniques such as k-d trees and R+-trees. The k-d tree can divide the dimensions at each level, while the R+-tree improves on the R-tree by dividing the space hierarchically and reducing overlapping areas. When the SMILE tree is applied to multiple queries on stream data in sensor networks, finding an indexed query takes in some cases 50% of the time a linear search takes.


An Adaptive Intra Coding Technique Using 1-D and 2-D Integer Transforms (1차원 및 2차원 정수 변환을 이용한 적응적 화면내 코딩 기법)

  • Park, Min-Cheol;Kim, Dong-Won;Moon, Joo-Hee
    • Journal of the Institute of Electronics Engineers of Korea SP / v.46 no.5 / pp.66-79 / 2009
  • In this paper, we propose a new adaptive intra coding technique that uses 1-D and 2-D integer transforms to improve the coding efficiency of H.264/AVC. The proposed technique selects the most effective transform and prediction mode for each block after applying the 1-D and 2-D transforms for all prediction modes. When the 1-D transform is used, a $4 \times 4$ block is divided into four $1 \times 4$ or $4 \times 1$ subblocks, and each subblock is predicted from the nearest decoded subblock in the prediction direction, with the prediction then subtracted from it. After each prediction-error subblock is processed by the 1-D transform and quantization, the four subblocks are merged back into the original $4 \times 4$ block and reordered into a 1-D signal by a DC-biased zigzag scanning pattern chosen according to the prediction mode. Finally, the coding efficiency of the bitstreams based on the 1-D transform and the conventional 2-D transform is compared, the prediction mode and quantized coefficients for each block are decided, and the corresponding quantized coefficients are transmitted. Experimental results show that the proposed adaptive technique increases BD-PSNR by 0.34 dB and decreases BD-Bitrate by 4.03% on average compared with H.264/AVC.
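The 1-D transform step can be pictured with a small sketch. It assumes the 1-D transform is the H.264/AVC 4-point integer core transform applied to a single $1 \times 4$ prediction-error subblock. The abstract does not give the exact matrix, so this choice and the function name are assumptions.

```python
# H.264/AVC 4-point forward integer core transform matrix (assumed here
# to serve as the 1-D transform; the abstract does not specify it).
CORE = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def transform_1d(residual):
    """Apply the 4-point integer transform to a 1x4 prediction-error subblock."""
    if len(residual) != 4:
        raise ValueError("expected a 1x4 subblock")
    return [sum(c * r for c, r in zip(row, residual)) for row in CORE]


flat = transform_1d([1, 1, 1, 1])   # constant residual: only the DC term is nonzero
ramp = transform_1d([4, 3, 2, 1])   # sloped residual: energy in the low frequencies
```

A constant residual yields `[4, 0, 0, 0]`, which is why a DC-biased scan order places the first coefficient of each subblock early in the reordered signal.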

Evaluation of Video Codec AI-based Multiple tasks (인공지능 기반 멀티태스크를 위한 비디오 코덱의 성능평가 방법)

  • Kim, Shin;Lee, Yegi;Yoon, Kyoungro;Choo, Hyon-Gon;Lim, Hanshin;Seo, Jeongil
    • Journal of Broadcast Engineering / v.27 no.3 / pp.273-282 / 2022
  • MPEG-VCM (Video Coding for Machines) aims to standardize a video codec for machines. VCM provides data sets and anchors, which serve as reference data for comparison, for several machine vision tasks including object detection, object segmentation, and object tracking. The evaluation template can be used to compare compression and machine vision task performance between the anchor data and various proposed video codecs. However, performance comparison is carried out separately for each machine vision task, and information on the performance of multiple machine vision tasks on a single bitstream is not currently provided. In this paper, we propose a performance evaluation method for a video codec for AI-based multi-tasks. Based on bits per pixel (BPP), which measures the size of a single bitstream, and mean average precision (mAP), which measures the accuracy of each task, we define three criteria for multi-task performance evaluation (arithmetic average, weighted average, and harmonic average) and calculate the multi-task performance results from the mAP values. In addition, because the dynamic range of mAP may differ greatly from task to task, multi-task performance results are calculated and evaluated on the normalized mAP to avoid problems caused by the dynamic range.
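The three averaging criteria can be sketched as follows. The abstract does not say how mAP is normalized or how weights are chosen, so the min-max normalization, the function names, and the sample values below are assumptions for illustration only.

```python
def normalize_map(value, lowest, highest):
    """Min-max normalize a task's mAP to [0, 1] (assumed normalization)."""
    return (value - lowest) / (highest - lowest)


def arithmetic_average(scores):
    return sum(scores) / len(scores)


def weighted_average(scores, weights):
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


def harmonic_average(scores):
    """Harmonic mean; requires every score to be strictly positive."""
    return len(scores) / sum(1.0 / s for s in scores)


# Hypothetical normalized mAP values for two tasks on one bitstream.
scores = [0.5, 1.0]
arith = arithmetic_average(scores)            # 0.75
weighted = weighted_average(scores, [1, 3])   # 0.875
harmonic = harmonic_average(scores)           # 2/3
```

The harmonic average is pulled toward the weaker task, so it penalizes a codec that serves one task well at the expense of another more strongly than the arithmetic average does.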

A Segmented Leap-Ahead LFSR Pseudo-Random Number Generator (분할 구조를 갖는 Leap-Ahead 선형 궤환 쉬프트 레지스터 의사 난수 발생기)

  • Park, Young-Kyu;Kim, Sang-Choon;Lee, Je-Hoon
    • Journal of the Korea Institute of Information Security & Cryptology / v.24 no.1 / pp.51-58 / 2014
  • An LFSR is commonly used in various stream cryptography applications to generate random numbers. The Leap-ahead LFSR was introduced to generate a multi-bit random number per cycle. It requires only a single LFSR, which is an advantage in hardware complexity. However, it suffers from a significant reduction in the maximum period of the generated random numbers. This paper presents a new segmented Leap-ahead LFSR to solve this problem. It consists of two segmented LFSRs. We prove the efficiency of the proposed segmented architecture through precise mathematical analysis. We also compare the proposed architecture with its counterparts on a Xilinx Virtex-5 FPGA. The proposed architecture increases the maximum period of the generated random numbers by a factor of 2.5 compared with the typical Leap-ahead architecture.
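The period reduction of a leap-ahead LFSR can be reproduced with a toy example. The sketch below uses a 4-bit Fibonacci LFSR with feedback polynomial $x^4 + x^3 + 1$ and simulates a leap of m steps per cycle by stepping m times in software. The register size, polynomial, and function names are illustrative assumptions. It does not implement the segmented architecture proposed in the paper.

```python
def lfsr_step(state):
    """One shift of a 4-bit Fibonacci LFSR (polynomial x^4 + x^3 + 1)."""
    feedback = (state ^ (state >> 1)) & 1
    return (state >> 1) | (feedback << 3)


def leap(state, m):
    """Advance m shifts at once, as a leap-ahead LFSR does in one cycle."""
    for _ in range(m):
        state = lfsr_step(state)
    return state


def period(m, seed=1):
    """Number of cycles before the leap-ahead state sequence repeats."""
    state, count = leap(seed, m), 1
    while state != seed:
        state = leap(state, m)
        count += 1
    return count


periods = {m: period(m) for m in (1, 2, 3)}
```

Here the plain LFSR (m = 1) has period 15, and leaping 2 bits per cycle keeps period 15, but leaping 3 bits per cycle drops the period to 5 because 3 divides 15.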

Composite Stock Cutting using Distributed Simulated Annealing (분산 시뮬레이티드 어닐링을 이용한 복합 재료 재단)

  • Hong, Chul-Eui
    • Journal of KIISE:Software and Applications / v.29 no.1_2 / pp.20-29 / 2002
  • The composite stock cutting problem is to allocate rectangular and/or irregular patterns onto a large composite stock sheet of finite dimensions so that the resulting scrap is minimized. In this paper, distributed simulated annealing with a new cost-error-tolerant spatial decomposition is applied to the composite stock cutting problem in MPI environments. The cost-error-tolerant scheme relaxes synchronization and chooses small perturbations on states asynchronously within a dynamically changing stream length to preserve the convergence property of sequential annealing. This paper proposes efficient data structures for representing patterns and their affinity relations, and also shows how to determine move generation, annealing parameters, and a cost function. The spatial decomposition method is described in detail. The results show that final solution quality is not degraded while almost linear speedup is achieved. Composite stock shapes are not constrained to convex polygons or even regular shapes, but only 2 or 4 rotations are allowed because of the composite nature of the material.

A Study on the Data Compression Algorithm for Just-in-Time Rendering of Concentric Mosaic (동심원 모자이크의 실시간 표현을 위한 데이터 압축 알고리즘에 관한 연구)

  • Jee, Inn-Ho;Ahn, Hong-Yeoung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.10 no.1 / pp.91-96 / 2010
  • Concentric mosaics are made by arranging and summing video frames using common spatial standards. Compared with previous work on 3-D wavelet transform coding, we have made important design considerations to enable flexible partial decoding and bitstream random access. A just-in-time (JIT) rendering engine for the compressed concentric mosaic is developed. However, real-time rendering is still computationally demanding. Because the data are kept compressed, only the contents needed to represent a specific scene have to be decoded. Thus, our proposed algorithm is able to render a real concentric mosaic by using the lifting scheme instead of the wavelet transform.
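As a rough illustration of the lifting scheme, the sketch below implements one level of an integer Haar transform with a predict step and an update step, plus its exact inverse. The abstract does not say which wavelet filter the authors lift, so the Haar choice and the function names are assumptions.

```python
def haar_lift_forward(samples):
    """One lifting level on an even-length list of integers."""
    evens, odds = samples[0::2], samples[1::2]
    # Predict: each odd sample is predicted by its even neighbor.
    details = [o - e for e, o in zip(evens, odds)]
    # Update: adjust the even samples so they carry the local average.
    approx = [e + (d >> 1) for e, d in zip(evens, details)]
    return approx, details


def haar_lift_inverse(approx, details):
    """Undo the update and predict steps in reverse order."""
    evens = [s - (d >> 1) for s, d in zip(approx, details)]
    odds = [d + e for d, e in zip(details, evens)]
    samples = []
    for e, o in zip(evens, odds):
        samples += [e, o]
    return samples


signal = [5, 7, 2, 9, 4, 4]
approx, details = haar_lift_forward(signal)
restored = haar_lift_inverse(approx, details)
```

Each lifting step is undone by the same arithmetic with the sign flipped, so reconstruction is exact in integer arithmetic and works in place, which suits partial decoding of only the data a view needs.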

The Study of the Equation $(x+1)^d=x^d+1$ over Finite Fields (유한체위에서 방정식 $(x+1)^d=x^d+1$에 대한 연구)

  • Cho, Song-Jin;Kim, Han-Doo;Choi, Un-Sook;Kwon, Sook-Hee;Kwon, Min-Jeong;Kim, Jin-Gyoung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2012.05a / pp.237-240 / 2012
  • Binary sequences of period $N=2^k-1$ are widely used in many areas of engineering and science. Well-known applications include code-division multiple-access (CDMA) communications and stream cipher systems. In this paper, we analyze the equation $(x+1)^d=x^d+1$ over finite fields. The exponent $d$ in the equation is used to analyze the cross-correlation of binary sequences.
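The equation can be explored by brute force in a small field. The sketch below enumerates the solutions of $(x+1)^d = x^d + 1$ in $GF(2^4)$, built with the irreducible polynomial $x^4 + x + 1$. The field size, the polynomial, and the function names are assumptions chosen for illustration; the paper's own analysis is not reproduced here.

```python
def gf_mul(a, b, k=4, poly=0b10011):
    """Multiply two elements of GF(2^k) represented as bit vectors."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> k:          # degree reached k: reduce by the field polynomial
            a ^= poly
    return result


def gf_pow(a, d, k=4, poly=0b10011):
    """Raise a to the power d by square-and-multiply."""
    result = 1
    while d:
        if d & 1:
            result = gf_mul(result, a, k, poly)
        a = gf_mul(a, a, k, poly)
        d >>= 1
    return result


def solutions(d, k=4, poly=0b10011):
    """All x in GF(2^k) with (x+1)^d == x^d + 1 (addition is XOR)."""
    return [x for x in range(1 << k)
            if gf_pow(x ^ 1, d, k, poly) == gf_pow(x, d, k, poly) ^ 1]


all_for_power_of_two = len(solutions(4))   # every element solves it when d = 2^i
only_trivial = solutions(3)                # for d = 3 only x = 0 and x = 1
```

For d = 3 the equation reduces to $x^2 + x = 0$, so only 0 and 1 are solutions, while any d that is a power of 2 makes every field element a solution because squaring is additive in characteristic 2.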


An efficient compression method of metadata using BiM (BiM을 이용한 메타데이터의 효율적인 부호화 방법)

  • 양승준;남제호;김영태;강경옥
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2001.11b / pp.199-202 / 2001
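  • ISO/IEC 15938-1 (MPEG-7 Systems) provides BiM (binary format for MPEG-7), a binary representation for the efficient transmission and storage of metadata about multimedia content. The textual representation of metadata that describes multimedia content generally requires a large amount of storage capacity and transmission resources, so conversion to a binary format is required for efficient compression. The text format also has the drawback of being unsuitable for streaming transmission, as in broadcast environments. BiM enables streaming transmission by supporting encoding of a content description either as a whole or split into two or more access units (AUs). This structure is packet-based, with headers expressed in binary format, and provides a flexible transmission order. It also has the advantage of providing random access without parsing the entire bitstream. Because of these advantages, the TV-Anytime Forum, an international private standardization body that is standardizing, mainly within the broadcasting industry, technology for using metadata in broadcasting, is considering BiM as one method that satisfies its requirements for compressing metadata about broadcast content. This paper introduces BiM of the MPEG-7 Systems part and describes an experiment that encodes TV-Anytime Forum metadata into the binary format using BiM, together with its results.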


Improvement in Performance of ATM Network Interface Card and Performance Evaluation (ATM 망 접속 장치의 성능 향상 방법과 성능 평가)

  • Kim, Cheul-Young;Lee, Seung-Ha;Na, Yun-Joo;Nam, Ji-Seung
    • Proceedings of the Korea Information Processing Society Conference / 2001.10b / pp.1383-1386 / 2001
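  • With the rapid increase in Internet users and the spread of broadband network (B-ISDN) deployment, large demand for ATM (Asynchronous Transfer Mode) network interface devices is expected, and improved performance of these devices is also required. Previous studies have developed efficient page replacement algorithms for virtual memory and cache handling schemes that exploit the locality of memory references in computer programs. In designing an ATM protocol processor, this paper applies a cache memory structure that takes the locality of reference of network traffic into account, enabling improved ATM cell reception. Caching the virtual path identifier/virtual channel identifier (VPI/VCI) of ATM cells reduces the lookup time of the related tables during packet segmentation and reassembly. To evaluate the performance improvement from the cache memory, the time cost (in system clock cycles) of reading and writing cell reception information among the ATM NIC processor, the internal cache memory, and the external SRAM is classified according to cache hits and misses. Three kinds of ATM cell streams are then applied to a simulator built on this cost model, and three metrics are measured and compared for each: average cell processing time, data bus traffic ratio, and hit rate.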
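The hit-rate metric for a VPI/VCI cache can be sketched with a tiny simulator. The abstract does not describe the cache organization or replacement policy, so the least-recently-used policy, the cache capacity, the function name, and the sample cell stream below are all assumptions.

```python
from collections import OrderedDict


def hit_rate(cell_stream, capacity):
    """Fraction of cells whose (VPI, VCI) key is already cached.

    Uses an LRU replacement policy (assumed; not taken from the paper).
    """
    cache = OrderedDict()
    hits = 0
    for vpi_vci in cell_stream:
        if vpi_vci in cache:
            hits += 1
            cache.move_to_end(vpi_vci)       # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)    # evict the least recently used
            cache[vpi_vci] = True
    return hits / len(cell_stream)


# Hypothetical streams: one connection sending a burst, then interleaved traffic.
bursty = [(0, 32)] * 4
mixed = [(0, 32), (0, 32), (0, 33), (0, 32), (0, 34), (0, 33)]
bursty_rate = hit_rate(bursty, capacity=2)
mixed_rate = hit_rate(mixed, capacity=2)
```

The bursty stream hits on three of four cells, while the interleaved stream hits on only two of six, which shows how the benefit of caching connection identifiers depends on the locality of the traffic.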
