• Title/Summary/Keyword: parallel compression algorithm (병렬 압축 알고리즘)

Search Result 54, Processing Time 0.027 seconds

Motion Search Region Prediction using Neural Network Vector Quantization (신경 회로망 벡터 양자화를 이용한 움직임 탐색 영역의 예측)

  • Ryu, Dae-Hyun;Kim, Jae-Chang
    • Journal of the Korean Institute of Telematics and Electronics B / v.33B no.1 / pp.161-169 / 1996
  • This paper presents a new search region prediction method that uses vector quantization (VQ) for motion estimation. We first find motion vectors from two successive frames using the full-search block matching algorithm (BMA). The motion vectors are then used to train a codebook, and the trained codebook serves as the predicted search region. We used an unsupervised neural network for VQ encoding and codebook design. A major advantage of formulating VQ as a neural network is that the large number of adaptive training algorithms used for neural networks can be applied to VQ. Because it uses fewer search points, the proposed method reduces both the computation and the number of bits required to represent the motion vectors. Computer simulation results show an increased PSNR compared with other block matching algorithms.
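The idea of restricting the search to a codebook of likely motion vectors can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the SAD cost, and the example codebook are assumptions made for illustration, and the neural-network VQ training step is not shown. The same matcher runs either over a full search window or over a short candidate list.

```python
def sad(prev, cur, x, y, dx, dy, bsize):
    """Sum of absolute differences between the block at (x, y) in `cur`
    and the block displaced by (dx, dy) in `prev`."""
    total = 0
    for j in range(bsize):
        for i in range(bsize):
            total += abs(cur[y + j][x + i] - prev[y + dy + j][x + dx + i])
    return total


def full_search_window(radius):
    """All displacements in a (2*radius + 1)^2 window (full-search BMA)."""
    return [(dx, dy)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)]


def block_match(prev, cur, x, y, bsize, candidates):
    """Return (best_vector, best_cost) over the given candidate vectors.
    Candidates that would read outside the frame are skipped."""
    height, width = len(prev), len(prev[0])
    best, best_cost = None, None
    for dx, dy in candidates:
        inside = (0 <= x + dx and x + dx + bsize <= width and
                  0 <= y + dy and y + dy + bsize <= height)
        if not inside:
            continue
        cost = sad(prev, cur, x, y, dx, dy, bsize)
        if best_cost is None or cost < best_cost:
            best, best_cost = (dx, dy), cost
    return best, best_cost


# Synthetic 8x8 frame and a copy shifted by 2 columns and 1 row.
prev = [[3 * r * r + 5 * c * c + r * c for c in range(8)] for r in range(8)]
cur = [[prev[r + 1][c + 2] if r + 1 < 8 and c + 2 < 8 else 0
        for c in range(8)] for r in range(8)]

full = full_search_window(2)              # 25 search points
codebook = [(0, 0), (2, 1), (-1, 0)]      # hypothetical trained codebook
print(block_match(prev, cur, 2, 2, 2, full))
print(block_match(prev, cur, 2, 2, 2, codebook))
```

Both calls find the same vector here, but the codebook search evaluates 3 points instead of 25, and a vector can be signalled as a short codebook index rather than a full displacement.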


A Development of JPEG-LS Platform for Micro Display Environment in AR/VR Device (AR/VR 마이크로 디스플레이 환경을 고려한 JPEG-LS 플랫폼 개발)

  • Park, Hyun-Moon;Jang, Young-Jong;Kim, Byung-Soo;Hwang, Tae-Ho
    • The Journal of the Korea institute of electronic communication sciences / v.14 no.2 / pp.417-424 / 2019
  • This paper presents the design of a JPEG-LS codec for lossless image compression in AR/VR devices. The proposed JPEG-LS (LosSless) codec is mainly composed of a context modeling block, a context update block, a pixel prediction block, a prediction error coding block, a data packetizer block, and a memory block. All operations are organized in a fully pipelined architecture for real-time image processing, and the LOCO-I compression algorithm uses an improved 2D approach to comply with the SBT coding. Compared with a similar JPEG-LS study, the Block-RAM size of the proposed STB-FLC architecture is reduced to one third, and the parallel design of the prediction block improves the processing speed.
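As a point of reference for the pixel prediction block, the following Python sketch shows the median edge detector (MED) predictor that LOCO-I uses. It is a software illustration only, not the paper's pipelined hardware. The helper `prediction_residuals` and its zero-valued border handling are simplifying assumptions and differ from the border rules of the JPEG-LS standard.

```python
def med_predict(a, b, c):
    """MED predictor: a = left, b = above, c = above-left neighbour."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c


def prediction_residuals(image):
    """Prediction error for every pixel of a 2D list of integers.
    Assumption for this sketch: neighbours outside the image count as 0."""
    residuals = []
    for y, row in enumerate(image):
        out = []
        for x, pixel in enumerate(row):
            a = row[x - 1] if x > 0 else 0
            b = image[y - 1][x] if y > 0 else 0
            c = image[y - 1][x - 1] if x > 0 and y > 0 else 0
            out.append(pixel - med_predict(a, b, c))
        residuals.append(out)
    return residuals


flat = [[10, 10, 10], [10, 10, 10]]
print(prediction_residuals(flat))
```

On a flat image only the first pixel leaves a nonzero residual, which is why the subsequent prediction error coding can be compact.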

A FPGA Implementation of BIST Design for the Batch Testing (일괄검사를 위한 BIST 설계의 FPGA 구현)

  • Rhee, Kang-Hyeon
    • The Transactions of the Korea Information Processing Society / v.4 no.7 / pp.1900-1906 / 1997
  • In this paper, an efficient BILBO (named EBILBO) is designed for BIST that enables batch testing when a circuit is implemented on an FPGA. The proposed batch testing algorithm can test at normal operating speed with a pin count of one that controls all parts of a large and complex circuit. A PRTPG is used for test pattern generation and a MISR is used for PSA. The proposed algorithm is coded in VHDL as a behavioral description, so the models for test pattern generation, signature analysis, and compression are easy to modify. The area of the EBILBO and the performance of the designed BIST are evaluated with the ISCAS89 benchmark circuits on an FPGA. For circuits with more than 600 cells, the area is reduced below 30%, about 500K test patterns are flexibly generated, and the fault coverage ranges from 88.3% to 100%. The EBILBO for the proposed batch testing BIST can execute normal and test mode operations concurrently in real time within $s+n+(2^s/2^p-1)$ clocks (where, in the CUT, $s$ is the number of PIs, $n$ is the number of registers, and $p$ is the order of the polynomial). The proposed algorithm, coded in VHDL, is provided as a library, so it will be widely applicable to DFT that satisfies design and test requirements at the same time.
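The two building blocks named in the abstract, pseudo-random pattern generation and signature compaction with a MISR (multiple-input signature register), can be sketched in Python as follows. This is a generic illustration, not the EBILBO design: the 4-bit width, the tap positions (3, 2), the seed, and the response values are assumptions chosen to keep the example small.

```python
def _feedback(state, taps):
    """XOR of the tapped state bits."""
    bit = 0
    for t in taps:
        bit ^= (state >> t) & 1
    return bit


def lfsr_patterns(seed, taps, width, count):
    """Generate `count` pseudo-random test patterns from a Fibonacci LFSR."""
    mask = (1 << width) - 1
    state, patterns = seed & mask, []
    for _ in range(count):
        patterns.append(state)
        state = ((state << 1) | _feedback(state, taps)) & mask
    return patterns


def misr_signature(responses, taps, width):
    """Compact a sequence of response words into one signature.
    Each clock shifts the register and XORs in one response word."""
    mask = (1 << width) - 1
    state = 0
    for word in responses:
        state = (((state << 1) | _feedback(state, taps)) ^ word) & mask
    return state


patterns = lfsr_patterns(seed=1, taps=(3, 2), width=4, count=15)
good = [0b1010, 0b0111, 0b0001, 0b1100]   # hypothetical fault-free responses
faulty = list(good)
faulty[1] ^= 0b0100                       # one flipped bit in one response
print(patterns)
print(misr_signature(good, (3, 2), 4), misr_signature(faulty, (3, 2), 4))
```

With these taps the 4-bit LFSR cycles through all 15 nonzero states, and the single-bit error changes the signature, so comparing one short signature against a stored reference replaces comparing every response word.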


Design of High Speed Binary Arithmetic Encoder for CABAC Encoder (CABAC 부호화기를 위한 고속 이진 산술 부호화기의 설계)

  • Park, Seungyong;Jo, Hyungu;Ryoo, Kwangki
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.4 / pp.774-780 / 2017
  • This paper proposes an efficient binary arithmetic encoder (BAE) hardware architecture for CABAC, the entropy coding method used in the HEVC standard. Entropy coding removes statistical redundancy and supports a high image compression ratio. However, the binary arithmetic encoder causes delay in real-time processing, and parallel processing is difficult because of the strong dependency between data. The proposed CABAC BAE hardware structure separates the renormalization and processes the conventional iterative algorithm in parallel. The new scheme is designed as a four-stage pipeline structure that can reduce the critical path. The proposed CABAC BAE hardware architecture was designed in Verilog HDL and implemented in a 65 nm technology. Its gate count is 8.07K and its maximum operating frequency is 769 MHz. It processes four bins per clock cycle. The maximum processing speed is 26% higher than that of existing hardware architectures.

Enhancement of H.264/AVC Encoding Speed and Reduction of CPU Load through Parallel Programming Based on CUDA (CUDA 기반의 병렬 프로그래밍을 통한 H.264/AVC 부호화 속도 향상 및 CPU 부하 경감)

  • Jang, Eun-Been;Ha, Yun-Su
    • Journal of Advanced Marine Engineering and Technology / v.34 no.6 / pp.858-863 / 2010
  • To increase encoding speed in H.264/AVC video encoding, it is very important to reduce the time spent on motion estimation, which takes a large portion of the processing time. Using a graphics processing unit (GPU) as a coprocessor that assists the central processing unit (CPU) in computing massive data is one way to reduce the processing time. In this paper, we present an efficient block-level parallel algorithm for motion estimation (ME) on the compute unified device architecture (CUDA) platform, which was developed for general-purpose computation on GPUs. Experiments are carried out to verify the effectiveness of the proposed algorithm.

Design and Implementation of Multiple View Image Synthesis Scheme based on RAM Disk for Real-Time 3D Browsing System (실시간 3D 브라우징 시스템을 위한 램 디스크 기반의 다시점 영상 합성 기법의 설계 및 구현)

  • Sim, Chun-Bo;Lim, Eun-Cheon
    • The Journal of the Korea Contents Association / v.9 no.5 / pp.13-23 / 2009
  • One of the main purposes of multiple-view image processing technology is to provide device users with realistic 3D images by using multiple-viewpoint display devices and compressed-data restoration devices. This paper proposes a RAM disk-based multiple-view image synthesis scheme that makes it possible to browse 3D images generated by applying an effective composition method to real-time input stereo images. The proposed scheme first converts the input images to binary images. We apply edge detection algorithms such as the Sobel and Prewitt algorithms to find the edges used to evaluate disparities in the images from four cameras. In addition, we use the time interval between the hardware trigger and the software trigger to solve the synchronization problem, which has been stated ambiguously in related studies. We assign a unique identifier to each image snapshot for distributed environments. In the performance results, the proposed scheme takes 0.67 sec per binary array to transfer entire images, which contain the left and right sides with disparity information, for high-quality 3D image browsing. We conclude that the proposed scheme is suitable for real-time 3D applications.

Novel IME Instructions and their Hardware Architecture for Fast Search Algorithm (고속 탐색 알고리즘에 적합한 움직임 추정 전용 명령어 및 구조 설계)

  • Bang, Ho-Il;SunWoo, Myung-Hoon
    • Journal of the Institute of Electronics Engineers of Korea SD / v.48 no.12 / pp.58-65 / 2011
  • This paper presents an ASIP (Application-specific Instruction Processor) for motion estimation that employs specific IME instructions, along with its programmable and reconfigurable hardware architecture for various video codecs such as H.264/AVC and MPEG4. With the proposed instructions and a variable-point 2D SAD hardware accelerator, it can meet the real-time processing requirement of High Definition (HD) video. With the SAD unit and its parallel operations using pattern information, the proposed IME instructions support not only full search algorithms but also other fast search algorithms. The hardware size is 25.5K gates for each Processing Element Group (PEG), which has 128 SAD Processor Elements (PEs). The proposed ASIP has been verified with the Synopsys Processor Designer and implemented with the Design Compiler using the IBM 90nm process technology. The hardware size is 453K gates for the IME unit, and the operating frequency is 188MHz for real-time processing of 1080p@30 frame video. The proposed ASIP can reduce the hardware size by about 26% and the number of operation cycles by about 18%.

Efficient DSP Architecture For High-Quality Audio Algorithms (고음질 오디오 알고리즘을 위한 효율적인 DSP 설계)

  • Moon, Jong-Ha;SunWoo, Myung-Hoon
    • Journal of the Institute of Electronics Engineers of Korea SP / v.44 no.5 / pp.112-117 / 2007
  • This paper presents specialized DSP instructions and their hardware architecture for audio coding algorithms such as MPEG-2/4 Advanced Audio Coding (AAC), Dolby AC-3, and MPEG-2 Backward Compatible (BC). The proposed architecture is specially designed and optimized for the MDCT/IMDCT (Inverse Modified Discrete Cosine Transform) and Huffman decoding of the AAC decoding algorithm. Performance comparisons show a significant improvement over the TMS320C62x and ASDSP21060 for the MDCT/IMDCT computation. In addition, the dedicated Huffman decoding accelerator performs decoding and operand preparation in only one cycle. The proposed DPU (Data Processing Unit) consists of 107,860 gates and achieves 150 MIPS.

Deep Learning-based Real-Time Super-Resolution Architecture Design (경량화된 딥러닝 구조를 이용한 실시간 초고해상도 영상 생성 기술)

  • Ahn, Saehyun;Kang, Suk-Ju
    • Journal of Broadcast Engineering / v.26 no.2 / pp.167-174 / 2021
  • Recently, deep learning technology has been widely used in various computer vision applications such as object recognition, classification, and image generation. In particular, deep learning-based super-resolution has achieved significant performance improvements. The fast super-resolution convolutional neural network (FSRCNN) is a well-known deep learning-based super-resolution algorithm in which the output image is generated by a deconvolutional layer. In this paper, we propose an FPGA-based convolutional neural network accelerator that considers parallel computing efficiency. In addition, we propose Optimal-FSRCNN, which modifies the structure of FSRCNN. The number of multipliers is reduced by a factor of 3.47 compared with FSRCNN, while the PSNR is similar to that of FSRCNN. We developed a real-time image processing technology implemented on an FPGA.

Implementation of High-Throughput SHA-1 Hash Algorithm using Multiple Unfolding Technique (다중 언폴딩 기법을 이용한 SHA-1 해쉬 알고리즘 고속 구현)

  • Lee, Eun-Hee;Lee, Je-Hoon;Jang, Young-Jo;Cho, Kyoung-Rok
    • Journal of the Institute of Electronics Engineers of Korea SD / v.47 no.4 / pp.41-49 / 2010
  • This paper proposes a new high-speed SHA-1 architecture using multiple unfolding and pre-computation techniques. We unfold the iterative hash operations into 2 continuous hash stages and reschedule the computation timing. Part of the critical path is then computed in the previous hash operation round, and the rest is performed in the present round. These techniques reduce the critical path from 3 additions to 2 additions. This yields a maximum clock frequency of 118 MHz, which provides a throughput of 5.9 Gbps. The proposed architecture shows 26% higher throughput with a 32% smaller hardware size than its counterparts. This paper also introduces an analytical model of a multiple SHA-1 architecture at the system level that maps large input data onto SHA-1 blocks in parallel. The model gives the number of SHA-1 blocks required for processing large multimedia data, which helps in deciding the hardware configuration. The high-speed SHA-1 is useful for generating a condensed message and may strengthen the security of mobile communication and internet services.
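The unfolding step can be illustrated functionally in software. The Python sketch below computes SHA-1 with a loop that performs two rounds per iteration (40 iterations instead of 80) and checks the result against `hashlib`. It shows only that a 2-unfolded loop is equivalent to the standard one; the pre-computation rescheduling and the critical-path savings described in the abstract are hardware timing properties that this sketch does not model. The function names are assumptions for illustration.

```python
import hashlib
import struct


def _rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def _round(state, t, w):
    """One standard SHA-1 round t applied to the working state."""
    a, b, c, d, e = state
    if t < 20:
        f, k = (b & c) | (~b & d), 0x5A827999
    elif t < 40:
        f, k = b ^ c ^ d, 0x6ED9EBA1
    elif t < 60:
        f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
    else:
        f, k = b ^ c ^ d, 0xCA62C1D6
    temp = (_rotl(a, 5) + (f & 0xFFFFFFFF) + e + k + w[t]) & 0xFFFFFFFF
    return (temp, a, _rotl(b, 30), c, d)


def sha1_unfolded(message: bytes) -> str:
    """SHA-1 with the round loop unfolded by a factor of 2."""
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += struct.pack(">Q", len(message) * 8)
    for off in range(0, len(padded), 64):
        w = list(struct.unpack(">16I", padded[off:off + 64]))
        for t in range(16, 80):
            w.append(_rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
        state = tuple(h)
        for t in range(0, 80, 2):  # 40 iterations, two rounds in each
            state = _round(_round(state, t, w), t + 1, w)
        h = [(x + y) & 0xFFFFFFFF for x, y in zip(h, state)]
    return "".join(f"{x:08x}" for x in h)


print(sha1_unfolded(b"abc") == hashlib.sha1(b"abc").hexdigest())
```

In hardware, each loop iteration of this form corresponds to one clock cycle that evaluates two chained rounds, which is why shortening the combined critical path matters for the achievable clock frequency.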