• Title/Summary/Keyword: VLSI Architecture

Pipelined VLSI Architectures for the Hierarchical Block-Matching Algorithm (계층적 블록매칭 알고리즘을 위한 파이프라인식 VLSI 아키텍쳐)

  • Kim, Hyeong-Cheol;Maeng, Seung-Ryeol
    • The Transactions of the Korea Information Processing Society / v.5 no.7 / pp.1691-1716 / 1998
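  • This paper proposes two parallel VLSI architectures for the hierarchical block-matching algorithm (HBMA). HBMA relies on iteration across hierarchy levels and on spatial interpolation, and these characteristics carry inherent data dependencies that hinder parallel processing. To resolve the data dependencies between hierarchy levels, the proposed architectures adopt a pipelined structure made up of three stages, in accordance with the parameters given for HBMA. The architectures fall into two types according to how they control the flow of input frame data: the U-Architecture is designed to follow a unidirectional scan order, and the B-Architecture a bidirectional scan order. The internal memory and interpolation module of each architecture are structured to operate synchronously with the corresponding scan order. Performance analysis shows that both architectures can process broadcast video formats in real time, and that real-time performance for HDTV formats is attainable with near-future VLSI technology. In addition, by adopting a spatially connected internal memory structure, the B-Architecture increases the reuse of input data and therefore needs roughly half as many data I/O pins as the U-Architecture.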

An efficient VLSI architecture for high speed matrix transposition (고속 행렬 전치를 위한 효율적인 VLSI 구조)

  • 김견수;장순화;김재호;손경식
    • The Journal of Korean Institute of Communications and Information Sciences / v.21 no.12 / pp.3256-3264 / 1996
  • This paper presents an efficient VLSI architecture for high-speed matrix transposition. To transpose an N$\times$N matrix, N$^{2}$ transposition cells are arranged in a regular, square structure and pipelined so that all cells operate in parallel. Each transposition cell consists of a register and an input data selector. The distinguishing feature of this architecture is that the data to be transposed are divided into several bundles of bits and then processed serially. This serial transposition of the divided input data reduces the hardware complexity of each transposition cell and simplifies the routing between adjacent cells. The proposed architecture was designed and implemented with a 0.5 μm VLSI library. It operates stably at 200 MHz and has lower hardware complexity than conventional architectures.

On a Reduction of Pitch Searching Time by Separating the Speech Components in the CELP Vocoder (성분분리에 의한 CELP 보코더의 피치 검색시간 단축에 관한 연구)

  • Hyeon, Jin-Il;Byeon, Gyeong-Jin;Han, Gi-Cheon;Kim, Jong-Jae;Yu, Ha-Yeong;Kim, Jae-Seok;Kim, Dae-Sik;Bae, Myeong-Jin
    • The Journal of the Acoustical Society of Korea / v.14 no.1E / pp.22-29 / 1995
  • Code-excited linear prediction (CELP) vocoders exhibit good performance at data rates below 4.8 kbps. The major drawback of CELP-type coders is their large amount of computation. In this paper, we propose a new pitch searching method that preserves the quality of the CELP vocoder while reducing computational complexity. The basic idea is to first identify preliminary pitch candidates in the signal and then perform the pitch search only over those candidates. Applying the proposed method to the CELP vocoder reduces the complexity of the pitch search by about 90%.

A VLSI architecture for fast motion estimation algorithm (고속 움직임 추정 알고리즘에 적합한 VLSI 구조 연구)

  • 이재헌;라종범
    • Proceedings of the IEEK Conference / 1998.06a / pp.717-720 / 1998
  • In this paper, we propose a VLSI architecture for implementing a recently proposed fast block matching algorithm called HSBMA3S. The proposed architecture consists of a systolic-array-based basic unit and two shift register arrays, and it covers a search range of -32 to +31. Using the basic unit repeatedly reduces the number of gates. The basic unit can be implemented with any of several conventional systolic arrays, trading off speed against hardware cost. In this paper, the architecture for the basic unit is selected so that the hardware cost is minimized. The proposed architecture is fast enough for low bit-rate applications (frame size of 352x288, 30 frames/sec) and can be implemented with fewer than 20,000 gates. Moreover, by simply modifying the basic unit, the architecture can be used for the higher bit-rate application with a frame size of 720x480 at 30 frames/sec.
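As a point of reference for the block matching described above, the following is a minimal software sketch of plain full-search block matching with a sum-of-absolute-differences (SAD) cost. It is not HSBMA3S, whose details the abstract does not give, and it does not model the systolic array. The function names `sad` and `full_search` and the frame layout (lists of rows) are assumptions made for illustration; only the default search range of -32 to +31 comes from the abstract.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    return sum(
        abs(cur[by + i][bx + j] - ref[by + dy + i][bx + dx + j])
        for i in range(n)
        for j in range(n)
    )


def full_search(cur, ref, bx, by, n, lo=-32, hi=31):
    """Exhaustively test every displacement in [lo, hi] x [lo, hi] and
    return (best_cost, dx, dy). Candidates falling outside `ref` are skipped."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(lo, hi + 1):
        for dx in range(lo, hi + 1):
            inside = (0 <= by + dy and by + dy + n <= h
                      and 0 <= bx + dx and bx + dx + n <= w)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

Fast algorithms of the kind the abstract refers to aim to visit far fewer candidates than this exhaustive loop.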

Design of a new VLSI architecture for morphological filters (새로운 수리형태학 필터 VLSI 구조 설계)

  • 웅수환;선우명훈
    • Journal of the Korean Institute of Telematics and Electronics C / v.34C no.8 / pp.22-38 / 1997
  • This paper proposes a new VLSI architecture for morphological filters and presents its chip design and implementation. The proposed architecture significantly reduces hardware cost compared with existing architectures by using a feedback loop path to reuse partial results and a decoder/encoder pair to detect maximum/minimum values. In addition, it uses one common architecture for both dilation and erosion and requires fewer operations. Moreover, it can easily be extended to larger morphological operations. We developed VHDL (VHSIC Hardware Description Language) models and performed logic synthesis using the Synopsys CAD tool. We used the SOG (sea-of-gates) cell library and implemented the actual chip. The total number of gates is only 2,667, and the clock frequency of 30 MHz meets real-time image processing requirements.
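The idea of one common datapath for dilation and erosion can be illustrated in software: with a flat square structuring element, both operations are a sliding-window reduction that differs only in whether the maximum or the minimum is taken. The sketch below shows only that definition. The names `morph`, `dilate`, and `erode`, the square window, and the clipping of the window at image borders are assumptions; the feedback loop and decoder/encoder pair of the paper are not modeled.

```python
def morph(img, k, op):
    """Apply `op` (max or min) over a k x k window centered on each pixel.
    The window is clipped at the image borders."""
    r = k // 2
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - r), min(h, y + r + 1))
                for i in range(max(0, x - r), min(w, x + r + 1))
            ]
            out[y][x] = op(window)
    return out


def dilate(img, k=3):
    """Grayscale dilation with a flat k x k structuring element."""
    return morph(img, k, max)


def erode(img, k=3):
    """Grayscale erosion with a flat k x k structuring element."""
    return morph(img, k, min)
```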

Real-Time 2-D Median Filter (실시간 2차원 메디안 필터)

  • Jeong, Jae-Gil
    • The Journal of Engineering Research / v.3 no.1 / pp.57-64 / 1998
  • This paper presents an architecture for a real-time two-dimensional median filter. The architecture has characteristics well suited to VLSI implementation, such as small memory requirements, regular computations, and local data transfers. For efficiency, it uses a separable two-dimensional median filtering structure and a bit-sliced pipelined median searching algorithm. A behavioral simulator was implemented in C and used to analyze the presented architecture.
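A separable two-dimensional median filter can be sketched as a one-dimensional median along each row followed by a one-dimensional median along each column. The result approximates, but is not always identical to, the true two-dimensional median. The sketch below is a generic illustration, not the paper's simulator: the names `median_1d` and `separable_median_2d`, the odd window length `k`, and the edge replication at borders are assumptions, and the bit-sliced median search is not modeled.

```python
from statistics import median


def median_1d(seq, k):
    """Sliding median of odd window length k, replicating the edge samples."""
    r = k // 2
    n = len(seq)
    return [
        median(seq[min(max(j, 0), n - 1)] for j in range(i - r, i + r + 1))
        for i in range(n)
    ]


def separable_median_2d(img, k=3):
    """Row-wise median pass followed by a column-wise median pass."""
    rows = [median_1d(row, k) for row in img]
    cols = [median_1d(list(col), k) for col in zip(*rows)]
    # `cols` holds the filtered columns; transpose back to row-major order.
    return [list(row) for row in zip(*cols)]
```

Each pass only needs a window of k samples, which is one reason the separable form is attractive when memory is limited.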

Efficient Radix-4 Systolic VLSI Architecture for RSA Public-key Cryptosystem (RSA 공개키 암호화시스템의 효율적인 Radix-4 시스톨릭 VLSI 구조)

  • Park Tae geun
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.12C / pp.1739-1747 / 2004
  • In this paper, an efficient radix-4 systolic VLSI architecture for the RSA public-key cryptosystem is proposed. Owing to the simple iteration and the efficient systolic mapping, the proposed architecture computes an n-bit modular exponentiation in n$^{2}$ clock cycles, since the two modular multiplications for M$_{i}$ and P$_{i}$ in each exponentiation step are interleaved so that the hardware is fully utilized. We encode the exponent using the radix-4 SD (signed-digit) number system to reduce the number of modular multiplications for RSA cryptography; as a result, the number of NZ (non-zero) digits in the exponent is reduced by about 20%. Compared to conventional approaches, the proposed architecture completes an RSA operation in less time while requiring relatively fewer hardware resources. The proposed RSA architecture, based on the modified Montgomery algorithm, has the locality, regularity, and scalability suitable for VLSI implementation.
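To make the exponentiation structure concrete, here is a minimal sketch of textbook radix-2 Montgomery multiplication driving a right-to-left square-and-multiply loop. It is not the paper's modified radix-4 algorithm and does not use signed-digit exponent recoding. Reading M$_{i}$ as the running product (`acc`) and P$_{i}$ as the running square (`p`) is an assumption; the two updates in each loop iteration are independent of each other, which is the property that allows them to be interleaved. The names `mont_mul` and `mont_pow` are hypothetical.

```python
def mont_mul(a, b, n, k):
    """Return a * b * 2^(-k) mod n for odd n, with a, b < n < 2^k.
    Bit-serial (radix-2) Montgomery multiplication."""
    s = 0
    for i in range(k):
        s += ((a >> i) & 1) * b
        if s & 1:          # make s even so the halving below is exact
            s += n
        s >>= 1
    return s - n if s >= n else s


def mont_pow(m, e, n):
    """Compute m^e mod n (n odd, n > 1) using Montgomery multiplication."""
    k = n.bit_length()
    r2 = (1 << (2 * k)) % n          # R^2 mod n, where R = 2^k
    p = mont_mul(m % n, r2, n, k)    # running square, in Montgomery form
    acc = mont_mul(1, r2, n, k)      # running product, starts at R mod n
    while e:
        if e & 1:
            acc = mont_mul(acc, p, n, k)   # product update (uses the old p)
        p = mont_mul(p, p, n, k)           # square update (independent of acc)
        e >>= 1
    return mont_mul(acc, 1, n, k)    # convert out of Montgomery form
```

The result can be checked against Python's built-in `pow(m, e, n)`.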

High-Performance VLSI Architecture for Stereo Vision (스테레오 비전을 위한 고성능 VLSI 구조)

  • Seo, Youngho;Kim, Dong-Wook
    • Journal of Broadcast Engineering / v.18 no.5 / pp.669-679 / 2013
  • This paper proposes a new VLSI (Very Large Scale Integrated Circuit) architecture for real-time stereo matching. We minimized the amount of calculation and the number of memory accesses by analyzing the computation involved in stereo matching. Based on this analysis, we propose a new stereo matching calculation cell and a hardware architecture that replicates the cell in parallel so that the cost function is calculated concurrently for all pixels in the search range. We then extend this into a hardware architecture that calculates the cost function over a two-dimensional region. The implemented hardware operates at a clock frequency of at least 250 MHz in an FPGA (Field Programmable Gate Array) environment and achieves 805 fps for a search range of 64 pixels and an image size of $640{\times}480$.

VLSI Implementation of CORDIC-Based Digital Quadrature Demodulator (CORDIC을 이용한 디지탈 Quadrature 복조기의 VLSI 구현)

  • 남승현;성원용
    • The Journal of Korean Institute of Communications and Information Sciences / v.23 no.7 / pp.1718-1731 / 1998
  • A digital quadrature demodulator is needed for coherent demodulation in digital communication systems that use schemes such as Binary Phase-Shift Keying, Quadrature Phase-Shift Keying, and Quadrature Amplitude Modulation. Conventionally, a DDFS (Direct Digital Frequency Synthesizer) is used to generate the carrier signal, and separate multipliers are used for mixing. The DDFS is implemented with a ROM (Read Only Memory), which can become a bottleneck when a high-speed, small-area implementation is required. A new architecture is developed that employs the circular rotation mode of the CORDIC algorithm for signal mixing as well as carrier generation. To optimize the hardware design parameters, the finite word-length effects of the proposed architecture are analyzed in comparison with a conventional ROM-based architecture. The hardware costs are also estimated, showing that the proposed architecture occupies only a third of the area of the conventional ROM-based architecture for the same performance. A full-custom VLSI chip is developed using the proposed architecture.
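The use of CORDIC rotation for mixing can be sketched as follows: rotating the vector (s, 0) by the negated carrier phase yields (s cos θ, -s sin θ), the in-phase and quadrature products, using only shift-and-add style iterations and no sine ROM or multiplier. The sketch below is a floating-point illustration under stated assumptions, not the paper's fixed-point design: the names `cordic_rotate` and `mix`, the 16 iterations, and the quadrant pre-rotation are all choices made here.

```python
import math


def cordic_rotate(x, y, angle, iterations=16):
    """Rotate (x, y) by `angle` radians (|angle| <= pi/2) using the
    circular rotation mode of CORDIC, then remove the CORDIC gain."""
    gain = 1.0
    z = angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0          # steer residual angle toward 0
        t = 2.0 ** -i                        # a right shift by i in hardware
        x, y = x - d * y * t, y + d * x * t
        z -= d * math.atan(t)                # table lookup in hardware
        gain *= math.sqrt(1.0 + t * t)
    return x / gain, y / gain


def mix(sample, phase):
    """Return (I, Q) = (sample*cos(phase), -sample*sin(phase))."""
    a = (-phase + math.pi) % (2 * math.pi) - math.pi   # wrap to [-pi, pi)
    x, y = sample, 0.0
    if a > math.pi / 2:        # pre-rotate by pi to enter the CORDIC range
        x, y, a = -x, -y, a - math.pi
    elif a < -math.pi / 2:
        x, y, a = -x, -y, a + math.pi
    return cordic_rotate(x, y, a)
```

In a hardware design the gain correction is typically folded into a constant scaling, and the number of iterations and word lengths are the parameters a finite word-length analysis would set.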
