• Title/Summary/Keyword: Systolic Architecture

Search Result 96, Processing Time 0.026 seconds

Discrete Cosine Transform Algorithms for the VLSI Parallel Implementation (VLSI 병렬 연산을 위한 여현 변환 알고리듬)

  • 조남익;이상욱
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.25 no.7
    • /
    • pp.851-858
    • /
    • 1988
  • In this paper, we propose two different VLSI architectures for the parallel computation of DCT (discrete cosine transform) algorithm. First, it is shown that the DCT algorithm can be implemented on the existing systolic architecture for the DFT(discrete fourier transform) by introducing some modification. Secondly, a new prime factor DCT algorithm based on the prime factor DFT algorithm is proposed. And it is shown that the proposed algorihtm can be implemented in parallel on the systolic architecture for the prime factor DFT. However, proposed algorithm is only applicable to the data length which can be decomposed into relatively prime and odd numbers. It is also found that the proposed systolic architecture requires less multipliers than the structures implementing FDCT(fast DCT) algorithms directly.

  • PDF

Stereo matching algorithm based on systolic array architecture using edges and pixel data (에지 및 픽셀 데이터를 이용한 어레이구조의 스테레오 매칭 알고리즘)

  • Jung, Woo-Young;Park, Sung-Chan;Jung, Hong
    • Proceedings of the KIEE Conference
    • /
    • 2003.11c
    • /
    • pp.777-780
    • /
    • 2003
  • We have tried to create a vision system like human eye for a long time. We have obtained some distinguished results through many studies. Stereo vision is the most similar to human eye among those. This is the process of recreating 3-D spatial information from a pair of 2-D images. In this paper, we have designed a stereo matching algorithm based on systolic array architecture using edges and pixel data. This is more advanced vision system that improves some problems of previous stereo vision systems. This decreases noise and improves matching rate using edges and pixel data and also improves processing speed using high integration one chip FPGA and compact modules. We can apply this to robot vision and automatic control vehicles and artificial satellites.

  • PDF

A New Systolic Array for LSD-first Multiplication in $CF(2^m)$ ($CF(2^m)$상의 LSD 우선 곱셈을 위한 새로운 시스톨릭 어레이)

  • Kim, Chang-Hoon;Nam, In-Gil
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.33 no.4C
    • /
    • pp.342-349
    • /
    • 2008
  • This paper presents a new digit-serial systolic multiplier over $CF(2^m)$ for cryptographic applications. When input data come in continuously, the proposed array produces multiplication results at a rate of one every ${\lceil}m/D{\rceil}$ clock cycles, where D is the selected digit size. Since the inner structure of the proposed array is tree-type, critical path increases logarithmically proportional to D. Therefore, the computation delay of the proposed architecture is significantly less than previously proposed digit-serial systolic multipliers whose critical path increases proportional to D. Furthermore, since the new architecture has the features of regularity, modularity, and unidirectional data flow, it is well suited to VLSI implementations.

An Architecture of One-Dimensional Systolic Array for Full-Search Block Matching Algorithm (완전탐색 블럭정합 알고리즘을 위한 일차원 시스톨릭 어레이의 구조)

  • Lee, Su-Jin;Woo, Chong-Ho
    • Journal of the Institute of Electronics Engineers of Korea SC
    • /
    • v.39 no.5
    • /
    • pp.34-42
    • /
    • 2002
  • In this paper, we designed the VLSI array architecture for the high speed processing of the motion estimation used by block matching algorithm. We derived the one dimensional systolic array from the full search block matching algorithm. The data and control signals of the proposed systolic array are passed through adjacent processing element. So proposed architecture has temporal and spatial locality. The I/O ports exists only in the first and last processing elements of the array. This architecture has low pin counts and modular expandability. So the proposed array architecture can be cascaded for different block size and search range.

A New N-time Systolic Array Architecture for the Vector Median Filter (N-time 시스톨릭 어레이 구조를 가지는 벡터 미디언 필터의 하드웨어 아키텍쳐)

  • Yang, Yeong-Yil
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.8 no.4
    • /
    • pp.293-296
    • /
    • 2007
  • In this paper, we propose the systolic array architecture for the vector median filter. In the color image processing, the vector signal (i.e. the color) consists of three elements, red, green and blue. The vector median filter is very effective to utilize the correlation among red, green and blue elements. The computational complexity of the proposed architecture for computing the vector median of N vector signals is (N+2) clock periods compared to the (3N+1) clock periods in the previous method. In addition to, the input vector signals can be loaded in serial in the proposed architecture. In the previous method, N input vector signals should be loaded to the vector median filter in parallel at the first clock. The proposed architecture is implemented with FPGA.

  • PDF

Design of a Bit-Level Super-Systolic Array (비트 수준 슈퍼 시스톨릭 어레이의 설계)

  • Lee Jae-Jin;Song Gi-Yong
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.42 no.12
    • /
    • pp.45-52
    • /
    • 2005
  • A systolic array formed by interconnecting a set of identical data-processing cells in a uniform manner is a combination of an algorithm and a circuit that implements it, and is closely related conceptually to arithmetic pipeline. High-performance computation on a large array of cells has been an important feature of systolic array. To achieve even higher degree of concurrency, it is desirable to make cells of systolic array themselves systolic array as well. The structure of systolic array with its cells consisting of another systolic array is to be called super-systolic array. This paper proposes a scalable bit-level super-systolic amy which can be adopted in the VLSI design including regular interconnection and functional primitives that are typical for a systolic architecture. This architecture is focused on highly regular computational structures that avoids the need for a large number of global interconnection required in general VLSI implementation. A bit-level super-systolic FIR filter is selected as an example of bit-level super-systolic array. The derived bit-level super-systolic FIR filter has been modeled and simulated in RT level using VHDL, then synthesized using Synopsys Design Compiler based on Hynix $0.35{\mu}m$ cell library. Compared conventional word-level systolic array, the newly proposed bit-level super-systolic arrays are efficient when it comes to area and throughput.

New systolic arrays for computation of the 1-D and 2-D discrete wavelet transform (1차원 및 2차원 이산 웨이브렛 변환 계산을 위한 새로운 시스톨릭 어레이)

  • 반성범;박래홍
    • Journal of the Korean Institute of Telematics and Electronics S
    • /
    • v.34S no.10
    • /
    • pp.132-140
    • /
    • 1997
  • This paper proposes systolic array architectures for compuataion of the 1-D and 2-D discrete wavelet transform (DWT). The proposed systolic array for compuataion of the 1-D DWT consists of L processing element (PE) arrays, where the PE array denotes the systolic array for computation of the one level DWT. The proposed PE array computes only the product terms that are required for further computation and the outputs of low and high frequency filters are computed in alternate clock cycles. Therefore, the proposed architecuter can compute the low and high frequency outputs using a single architecture. The proposed systolic array for computation of the 2-D DWT consists of two systolic array architectures for comutation of the 1-D DWT and memory unit. The required time and hardware cost of the proposed systolic arrays are comparable to those of the conventional architectures. However, the conventional architectures need extra processing units whereas the proposed architectures fo not. The proposed architectures can be applied to subband decomposition by simply changing the filter coefficients.

  • PDF

A VLSI architecture for fast motion estimation algorithm (고속 움직임 추정 알고리즘에 적합한 VLSI 구조 연구)

  • 이재헌;라종범
    • Proceedings of the IEEK Conference
    • /
    • 1998.06a
    • /
    • pp.717-720
    • /
    • 1998
  • In this paper, we propose a VLSI architecture for implementing a crecently proposed fast block matching algorithm, which is called the HSBMA3S. The proposed architecture consists of a systolic array based basic unit and two shift register arrays. And it covers a search range of -32 ~+31. By using a basic unit repeatedly, we can redcue the number of gates. To implement the basic unit, we can select one among various conventional systolic arrays by trading-off between speed and hardware cost. In this paper, the architecture for the basic unit is selected so that the hardware cost can be minimized. The proposed architecture is fast enough for low bit-rate applications (frame size of 352x288, 30 frames/sec) and can be implemented by less than 20,000 gates. Moreover, by simply modifying the basic unit, the architecture can be used for the higher bit-rate application of the frame size of 720*480 and 30 frames/sec.

  • PDF

A VLSI Architecture for Fast Motion Estimation Algorithm (고속 움직임 추정 알고리즘에 적합한 VLSI 구조 연구)

  • 이재헌;나종범
    • Journal of Broadcast Engineering
    • /
    • v.3 no.1
    • /
    • pp.85-92
    • /
    • 1998
  • The block matching algorithm is the most popular motion estimation method in image sequence coding. In this paper, we propose a VLSI architecture. for implementing a recently proposed fast bolck matching algorith, which uses spatial correlation of motion vectors and hierarchical searching scheme. The proposed architecture consists of a basic searching unit based on a systolic array and two shift register arrays. And it covers a search range of -32~ +31. By using the basic searching unit repeatedly, it reduces the number of gatyes for implementation. For basic searching unit implementation, a proper systolic array can be selected among various conventional ones by trading-off between speed and hardware cost. In this paper, a structure is selected as the basic searching unit so that the hardware cost can be minimized. The proposed overall architecture is fast enough for low bit-rate applications (frame size of $352{\times}288$, 3Oframes/sec) and can be implemented by less than 20,000 gates. Moreover, by simply modifying the basic searching unit, the architecture can be used for the higher bit-rate application of the frame size of $720{\times}480$ and 30 frames/sec.

  • PDF

Efficient Radix-4 Systolic VLSI Architecture for RSA Public-key Cryptosystem (RSA 공개키 암호화시스템의 효율적인 Radix-4 시스톨릭 VLSI 구조)

  • Park Tae geun
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.12C
    • /
    • pp.1739-1747
    • /
    • 2004
  • In this paper, an efficient radix-4 systolic VLSI architecture for RSA public-key cryptosystem is proposed. Due to the simple operation of iterations and the efficient systolic mapping, the proposed architecture computes an n-bit modular exponentiation in n$^{2}$ clock cycles since two modular multiplications for M$_{i}$ and P$_{i}$ in each exponentiation process are interleaved, so that the hardware is fully utilized. We encode the exponent using Radix-4. SD (Signed Digit) number system to reduce the number of modular multiplications for RSA cryptography. Therefore about 20% of NZ (non-zero) digits in the exponent are reduced. Compared to conventional approaches, the proposed architecture shows shorter period to complete the RSA while requiring relatively less hardware resources. The proposed RSA architecture based on the modified Montgomery algorithm has locality, regularity, and scalability suitable for VLSI implementation.