• Title/Summary/Keyword: systolic array architecture


A Systolic Array for High-Speed Computing of Full Search Block Matching Algorithm

  • Jung, Soon-Ho;Woo, Chong-Ho
    • Journal of Korea Multimedia Society / v.14 no.10 / pp.1275-1286 / 2011
  • This paper proposes a high-speed systolic array architecture for the full-search block matching algorithm (FBMA). The pixels of the search area for a reference block are input only once to find the best-matching candidate block, and are reused to compute the sum of absolute differences (SAD) for adjacent candidate blocks. Each row of the proposed 2-dimensional systolic array compares the reference block with the adjacent candidate blocks in the same row of the search area. The lower rows of the array receive pixels from the row above and compute the SAD by reusing the overlapping pixels of candidate blocks within the same column of the search area. The array uses no data broadcasting and no global paths. A comparison with existing architectures shows that the array achieves higher throughput, though it requires slightly more hardware.
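
The SAD criterion and exhaustive search described in this abstract can be sketched in software as follows. This is a minimal sequential illustration, not the paper's systolic design: the function names `sad` and `full_search` and the list-of-lists pixel layout are assumptions made here, and the sketch recomputes every SAD from scratch instead of reusing overlapping pixels as the hardware does.

```python
def sad(reference, search_area, top, left):
    """Sum of absolute differences between the reference block and the
    candidate block whose top-left corner is (top, left)."""
    return sum(
        abs(reference[r][c] - search_area[top + r][left + c])
        for r in range(len(reference))
        for c in range(len(reference[0]))
    )


def full_search(reference, search_area):
    """Evaluate every candidate position and return (top, left, sad)
    for the first position with the smallest SAD."""
    n, m = len(reference), len(reference[0])
    best = None
    for top in range(len(search_area) - n + 1):
        for left in range(len(search_area[0]) - m + 1):
            score = sad(reference, search_area, top, left)
            if best is None or score < best[2]:
                best = (top, left, score)
    return best


# Hypothetical 4x4 search area; the reference block is cut from (1, 2).
area = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
block = [[6, 7], [10, 11]]
print(full_search(block, area))
```

Each of the nine candidate positions costs a full block comparison here, which is the redundancy the pixel-reuse scheme in the abstract is meant to avoid.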

A New Systolic Array for LSD-first Multiplication in $GF(2^m)$

  • Kim, Chang-Hoon;Nam, In-Gil
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.4C / pp.342-349 / 2008
  • This paper presents a new digit-serial systolic multiplier over $GF(2^m)$ for cryptographic applications. When input data arrive continuously, the proposed array produces one multiplication result every $\lceil m/D \rceil$ clock cycles, where $D$ is the selected digit size. Because the inner structure of the proposed array is tree-type, the critical path grows logarithmically with $D$. The computation delay of the proposed architecture is therefore significantly lower than that of previously proposed digit-serial systolic multipliers, whose critical path grows in proportion to $D$. Furthermore, because the new architecture is regular, modular, and has unidirectional data flow, it is well suited to VLSI implementation.
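
To make the digit-serial, least-significant-digit-first schedule concrete, here is a minimal software sketch. It assumes a polynomial-basis representation in which bit i of an integer is the coefficient of x^i; the names `reduce_poly` and `lsd_first_multiply` are hypothetical, and the sketch models only the arithmetic of the $\lceil m/D \rceil$ steps, not the tree-type cell structure of the paper.

```python
from math import ceil


def reduce_poly(value, modulus, m):
    """Reduce a GF(2) polynomial (bits of an int) modulo a degree-m polynomial."""
    for bit in range(value.bit_length() - 1, m - 1, -1):
        if (value >> bit) & 1:
            value ^= modulus << (bit - m)
    return value


def lsd_first_multiply(a, b, modulus, m, digit_size):
    """Multiply a and b in GF(2^m), consuming b one D-bit digit at a time,
    least significant digit first."""
    acc = 0
    mask = (1 << digit_size) - 1
    for _ in range(ceil(m / digit_size)):  # one step per digit of b
        digit = b & mask
        for j in range(digit_size):
            if (digit >> j) & 1:
                acc ^= a << j  # add (a * x^j) for each set bit of the digit
        a = reduce_poly(a << digit_size, modulus, m)  # a <- a * x^D mod f
        b >>= digit_size
    return reduce_poly(acc, modulus, m)


# Example in GF(2^8) with the modulus x^8 + x^4 + x^3 + x + 1 and D = 4.
print(hex(lsd_first_multiply(0x57, 0x83, 0x11B, 8, 4)))
```

With m = 8 and D = 4 the loop runs twice, matching the $\lceil m/D \rceil$ step count quoted in the abstract.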

Stereo matching algorithm based on systolic array architecture using edges and pixel data

  • Jung, Woo-Young;Park, Sung-Chan;Jung, Hong
    • Proceedings of the KIEE Conference / 2003.11c / pp.777-780 / 2003
  • Researchers have long tried to build vision systems that work like the human eye, and many studies have produced notable results. Among these approaches, stereo vision most closely resembles human sight: it reconstructs 3-D spatial information from a pair of 2-D images. In this paper, we design a stereo matching algorithm based on a systolic array architecture that uses both edge and pixel data. The resulting vision system addresses several problems of earlier stereo vision systems: using edge and pixel data together reduces noise and improves the matching rate, while a highly integrated single-chip FPGA and compact modules improve processing speed. The system can be applied to robot vision, automatically controlled vehicles, and artificial satellites.


Conservative Approximation-Based Full-Search Block Matching Algorithm Architecture for QCIF Digital Video Employing Systolic Array Architecture

  • Ganapathi, Hegde;Amritha, Krishna R.S.;Pukhraj, Vaya
    • ETRI Journal / v.37 no.4 / pp.772-779 / 2015
  • This paper presents a power-efficient hardware realization of a motion estimation technique based on the full-search block matching algorithm (FSBMA). The input considered is digital video in quarter common intermediate format (QCIF). The mean of absolute differences (MAD) is the distortion criterion used for block matching. The conventional architecture considered for the hardware realization of FSBMA is the shift register-based 2-D systolic array. For this architecture, a conservative approximation technique is adopted to eliminate unnecessary MAD computations in the block matching process. Introducing the technique into the conventional architecture reduces the power and complexity of its implementation while preserving the accuracy of the motion vector extracted by block matching. The proposed architecture is verified against its functional specifications, and its performance is evaluated in terms of power, area, operating frequency, and efficiency.
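
The abstract does not state which conservative approximation is used, so the sketch below substitutes one well-known conservative bound as an assumption: the absolute difference of the two block sums can never exceed the SAD, so a candidate whose bound is already no better than the current best can be skipped without changing the result. The name `pruned_full_search` is hypothetical, and the sketch works with SAD; the MAD is the SAD divided by the number of pixels in the block.

```python
def pruned_full_search(reference, search_area):
    """Full search that skips candidates ruled out by a cheap lower bound.

    Returns ((top, left, sad), skipped), where skipped counts the candidates
    whose full SAD was never computed.
    """
    n, m = len(reference), len(reference[0])
    ref_sum = sum(sum(row) for row in reference)
    best, skipped = None, 0
    for top in range(len(search_area) - n + 1):
        for left in range(len(search_area[0]) - m + 1):
            cand_sum = sum(
                sum(row[left:left + m]) for row in search_area[top:top + n]
            )
            # |sum(ref) - sum(cand)| <= SAD, so this bound is conservative.
            if best is not None and abs(ref_sum - cand_sum) >= best[2]:
                skipped += 1
                continue
            score = sum(
                abs(reference[r][c] - search_area[top + r][left + c])
                for r in range(n)
                for c in range(m)
            )
            if best is None or score < best[2]:
                best = (top, left, score)
    return best, skipped


# Hypothetical 4x4 search area; the reference block is cut from (1, 2).
area = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
block = [[6, 7], [10, 11]]
print(pruned_full_search(block, area))
```

Because the bound never overestimates the true distortion, the selected motion vector is the same as with an exhaustive search, which is the accuracy-preserving property the abstract claims for its own technique.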

VLSI Design of Soft Decision Viterbi Decoder Using Systolic Array Architecture

  • Kim, Ki-Bo;Kim, Jong-Tae
    • Proceedings of the KIEE Conference / 1999.07g / pp.3199-3201 / 1999
  • Convolutional coding with Viterbi decoding is known as a powerful forward error correction method among the many channel coding methods. This paper presents a soft-decision Viterbi decoder with a systolic array trace-back architecture [1]. Soft decision is known to be more effective than hard decision, and most digital communication systems use it. The advantage of a systolic array decoder is that the trace-back operation can proceed continuously through an array of registers in a pipelined fashion, instead of waiting for the entire trace-back procedure to complete at each iteration. It may therefore be suitable for faster communication systems. We describe the operation of each module of the decoder and present the results of logic synthesis and functional simulation.


A VLSI Architecture for Fast Motion Estimation Algorithm

  • 이재헌;나종범
    • Journal of Broadcast Engineering / v.3 no.1 / pp.85-92 / 1998
  • The block matching algorithm is the most popular motion estimation method in image sequence coding. In this paper, we propose a VLSI architecture for implementing a recently proposed fast block matching algorithm that uses the spatial correlation of motion vectors and a hierarchical search scheme. The proposed architecture consists of a basic searching unit based on a systolic array and two shift register arrays, and it covers a search range of -32 to +31. Using the basic searching unit repeatedly reduces the number of gates required. For the basic searching unit, a suitable systolic array can be selected from various conventional ones by trading off speed against hardware cost; in this paper, a structure is selected that minimizes hardware cost. The overall architecture is fast enough for low bit-rate applications (frame size of $352{\times}288$, 30 frames/sec) and can be implemented with fewer than 20,000 gates. Moreover, by simply modifying the basic searching unit, the architecture can be used for higher bit-rate applications with a frame size of $720{\times}480$ at 30 frames/sec.


A VLSI Architecture of Systolic Array for FFT Computation

  • 신경욱;최병윤;이문기
    • Journal of the Korean Institute of Telematics and Electronics / v.25 no.9 / pp.1115-1124 / 1988
  • A two-dimensional systolic array for the fast Fourier transform (FFT) with a regular and recursive VLSI architecture is presented. The array is built from identical processing elements (PEs) arranged in a mesh and, owing to its modularity, can be expanded to an arbitrary size. Each processing element consists of two data routing units, a butterfly arithmetic unit, and a simple control unit. The array computes the FFT through three procedures: I/O pipelining, data shuffling, and butterfly arithmetic. By exploiting parallelism, pipelining, and local communication geometry during data movement, the two-dimensional systolic array eliminates the global and irregular communication problems that have been a limiting factor in VLSI implementations of FFT processors. The systolic array executes a half butterfly arithmetic based on distributed arithmetic, which carries out multiplication using only adders. The systolic array also provides 100% PE activity, i.e., none of the PEs is idle at any time. A chip for the half butterfly arithmetic, consisting of two BLC adders and registers, has been fabricated in a 3-µm single-metal P-well CMOS technology. With a half butterfly arithmetic execution time of about 500 ns, obtained by critical path delay simulation, the total FFT execution time for 1024 points is estimated at about 16.6 µs at a clock frequency of 20 MHz. A one-PE chip expandable to any array size is being fabricated in a 2-µm double-metal P-well CMOS process. The chip was laid out using a standard cell library and a BLC adder macrocell with the aid of auto-routing software. It consists of around 6000 transistors and 68 I/O pads on an area of $3.4 \times 2.8\ \mathrm{mm}^2$. A built-in self-testing circuit, BILBO (Built-In Logic Block Observer), was employed at the expense of 3% hardware overhead.
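
The data shuffling and butterfly arithmetic steps named in this abstract correspond, in software terms, to the bit-reversal permutation and butterfly stages of an iterative radix-2 FFT. The sketch below is a sequential illustration under that assumption: the function names are hypothetical, and it uses ordinary complex multiplication, not the adder-only distributed arithmetic of the paper.

```python
import cmath


def bit_reverse_shuffle(values):
    """Data shuffling: move element i to the index with its bits reversed."""
    n = len(values)
    bits = n.bit_length() - 1
    out = [0] * n
    for i, v in enumerate(values):
        rev = int(format(i, f"0{bits}b")[::-1], 2) if bits else 0
        out[rev] = v
    return out


def fft(values):
    """Iterative radix-2 decimation-in-time FFT (length must be a power of two)."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    data = [complex(v) for v in bit_reverse_shuffle(values)]
    size = 2
    while size <= n:
        half = size // 2
        step = cmath.exp(-2j * cmath.pi / size)  # twiddle increment for this stage
        for start in range(0, n, size):
            w = 1 + 0j
            for k in range(half):
                # One butterfly: combine a pair of values with a twiddle factor.
                upper = data[start + k]
                lower = data[start + k + half] * w
                data[start + k] = upper + lower
                data[start + k + half] = upper - lower
                w *= step
        size *= 2
    return data


print(fft([1, 0, 0, 0]))
```

For an N-point transform this performs (N/2)·log2(N) butterflies, which is the work a mesh of identical PEs would share among its elements.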


A New N-time Systolic Array Architecture for the Vector Median Filter

  • Yang, Yeong-Yil
    • Journal of the Institute of Convergence Signal Processing / v.8 no.4 / pp.293-296 / 2007
  • In this paper, we propose a systolic array architecture for the vector median filter. In color image processing, the vector signal (i.e., the color) consists of three elements: red, green, and blue. The vector median filter is effective because it exploits the correlation among the red, green, and blue elements. The proposed architecture computes the vector median of N vector signals in (N+2) clock periods, compared with (3N+1) clock periods for the previous method. In addition, the input vector signals can be loaded serially in the proposed architecture, whereas the previous method requires all N input vector signals to be loaded in parallel at the first clock. The proposed architecture is implemented on an FPGA.
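
As a point of reference for what the filter computes, the vector median of N vectors is the input vector with the smallest total distance to all the others. The sketch below is a direct software version under stated assumptions: the L1 distance is chosen here for simplicity (the abstract does not name the norm), and the function name `vector_median` is hypothetical.

```python
def vector_median(vectors):
    """Return the input vector whose summed L1 distance to all inputs is smallest."""
    def l1(u, v):
        return sum(abs(a - b) for a, b in zip(u, v))

    costs = [sum(l1(v, other) for other in vectors) for v in vectors]
    return vectors[costs.index(min(costs))]


# Hypothetical RGB window containing one outlier pixel (200, 0, 0).
window = [(10, 10, 10), (12, 11, 10), (200, 0, 0), (11, 10, 12), (9, 10, 11)]
print(vector_median(window))
```

The result is always one of the input colors, so the filter cannot introduce a color that was absent from the window, unlike filtering the red, green, and blue channels separately.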


An Implementation of Digital Neural Network Using Systolic Array Processor (Korean title: Realization of a Digital Neural Network Circuit Using a Residue Number System)

  • 윤현식;조원경
    • Journal of the Korean Institute of Telematics and Electronics B / v.30B no.2 / pp.44-50 / 1993
  • In this paper, we present an array processor for implementing digital neural networks. The back-propagation model can be formulated as a sequence of matrix-vector multiplications followed by a prespecified thresholding operation. This procedure suits an array processor because it can be executed recursively and repeatedly. We propose a systolic array circuit architecture based on the residue number system (RNS) to realize efficient arithmetic circuits for matrix-vector multiplication and for computing the sigmoid function. The proposed design method is expected to be applicable to neural networks because it can be realized with currently available VLSI technology.
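
The appeal of a residue number system for matrix-vector multiplication is that each modulus forms an independent, carry-free channel. The sketch below illustrates this under assumptions made here: the moduli (7, 11, 13) are an arbitrary pairwise-coprime choice, not the paper's, all values are non-negative, every result must stay below the product of the moduli (1001), and the function names are hypothetical.

```python
from math import prod

MODULI = (7, 11, 13)  # assumed moduli; dynamic range is 7 * 11 * 13 = 1001


def from_rns(residues):
    """Reconstruct an integer from its residues with the Chinese remainder theorem."""
    total = prod(MODULI)
    value = 0
    for r, m in zip(residues, MODULI):
        partial = total // m
        value += r * partial * pow(partial, -1, m)  # modular inverse (Python 3.8+)
    return value % total


def rns_matvec(matrix, vector):
    """Matrix-vector product computed independently in each residue channel."""
    result = []
    for row in matrix:
        channels = []
        for m in MODULI:
            acc = 0
            for w, x in zip(row, vector):
                acc = (acc + (w % m) * (x % m)) % m  # small, carry-free arithmetic
            channels.append(acc)
        result.append(from_rns(channels))
    return result


print(rns_matvec([[1, 2, 3], [4, 5, 6]], [7, 8, 9]))
```

Each channel only ever handles numbers smaller than its modulus, which is why the multiply-accumulate cells can be narrow; the cost is the conversion back to an ordinary integer at the end.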


Romberg's Integration Using a Systolic Array

  • 박덕원
    • Journal of the Korea Society of Computer and Information / v.3 no.4 / pp.55-62 / 1998
  • This paper proposes a systolic array architecture for computing Romberg's integration method. It consists of two stages of systolic arrays: one for integration by the trapezoidal rule and the other for Richardson's extrapolation. The proposed architecture is fast and regular. The paper illustrates how a "mathematical hardware" package, alongside software library routines, may become part of the mathematical problem solver's tool kit in the future.
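
The two stages mentioned in this abstract map directly onto the textbook form of Romberg's method, sketched below as plain sequential code. The function names and the default of five refinement levels are assumptions made for illustration; nothing here models the systolic data flow.

```python
import math


def trapezoid_estimates(f, a, b, levels):
    """Stage 1: trapezoidal estimates with 1, 2, 4, ... subintervals."""
    h = b - a
    current = 0.5 * h * (f(a) + f(b))
    estimates = [current]
    for level in range(1, levels):
        h /= 2
        # Only the new midpoints need evaluating at each refinement.
        new_points = sum(f(a + (2 * i + 1) * h) for i in range(2 ** (level - 1)))
        current = 0.5 * current + h * new_points
        estimates.append(current)
    return estimates


def richardson(estimates):
    """Stage 2: repeated Richardson extrapolation down to a single value."""
    column = list(estimates)
    k = 1
    while len(column) > 1:
        factor = 4 ** k
        column = [
            (factor * column[i + 1] - column[i]) / (factor - 1)
            for i in range(len(column) - 1)
        ]
        k += 1
    return column[0]


def romberg(f, a, b, levels=5):
    return richardson(trapezoid_estimates(f, a, b, levels))


# The integral of sin(x) over [0, pi] is exactly 2.
print(romberg(math.sin, 0.0, math.pi, 6))
```

The first stage produces a column of estimates and the second repeatedly combines neighboring entries, a regular dependence pattern of the kind that suits an array of identical cells.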
