• Title/Summary/Keyword: systolic array architecture

Search Result 62, Processing Time 0.026 seconds

A Systolic Array for Ordinary Differential Equations (상미분 방정식을 위한 시스토릭어레이)

  • 박덕원
    • Journal of the Korea Society of Computer and Information
    • /
    • v.8 no.3
    • /
    • pp.66-72
    • /
    • 2003
  • An ordinary differential equation in analytical numerics is utilized to some applications, for example, physics, mechanical engineering, electrical engineering, thermodynamics and etc. But this equation has problems a lots to process in the real time processing by software method. This paper is proposed a systolic Arrays architecture for solving the Runge-Kutta method. it is one of method for solving an ordinary differential equation. the proposed its architecture is very high speed and regular. this hardware proposed in this paper may be part of the mathematical problem solver's tool kit in the future and may be available to many applications in the engineering.

  • PDF

Design of FFT processor with systolic architecture (시스토릭 아키텍쳐를 갖는 FFT 프로세서의 설계)

  • Kang, B.H.;Jeong, S.W.;Lee, J.K.;Choi, B.Y.;Shin, K.W.;Lee, M.K.
    • Proceedings of the KIEE Conference
    • /
    • 1987.07b
    • /
    • pp.1488-1491
    • /
    • 1987
  • This paper describes 16-point FFT processor using systolic array and its implementation into VLSI. Designed FFT processor executes FFT/IFFT arithmetic under mode control and consists of cell array, array controller and input/output buffer memory. For design for testibility, we added built-in self test circuit into designed FFT processor. To verify designed 16-point FFT processor, logic simulation was performed by YSLOG on MICRO-VAXII. From the simulation results, it is estimated that the proposed FFT processor can perform 16-point FFT in about 4400[ns].

  • PDF

Design of an Efficient VLSI Architecture of SADCT Based on Systolic Array (시스톨릭 어레이에 기반한 SADCT의 효율적 VLSl 구조설계)

  • Gang, Tae-Jun;Jeong, Ui-Yun;Gwon, Sun-Gyu;Ha, Yeong-Ho
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.38 no.3
    • /
    • pp.282-291
    • /
    • 2001
  • In this paper, an efficient VLSI architecture of Shape Adaptive Discrete Cosine Transform(SADCT) based on systolic array is proposed. Since transform size in SADCT is varied according to the shape of object in each block, it are dropped that both usability of processing elements(PE´s) and throughput rate in time-recursive SADCT structure. To overcome these disadvantages, it is proposed that the architecture based on a systolic way structure which doesn´t need memory. In the proposed architecture, throughput rate is improved by consecutive processing of one-dimensional SADCT without memory and PE´s in the first column are connected to that in the last one for improvement of usability of PE. And input data are put into each column of PE in parallel according to the maximum data number in each rearranged block. The proposed architecture is described by VHDL. Also, its function is evaluated by MentorTM. Even though the hardware complexity is somewhat increased, the throughput rate is improved about twofold.

  • PDF

Performance Analysis of Extended QRD-RLS Equalizer (Extended QRD-RLS 등화기의 성능 분석)

  • Jang, Jin-Kyu;Jang, Young-Beom
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.48 no.8
    • /
    • pp.27-35
    • /
    • 2011
  • In this paper, performances of the extended QRD-RLS equalizer is analyzed. Since the extended QRD-RLS equalizer is efficiently implemented by systolic array architecture, we analyze performances of this structure with signals of different lengths. By multiplying the frequency responses of the unknown channel and proposed equalizer, we observed the flatness of the overall system function. Through the simulation, it is shown that the performance of the extended QRD-RLS equalizer is remarkably increased with input signals of length 16.

A Fast and Scalable Priority Queue Hardware Architecture for Packet Schedulers (패킷 스케줄러를 위한 빠르고 확장성 있는 우선순위 큐의 하드웨어 구조)

  • Kim, Sang-Gyun;Moon, Byung-In
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.44 no.10
    • /
    • pp.55-60
    • /
    • 2007
  • This paper proposes a fast and scalable priority queue architecture for use in high-speed networks which supports quality of service (QoS) guarantees. This architecture is cost-effective since a single queue can generate outputs to multiple out-links. Also, compared with the previous multiple systolic array priority queues, the proposed queue provides fast output generation which is important to high-speed packet schedulers, using a special control block. In addition this architecture provides the feature of high scalability.

A Design and Implementation of Real-time Video frame data Processing control for Block Matching Algorithm (고속블럭정합 알고리즘을 위한 실시간 영상프레임 데이터 처리 제어 방법의 설계 및 구현)

  • 이강환;황호정
    • Proceedings of the IEEK Conference
    • /
    • 2001.06b
    • /
    • pp.373-376
    • /
    • 2001
  • This paper has been studied a real-time video frame data processing control that used the linear systolic array for motion estimation. The proposed data control processing provides to the input data into the multiple processor array unit(MPAU) from search area and reference block data. The proposed data control architecture has based on two slice band for input data processing. And it has no required external control logic blocks for input data as like reference block or search area data.

  • PDF

Block matching algorithm using quantization (양자화를 이용한 블록 정합 알고리즘에 대한 연구)

  • Lee, Young;Park, Gwi-Tae
    • Journal of the Korean Institute of Telematics and Electronics S
    • /
    • v.34S no.2
    • /
    • pp.43-51
    • /
    • 1997
  • In this paper, we quantize the image data to simplify the systolic array architecture for block matching algorithm. As the number of bits for pixel data to be processed is reduced by quantization, one can simplify the hardware of systolic array. Especially, if the bit serial input is used, one can even more simplify the structure of processing element. First, we analize the effect of quantization to a block matching. then we show the structure of quantizer and processing element when bit serial input is used. The simulation results applied to standard images have shown that the proposed block matching method has less prediction error than the conventional high speed algorithm.

  • PDF

High-Performance Givens Rotation-based QR Decomposition Architecture Applicable for MIMO Receiver (MIMO 수신기에 적용 가능한 고성능 기븐스 회전 기반의 QR 분해 하드웨어 구조)

  • Yoon, Ji-Hwan;Lee, Min-Woo;Park, Jong-Sun
    • Journal of the Institute of Electronics Engineers of Korea SC
    • /
    • v.49 no.3
    • /
    • pp.31-37
    • /
    • 2012
  • This paper presents an efficient hardware architecture to enable the high-speed Givens rotation-based QR decomposition. The proposed architecture achieves a highly parallel givens rotation process by maximizing the number of pivots selected for parallel zero-insertions. Sign-select lookahed (SSL)-CORDIC is also efficiently used for the high-speed givens rotation. The performance of QR decomposition hardware considerably increases compared to the conventional triangular systolic array (TSA) architecture. Moreover, the circuit area of QR decomposition hardware was reduced by decreasing the number of flip-flops for holding the pre-computed results during the decomposition process. The proposed QR decomposition hardware was implemented using TSMC $0.25{\mu}m$ technology. The experimental results show that the proposed architecture achieves up to 70 % speed-up over the TACR/TSA-based architecture for the $8{\times}8$ matrix decomposition.

Design of an Area-Efficient Reed-Solomon Decoder using Pipelined Recursive Technique (파이프라인 재귀적인 기술을 이용한 면적 효율적인 Reed-Solomon 복호기의 설계)

  • Lee, Han-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.42 no.7 s.337
    • /
    • pp.27-36
    • /
    • 2005
  • This paper presents an area-efficient architecture to implement the high-speed Reed-Solomon(RS) decoder, which is used in a variety of communication systems such as wireless and very high-speed optical communications. We present the new pipelined-recursive Modified Euclidean(PrME) architecture to achieve high-throughput rate and reducing hardware-complexity using folding technique. The proposed pipelined recursive architecture can reduce the hardware complexity about 80$\%$ compared to the conventional systolic-array and fully-parallel architecture. The proposed RS decoder has been designed and implemented with the 0.13um CMOS technology in a supply voltage of 1.2 V. The result show that total number of gate is 393 K and it has a data processing rate of S Gbits/s at clock frequency of 625 MHz. The proposed area-efficient architecture can be readily applied to the next generation FEC devices for high-speed optical communications as well as wireless communications.

New Enhanced Degree Computationless Modified Euclid's Algorithm and its Architecture for Reed-Solomon decoders (Reed-Solomon 복호기를 위한 새로운 E-DCME 알고리즘 및 하드웨어 구조)

  • Baek, Jae-Hyun;SunWoo, Myung-Hoon
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.32 no.8A
    • /
    • pp.820-826
    • /
    • 2007
  • This paper proposes an enhanced degree computationless modified Euclid's(E-DCME) algorithm and its architecture for Reed-Solomon decoders. The proposed E-DCME algorithm has shorter critical path delay that is $T_{mult}+T_{add}+T_{mux}$ compared with the existing modified Euclid's algorithm and the degree computationless modified Euclid's(DCME) algorithm since it uses new initial conditions. The proposed E-DCME architecture employing a systolic array requires only 2t-1 clock cycles to solve the key equation without initial latency. In addition, the E-DCME architecture consisting of 3t basic cells has regularity and scalability since it uses only one processing element. The E-DCME architecture using the $0.18{\mu}m$ Samsung standard cell library consists of 18,000 gates.