• Title/Summary/Keyword: systolic array architecture

Search Result 62, Processing Time 0.021 seconds

Systolic Arrays for Lattice-Reduction-Aided MIMO Detection

  • Wang, Ni-Chun;Biglieri, Ezio;Yao, Kung
    • Journal of Communications and Networks
    • /
    • v.13 no.5
    • /
    • pp.481-493
    • /
    • 2011
  • Multiple-input multiple-output (MIMO) technology provides high data rate and enhanced quality of service for wireless communications. Since the benefits from MIMO result in a heavy computational load in detectors, the design of low-complexity suboptimum receivers is currently an active area of research. Lattice-reduction-aided detection (LRAD) has been shown to be an effective low-complexity method with near-maximum-likelihood performance. In this paper, we advocate the use of systolic array architectures for MIMO receivers, and in particular we exhibit one of them based on LRAD. The "Lenstra-Lenstra-Lov$\acute{a}$sz (LLL) lattice reduction algorithm" and the ensuing linear detections or successive spatial-interference cancellations can be located in the same array, which is considerably hardware-efficient. Since the conventional form of the LLL algorithm is not immediately suitable for parallel processing, two modified LLL algorithms are considered here for the systolic array. LLL algorithm with full-size reduction-LLL is one of the versions more suitable for parallel processing. Another variant is the all-swap lattice-reduction (ASLR) algorithm for complex-valued lattices, which processes all lattice basis vectors simultaneously within one iteration. Our novel systolic array can operate both algorithms with different external logic controls. In order to simplify the systolic array design, we replace the Lov$\acute{a}$sz condition in the definition of LLL-reduced lattice with the looser Siegel condition. Simulation results show that for LR-aided linear detections, the bit-error-rate performance is still maintained with this relaxation. Comparisons between the two algorithms in terms of bit-error-rate performance, and average field-programmable gate array processing time in the systolic array are made, which shows that ASLR is a better choice for a systolic architecture, especially for systems with a large number of antennas.

Efficient Architecture of an n-bit Radix-4 Modular Multiplier in Systolic Array Structure (시스톨릭 어레이 구조를 갖는 효율적인 n-비트 Radix-4 모듈러 곱셈기 구조)

  • Park, Tae-geun;Cho, Kwang-won
    • The KIPS Transactions:PartA
    • /
    • v.10A no.4
    • /
    • pp.279-284
    • /
    • 2003
  • In this paper, we propose an efficient architecture for radix-4 modular multiplication in systolic array structure based on the Montgomery's algorithm. We propose a radix-4 modular multiplication algorithm to reduce the number of iterations, so that it takes (3/2)n+2 clock cycles to complete an n-bit modular multiplication. Since we can interleave two consecutive modular multiplications for 100% hardware utilization and can start the next multiplication at the earliest possible moment, it takes about only n/2 clock cycles to complete one modular multiplication in the average. The proposed architecture is quite regular and scalable due to the systolic array structure so that it fits in a VLSI implementation. Compared to conventional approaches, the proposed architecture shows shorter period to complete a modular multiplication while requiring relatively less hardware resources.

A Study on the design of two's complement bit-serial FIR filter with systolic array architecture (Systolic Array를 이용한 Two's Complement Bit-Serial Fir 필터 설계에 관한 연구)

  • 엄두섭;박노경;차균현
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.14 no.5
    • /
    • pp.442-452
    • /
    • 1989
  • This Paper describes the impleentation of two's complement bit-serial FIR filter with systolic architectur. The filter coefficients are represented as sign and magnitude form and the input data is represented as two's complement form. We use systolic array to obtain high operation speed so this FIR filter sucessfully operates in real-time environment.

  • PDF

A Hardware Architecture of SEED Algorithm with 320 Mbps (320 Mbps SEED 알고리즘의 하드웨어 구조)

  • Lee Haeng-Woo;Ra Yoo-Chan
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.10 no.2
    • /
    • pp.291-297
    • /
    • 2006
  • This paper describes the architecture for reducing its size and increasing the computation rate in implementing the SEED algorithm of a 128-bit block cipher, and the result of the circuit design. In order to increase the computation rate, it is used the architecture of the pipelined systolic array. This architecture is a simple thing without involving any buffer at the input and output part. By this circuit, it can be recorded 320 Mbps encryption rate at 10 MHz clock. We designed the circuits with goals of the high-speed computations and the simplified structures.

A New Systolic Array Architecture for the OS CFAR Processor (OS CFAR 프로세서에 대한 새로운 시스톨릭 어레이 구조)

  • 송재필
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1991.06a
    • /
    • pp.163-168
    • /
    • 1991
  • In this paper, we propose a new systolic architecture for the order statistics(OS) constant false alarm rate(CFAR) processor. In the proposed architecture, each processing element(PE) can compare two reference data cells with one test cell simultaneously in each clock cycle. So the utilization of each PE in this architecture is 100% whereas the utilization of each PE in the systolic architecture previously reported by Ritcey and Hwang is 50% because of one clock delay between two adjacent PE's active in computation. This can speed up the data processing rate by a factor of two. With this architecture, we can obtain the reduced number of communication links between adjacent PE's and reduction of the latency by half in comparison with the one proposed by Ritcey and Hwang.

  • PDF

Systolic arry archtecture for full-search mothion estimation (완전탐색에 의한 움직임 추정기 시스토릭 어레이 구조)

  • 백종섭;남승현;이문기
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.31B no.12
    • /
    • pp.27-34
    • /
    • 1994
  • Block matching motion estimation is the most widely used method for motion compensated coding of image sequences. Based on a two dimensional systolic array, VLSI architecture and implementation of the full search block matching algorithm are described in this paper. The proposed architecture improves conventional array architecture by designing efficient processing elements that can control the data prodeuced by efficient search window division method. The advantages are that 1) it allows serial input to reduce pin counts for efficient composition of local memories but performs parallel processing. 2) It is flexible and can adjust to dimensional changes of search windows with simple control logic. 3) It has no idel time during the operation. 4) It can operate in real/time for low and main level in MPEG-2 standard. 5) It has modular and regular structure and thus is sutiable for VLSI implementation.

  • PDF

A Digit Serial Multiplier Over GF(2m)Based on the MSD-first Algorithm (GF(2m)상의 MSD 우선 알고리즘 기반 디지트-시리얼 곱셈기)

  • Kim, Chang-Hoon;Kim, Soon-Cheol
    • The KIPS Transactions:PartA
    • /
    • v.15A no.3
    • /
    • pp.161-166
    • /
    • 2008
  • In this paper, an efficient digit-serial systolic array is proposed for multiplication in finite field GF($2^m$) using the polynomial basis representation. The proposed systolic array is based on the most significant digit first (MSD-first) multiplication algorithm and produces multiplication results at a rate of one every "m/D" clock cycles, where D is the selected digit size. Since the inner structure of the proposed multiplier is tree-type, critical path increases logarithmically proportional to D. Therefore, the computation delay of the proposed architecture is significantly less than previously proposed digit-serial systolic multipliers whose critical path increases proportional to D. Furthermore, since the new architecture has the features of a high regularity, modularity, and unidirectional data flow, it is well suited to VLSI implementation.

Self-Testing for FFT processor with systolic array architecture (시스토릭 어레이 구조를 갖는 FFT 프로세서에 대한 Self-Testing)

  • Lee, J.K.;Kang, B.H.;Choi, B.I.;Shin, K.U.;Lee, M.K.
    • Proceedings of the KIEE Conference
    • /
    • 1987.07b
    • /
    • pp.1503-1506
    • /
    • 1987
  • This paper proposes the self test method for 16 point FFT processor with systolic array architecture. To test efficiently and solve the increased hardware problems due to built-in self test, we change the normal registers into Linear Feedback Shift Registers(LFSR). LFSR can be served as a test pattern generator or a signature analyzer during self test operation, while LFSR a ordering register or a accumulator during normal operation. From the results of logic simulation for 16 point FFT processor by YSLOG, the total time is estimated in about. 21.4 [us].

  • PDF

A VLSI architecture for fast motion estimation algorithm (고속 움직임 추정 알고리즘에 적합한 VLSI 구조 연구)

  • 이재헌;라종범
    • Proceedings of the IEEK Conference
    • /
    • 1998.06a
    • /
    • pp.717-720
    • /
    • 1998
  • In this paper, we propose a VLSI architecture for implementing a crecently proposed fast block matching algorithm, which is called the HSBMA3S. The proposed architecture consists of a systolic array based basic unit and two shift register arrays. And it covers a search range of -32 ~+31. By using a basic unit repeatedly, we can redcue the number of gates. To implement the basic unit, we can select one among various conventional systolic arrays by trading-off between speed and hardware cost. In this paper, the architecture for the basic unit is selected so that the hardware cost can be minimized. The proposed architecture is fast enough for low bit-rate applications (frame size of 352x288, 30 frames/sec) and can be implemented by less than 20,000 gates. Moreover, by simply modifying the basic unit, the architecture can be used for the higher bit-rate application of the frame size of 720*480 and 30 frames/sec.

  • PDF