• Title/Abstract/Keywords: Systolic array

144 search results; processing time 0.032 s

Bit-level 1-dimensional systolic modular multiplication

  • 최성욱;우종호
    • 전자공학회논문지B / Vol. 33B, No. 9 / pp. 62-69 / 1996
  • In this paper, a bit-level 1-dimensional systolic array for modular multiplication is designed. First, a parallel algorithm and its data dependence graph, suitable for array design, are derived from Walter's method, which is based on the Montgomery algorithm. Following a systematic procedure for systolic array design, four 1-dimensional systolic arrays are obtained and evaluated by various criteria. The optimal 1-dimensional systolic array is then designed by modifying the array derived from the [0,1] projection direction: control logic is added and the communication paths of data A are serialized. It has a constant number of I/O channels, which makes the module expandable, and its unidirectional paths make fault tolerance easy to achieve. It is suitable for the RSA cryptosystem, which processes many consecutive message blocks of large size.

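
The bit-level Montgomery recurrence that such arrays pipeline can be sketched sequentially as follows. This is a minimal textbook radix-2 version with a final conditional subtraction, not Walter's variant and not the paper's array; the function name `montgomery_multiply` and its parameters are hypothetical.

```python
def montgomery_multiply(a: int, b: int, n: int, k: int) -> int:
    """Return a * b * 2^(-k) mod n for an odd modulus n with a, b < n < 2^k.

    Assumption: plain radix-2 Montgomery multiplication, processing one
    bit of `a` per iteration (the step a bit-level array would pipeline).
    """
    if n % 2 == 0:
        raise ValueError("modulus must be odd")
    s = 0
    for i in range(k):
        a_i = (a >> i) & 1      # current bit of operand A
        s += a_i * b            # conditional addition of B
        q = s & 1               # quotient bit: makes s + q*n even
        s = (s + q * n) >> 1    # exact division by 2
    # The loop keeps s < n + b < 2n, so one subtraction suffices.
    return s - n if s >= n else s


if __name__ == "__main__":
    print(montgomery_multiply(100, 200, 239, 8))
```

The result carries the factor 2^(-k), so a caller would normally convert operands into and out of the Montgomery domain around a chain of multiplications.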

A Systolic Array for High-Speed Computing of Full Search Block Matching Algorithm

  • Jung, Soon-Ho;Woo, Chong-Ho
    • 한국멀티미디어학회논문지 / Vol. 14, No. 10 / pp. 1275-1286 / 2011
  • This paper proposes a high-speed systolic array architecture for the full search block matching algorithm (FBMA). The pixels of the search area for a reference block are input only once to find the matching candidate block and are reused to compute the sum of absolute differences (SAD) for adjacent candidate blocks. Each row of the designed 2-dimensional systolic array compares the reference block with the adjacent blocks in the same row of the search area. The lower rows of the array receive pixels from the upper row and compute the SAD by reusing the overlapping pixels of candidate blocks within the same column of the search area. The designed array has no data broadcasting and no global paths. A comparison with existing architectures shows that this array is superior in throughput, though it requires slightly more hardware.
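
As a software reference for what the array computes, full-search block matching with the SAD criterion can be sketched as below. This is a plain sequential version under assumed conventions (square reference block, images as lists of rows); the names `sad` and `full_search` are hypothetical and nothing here models the array's data reuse.

```python
def sad(ref, search, top, left):
    """Sum of absolute differences between `ref` and the candidate
    block of `search` whose top-left corner is (top, left)."""
    n = len(ref)
    return sum(
        abs(ref[r][c] - search[top + r][left + c])
        for r in range(n)
        for c in range(n)
    )


def full_search(ref, search):
    """Try every candidate position and return (best_sad, top, left)."""
    n = len(ref)
    best = None
    for top in range(len(search) - n + 1):
        for left in range(len(search[0]) - n + 1):
            cost = sad(ref, search, top, left)
            if best is None or cost < best[0]:
                best = (cost, top, left)
    return best


if __name__ == "__main__":
    area = [[4 * r + c for c in range(4)] for r in range(4)]
    block = [[6, 7], [10, 11]]
    print(full_search(block, area))
```

Adjacent candidates share most of their pixels, which is the redundancy the abstract's array exploits by passing pixels between rows instead of re-reading them.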

Systolic Arrays for Constructing Static and Dynamic Voronoi Diagrams

  • 오승준
    • ETRI Journal / Vol. 10, No. 3 / pp. 125-140 / 1988
  • Computational geometry has wide applications in pattern recognition, image processing, VLSI design, and computer graphics. Voronoi diagrams in computational geometry possess many important properties that are related to other geometric structures of a set of points. In this paper, the design of systolic algorithms for static and dynamic Voronoi diagrams is considered. The major motivation for developing the systolic architecture is VLSI implementation. A new systematic transform technique for designing systolic arrays, in particular for problems in computational geometry, is proposed. Following this procedure, a type T systolic array architecture and associated systolic algorithms are designed for constructing Voronoi diagrams. The functions of the cells in the array are also specified. The resulting systolic array achieves the maximal throughput with O(n) computational complexity.


Systolic Arrays for Lattice-Reduction-Aided MIMO Detection

  • Wang, Ni-Chun;Biglieri, Ezio;Yao, Kung
    • Journal of Communications and Networks / Vol. 13, No. 5 / pp. 481-493 / 2011
  • Multiple-input multiple-output (MIMO) technology provides high data rates and enhanced quality of service for wireless communications. Since the benefits of MIMO come with a heavy computational load in detectors, the design of low-complexity suboptimum receivers is currently an active area of research. Lattice-reduction-aided detection (LRAD) has been shown to be an effective low-complexity method with near-maximum-likelihood performance. In this paper, we advocate the use of systolic array architectures for MIMO receivers, and in particular we exhibit one of them based on LRAD. The Lenstra-Lenstra-Lovász (LLL) lattice reduction algorithm and the ensuing linear detections or successive spatial-interference cancellations can be located in the same array, which is considerably hardware-efficient. Since the conventional form of the LLL algorithm is not immediately suitable for parallel processing, two modified LLL algorithms are considered here for the systolic array. The LLL algorithm with full-size reduction is one of the versions more suitable for parallel processing. Another variant is the all-swap lattice-reduction (ASLR) algorithm for complex-valued lattices, which processes all lattice basis vectors simultaneously within one iteration. Our novel systolic array can operate both algorithms with different external logic controls. In order to simplify the systolic array design, we replace the Lovász condition in the definition of an LLL-reduced lattice with the looser Siegel condition. Simulation results show that for LR-aided linear detections, the bit-error-rate performance is still maintained with this relaxation. Comparisons between the two algorithms in terms of bit-error-rate performance and average field-programmable gate array processing time in the systolic array show that ASLR is a better choice for a systolic architecture, especially for systems with a large number of antennas.
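
The difference between the Lovász condition and the looser Siegel condition can be illustrated with a small check on a real-valued basis. This is a sketch under assumptions: the paper works with complex-valued lattices, and the function names, the parameter `delta = 0.75`, and the Siegel threshold `xi = delta - 1/4` are illustrative choices rather than the paper's settings. For a size-reduced real basis (|mu| <= 1/2), the Lovász condition with `delta` implies the Siegel condition with `xi = delta - 1/4`.

```python
def gram_schmidt(basis):
    """Return (squared norms of the orthogonalized vectors, mu coefficients)
    for a real basis given as a list of row vectors."""
    ortho, norms, mu = [], [], []
    for i, b in enumerate(basis):
        v = list(b)
        mu_row = []
        for j in range(i):
            coeff = sum(x * y for x, y in zip(b, ortho[j])) / norms[j]
            mu_row.append(coeff)
            v = [vi - coeff * oj for vi, oj in zip(v, ortho[j])]
        ortho.append(v)
        norms.append(sum(x * x for x in v))
        mu.append(mu_row)
    return norms, mu


def lovasz_ok(norms, mu, k, delta=0.75):
    """Lovász condition between basis vectors k-1 and k."""
    return norms[k] >= (delta - mu[k][k - 1] ** 2) * norms[k - 1]


def siegel_ok(norms, k, xi=0.5):
    """Siegel condition: needs only the two squared norms, not mu."""
    return norms[k] >= xi * norms[k - 1]


if __name__ == "__main__":
    norms, mu = gram_schmidt([[1.0, 0.0], [0.0, 0.8]])
    print(lovasz_ok(norms, mu, 1), siegel_ok(norms, 1))
```

In this example the basis fails the Lovász test but passes the Siegel test. The Siegel check does not need the `mu` coefficient, which may be part of why it simplifies a hardware design.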

A linear systolic array based architecture for full-search block matching motion estimator

  • 김기현;이기철
    • 한국통신학회논문지 / Vol. 21, No. 2 / pp. 313-325 / 1996
  • This paper presents a new architecture for full-search block-matching motion estimation. The architecture is based on linear systolic arrays. High-speed operation is obtained by feeding reference data, search data, and control signals into the linear systolic array in a pipelined fashion. Input data are fed into the linear systolic array at half the processor speed, halving the required data bandwidth. The proposed architecture scales well with respect to the number of processors and the input bandwidth when the size of the reference block and the search range change.


Content-Addressable Systolic Array for Solving Tridiagonal Linear Equation Systems

  • 이병홍;김정선;채수환
    • 한국통신학회논문지 / Vol. 16, No. 6 / pp. 556-565 / 1991
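  • The linear system Ax=b, where A is an n×n tridiagonal matrix, is solved using the WZ factorization algorithm, and this algorithm is implemented on a CAM systolic array. To evaluate the array, an LU factorization algorithm is also presented and compared with the W, D, Z factorization algorithm; the WZ factorization algorithm shortened the execution time to roughly one quarter of that of the LU factorization algorithm. If each stage executed on the CAM systolic array is assumed to take 1 time step, 2n+1 time steps are required. The CAM data word contains the value of the matrix element, its row number, and information on the type and state of the operation. Because the processors operate systolically in a pipelined fashion, no central control is needed and data broadcasting is also avoided.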

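
The LU-based baseline mentioned in the abstract corresponds, for a tridiagonal matrix, to the familiar forward-elimination and back-substitution scheme (the Thomas algorithm). The sketch below is that sequential baseline only, assuming no pivoting is needed; it does not implement WZ factorization or the CAM array, and the function name and argument layout are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve Ax = b for tridiagonal A without pivoting.

    lower: sub-diagonal   (length n-1, lower[i-1] is A[i][i-1])
    diag:  main diagonal  (length n)
    upper: super-diagonal (length n-1, upper[i] is A[i][i+1])
    rhs:   right-hand side b (length n)
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    if n > 1:
        c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):  # forward elimination
        denom = diag[i] - lower[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i - 1] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


if __name__ == "__main__":
    print(solve_tridiagonal([1, 1], [2, 2, 2], [1, 1], [4, 8, 8]))
```

Both sweeps here are strictly sequential, each step depending on the previous one. WZ-type factorizations are commonly described as eliminating from both ends of the matrix at once, which fits the shorter execution time the abstract reports.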

Design of Problem Size-Independent Systolic Array for Polyadic-Nonserial Dynamic Programming

  • 우창호;신동석;정신일;권대형
    • 전자공학회논문지A / Vol. 30A, No. 3 / pp. 67-75 / 1993
  • In many practical applications of systolic arrays, the problem size (n) is commonly larger than the array size (M). In this case, the problem has to be partitioned into blocks that fit the array before it is processed. This paper presents a problem partitioning method for dynamic programming and a 2-dimensional systolic array suited to it. The designed array has two types of array configuration for processing the partitioned problem. A queue is designed to store and recirculate intermediate results at the correct location and time. The numbers of processing elements and queues required are M(3M+1)/2 and 4M, respectively. The total processing time is 2(M+1)+(n+10M+3)(n/M-1)(n/M-1)/6.

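
A commonly cited example of polyadic-nonserial dynamic programming is optimal matrix-chain parenthesization, where each entry depends on pairs of earlier entries. The sketch below uses that example as an assumption (the abstract does not name the paper's benchmark problem) and adds a helper that evaluates the resource counts quoted in the abstract. Both function names are hypothetical.

```python
def matrix_chain_cost(dims):
    """Minimum scalar multiplications to multiply a chain of matrices,
    where matrix i has shape dims[i] x dims[i+1]."""
    n = len(dims) - 1
    cost = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):          # sub-chain length
        for i in range(n - length + 1):
            j = i + length - 1
            # Each term combines two smaller sub-problems (polyadic).
            cost[i][j] = min(
                cost[i][k] + cost[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j)
            )
    return cost[0][n - 1]


def array_resources(m):
    """Processing elements and queues for array size M, using the
    formulas quoted in the abstract: M(3M+1)/2 and 4M."""
    return m * (3 * m + 1) // 2, 4 * m


if __name__ == "__main__":
    print(matrix_chain_cost([10, 30, 5, 60]))
    print(array_resources(2))
```

The resource counts depend only on M, not on n, which is the sense in which the array is independent of problem size.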

Comparison of High Speed Modular Multiplication and Design of Expansible Systolic Array

  • 추봉조;최성욱
    • 한국정보처리학회논문지 / Vol. 6, No. 5 / pp. 1219-1224 / 1999
  • This paper derives parallel Montgomery algorithms for modular multiplication based on Walter's and Iwamura's methods and compares the data dependence graphs of the two parallel algorithms. The comparison shows that Walter's parallel algorithm has a smaller computational index in its data dependence graph, so it is selected and used to compute spatial and temporal pipelining diagrams for each projection direction in order to design an expansible bit-level systolic array. We also evaluated the internal operation of the proposed expansible systolic array using the C++ language.


Design 3×3 Convolution Calculator with Systolic Array

  • 김형순;이준희;서영호
    • 한국방송∙미디어공학회: Conference Proceedings / 한국방송∙미디어공학회 2021 Fall Conference / pp. 221-222 / 2021
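  • This study implements the convolution unit used in a Convolutional Neural Network using a systolic array. Kernel values in fixed-point format and a continuous stream of inputs are fed into the unit, which is divided into two layers, and the output is checked for correctness. The unit was implemented in Verilog HDL, and the reference computation was carried out in Python.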

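
A Python reference computation of the kind described could look like the sketch below. It is an illustration under assumptions, not the authors' script: the fixed-point format (8 fractional bits), the absence of padding, the CNN convention of not flipping the kernel, and the names `to_fixed` and `conv3x3_fixed` are all hypothetical choices.

```python
def to_fixed(x, frac_bits=8):
    """Quantize a real value to a fixed-point integer (assumed format)."""
    return int(round(x * (1 << frac_bits)))


def conv3x3_fixed(image, kernel, frac_bits=8):
    """3x3 sliding-window convolution without padding.

    image:  2-D list of integer pixels
    kernel: 3x3 list of fixed-point integers with `frac_bits` fractional bits
    Returns integer outputs after shifting the fractional bits away.
    """
    h, w = len(image), len(image[0])
    out = []
    for r in range(h - 2):
        row = []
        for c in range(w - 2):
            acc = 0  # multiply-accumulate, as each array cell would do
            for i in range(3):
                for j in range(3):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc >> frac_bits)
        out.append(row)
    return out


if __name__ == "__main__":
    img = [[4 * r + c for c in range(4)] for r in range(4)]
    ones = [[to_fixed(1.0)] * 3 for _ in range(3)]
    print(conv3x3_fixed(img, ones))
```

Comparing such integer outputs with the HDL simulation results, value by value, is one straightforward way to confirm that a hardware unit produces the expected output.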

Design of High-Speed Correlator for a Binary CDMA

  • 구군서;정우경;문장식;류승문;이용석
    • 대한전자공학회: Conference Proceedings / 대한전자공학회 2003 Summer Conference Proceedings II / pp. 787-790 / 2003
  • This paper describes a high-speed correlator that can acquire synchronization quickly. The existing addition scheme is a binary adder tree architecture, which operates very slowly because of the many levels of logic required to compute the correlation [2][3]. This paper proposes several new architectures: a systolic array architecture, a simple pipeline architecture, and a block systolic array architecture [4][5]. The acquisition performance of the proposed architectures is analyzed and compared with that of the existing architecture. The comparison shows that the systolic array architecture and the block systolic array architecture reduce the timing delay by up to 73% and 31%, respectively, and that the simple pipeline architecture reduces it by up to 53%.

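
The logic-depth problem of the adder-tree baseline can be made concrete with a small behavioral model. This is a hedged sketch: the chip encoding (bits, with a match counted as +1 and a mismatch as -1) and the function names are assumptions, and the model counts tree levels only, not actual gate delays.

```python
def correlate(received, code):
    """Correlation of two equal-length bit sequences:
    +1 for each matching chip, -1 for each mismatch."""
    return sum(1 if r == c else -1 for r, c in zip(received, code))


def adder_tree_sum(values):
    """Sum `values` pairwise, level by level, as a binary adder tree
    would, and return (total, number_of_levels)."""
    if not values:
        raise ValueError("need at least one value")
    values = list(values)
    levels = 0
    while len(values) > 1:
        paired = [values[i] + values[i + 1] for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:
            paired.append(values[-1])  # odd element passes through
        values = paired
        levels += 1
    return values[0], levels


if __name__ == "__main__":
    print(correlate([1, 0, 1, 1], [1, 0, 0, 1]))
    print(adder_tree_sum([1] * 8))
```

The number of levels grows with the logarithm of the code length, and without pipeline registers all of them fall in one combinational path. Pipelined and systolic organizations shorten that path by registering intermediate sums.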