• 제목/요약/키워드: Processor Array

검색결과 234건 처리시간 0.03초

FFT Array Processor System with Easily Adjustable Computation speed and Hardware Complexity (계산속도와 하드웨어 양이 조절 용이한 FFT Array Processor 시스템)

  • Jae Hee Yoo
    • Journal of the Korean Institute of Telematics and Electronics A
    • /
    • 제30A권3호
    • /
    • pp.114-129
    • /
    • 1993
  • A FFT array processor algorithm and architecture which anc use a minumum required number of simple, duplicate multiplier-adder processing elements according to various computation speed, will be presented. It is based on the p fold symmetry in the radix p constant geometry FFT butterfly stage with shuffled inputs and normally ordered outputs. Also, a methodology to implement a high performance high radix FFT with VLSI by constructing a high radix processing element with the duplications of a simple lower radix processing element will be discussed. Various performances and the trade-off between computation speed and hardware complexity will be evaluated and compared. Bases on the presented architecture, a radix 2, 8 point FFT processing element chip has been designed and it structure and the results will be discusses.

  • PDF

The Design of a Code-String Matching Processor using an EWLD Algorithm (EWLD 알고리듬을 이용한 코드열 정합 프로세서의 설계)

  • 조원경;홍성민;국일호
    • Journal of the Korean Institute of Telematics and Electronics A
    • /
    • 제31A권4호
    • /
    • pp.127-135
    • /
    • 1994
  • In this paper we propose an EWLD(Enhanced Weighted Levenshtein Distance) algorithm to organize code-string pattern matching linear array processor based on the mappting to an one-dimensional array from a two-dimensional matching matrix, and design a processing element(PE) of the processor, N PEs are required instead of NS02T in the processor because of the mapping. Data input and output between PEs and all internal operations of each PE are performed in bit-serial fashion. The bit-serial operation consists of the computing of word distance (WD) by comparison and the selection of optimal code transformation path, and takes 22 clocks as a cycle. The layout of a PE is designed based on the double metal $1.5\mu$m CMOS rule. About 1,800 transistors consistute a processing element and 2 PEs are integrated on a 3mm$\times$3mm sized chip.

  • PDF

Design of Modular Exponentiation Processor for RSA Cryptography (RSA 암호시스템을 위한 모듈러 지수 연산 프로세서 설계)

  • 허영준;박혜경;이건직;이원호;유기영
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • 제10권4호
    • /
    • pp.3-11
    • /
    • 2000
  • In this paper, we design modular multiplication systolic array and exponentiation processor having n bits message black. This processor uses Montgomery algorithm and LR binary square and multiply algorithm. This processor consists of 3 divisions, which are control unit that controls computation sequence, 5 shift registers that save input and output values, and modular exponentiation unit. To verify the designed exponetion processor, we model and simulate it using VHDL and MAX+PLUS II. Consider a message block length of n=512, the time needed for encrypting or decrypting such a block is 59.5ms. This modular exponentiation unit is used to RSA cryptosystem.

Parallel Processing Implementation of Discrete Hartley Transform using Systolic Array Processor Architecture (Systolic Array Processor Architecture를 이용한 Discrete Hartley Transform 의 병렬 처리 실행)

  • Kang, J.K.;Joo, C.H.;Choi, J.S.
    • Proceedings of the KIEE Conference
    • /
    • 대한전기학회 1988년도 전기.전자공학 학술대회 논문집
    • /
    • pp.14-16
    • /
    • 1988
  • With the development of VLSI technology, research on special processors for high-speed processing is on the increase and studies are focused on designing VLSI-oriented processors for signal processing. This paper processes a one-dimensional systolic array for Discrete Hartley Transform implementation and also processes processing element which is well described for algorithm. The discrete Hartley Transform(DHT) is a real-valued transform closely related to the DFT of a real-valued sequence can be exploited to reduce both the storage and the computation requried to produce the transform of real-valued sequence to a real-valued spectrum while preserving some of the useful properties of the DFT is something preferred. Finally, the architecture of one-dimensional 8-point systolic array, the detailed diagram of PE, total time units concept on implementation this arrays, and modularity are described.

  • PDF

Design and Implementation Systolic Array FFT Processor Based on Shared Memory (공유 메모리 기반 시스토릭 어레이 FFT 프로세서 설계 및 구현)

  • Jeong, Dongmin;Roh, yunseok;Son, Hanna;Jung, Yongchul;Jung, Yunho
    • Journal of IKEEE
    • /
    • 제24권3호
    • /
    • pp.797-802
    • /
    • 2020
  • In this paper, we presents the design and implementation results of the FFT processor, which supports 4096 points of operation with less memory by sharing several memory used in the base-4 systolic array FFT processor into one memory. Sharing memory provides the advantage of reducing the area, and also simplifies the flow of data as I/O of the data progresses in one memory. The presented FFT processor was implemented and verified on the FPGA device. The implementation resulted in 51,855 CLB LUTs, 29,712 CLB registers, 8 block RAM tiles and 450 DSPs, and confirmed that the memory area could be reduced by 65% compared to the existing base-4 systolic array structure.

Design of a Dingle-chip Multiprocessor with On-chip Learning for Large Scale Neural Network Simulation (대규모 신경망 시뮬레이션을 위한 칩상 학습가능한 단일칩 다중 프로세서의 구현)

  • 김종문;송윤선;김명원
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • 제33B권2호
    • /
    • pp.149-158
    • /
    • 1996
  • In this paper we describe designing and implementing a digital neural chip and a parallel neural machine for simulating large scale neural netsorks. The chip is a single-chip multiprocessor which has four digiral neural processors (DNP-II) of the same architecture. Each DNP-II has program memory and data memory, and the chip operates in MIMD (multi-instruction, multi-data) parallel processor. The DNP-II has the instruction set tailored to neural computation. Which can be sed to effectively simulate various neural network models including on-chip learning. The DNP-II facilitates four-way data-driven communication supporting the extensibility of parallel systems. The parallel neural machine consists of a host computer, processor boards, a buffer board and an interface board. Each processor board consists of 8*8 array of DNP-II(equivalently 2*2 neural chips). Each processor board acn be built including linear array, 2-D mesh and 2-D torus. This flexibility supports efficiency of mapping from neural network models into parallel strucgure. The neural system accomplishes the performance of maximum 40 GCPS(giga connection per second) with 16 processor boards.

  • PDF

Self-Testing for FFT processor with systolic array architecture (시스토릭 어레이 구조를 갖는 FFT 프로세서에 대한 Self-Testing)

  • Lee, J.K.;Kang, B.H.;Choi, B.I.;Shin, K.U.;Lee, M.K.
    • Proceedings of the KIEE Conference
    • /
    • 대한전기학회 1987년도 전기.전자공학 학술대회 논문집(II)
    • /
    • pp.1503-1506
    • /
    • 1987
  • This paper proposes the self test method for 16 point FFT processor with systolic array architecture. To test efficiently and solve the increased hardware problems due to built-in self test, we change the normal registers into Linear Feedback Shift Registers(LFSR). LFSR can be served as a test pattern generator or a signature analyzer during self test operation, while LFSR a ordering register or a accumulator during normal operation. From the results of logic simulation for 16 point FFT processor by YSLOG, the total time is estimated in about. 21.4 [us].

  • PDF

Development of the Digital Controller for High Precision Digital Power Supply (고정밀전원장치를 위한 디지털 제어기 개발)

  • Ha, K.M.;Lee, S.K.;Kim, Y.S.
    • Proceedings of the Korean Society of Marine Engineers Conference
    • /
    • 한국마린엔지니어링학회 2006년도 전기학술대회논문집
    • /
    • pp.249-250
    • /
    • 2006
  • In this paper, hardware design and implementation of digital controller for the High Precision Digital Power Supply (HPDPS) based on Digital Signal Processor (DSP) and Field Programmable Gate Array (FPGA) is presented. Developed digital controller is composed of high resolution Digital Pulse Width Modulation (DPWM) and high resolution analog to digital converter circuit with anti-aliasing filter. And Digital Signal Processor (DSP) has the capability of a few micro-second calculation time for one feedback loop. 32-bit DSP and DPWM with 150[ps] step resolution is used to implement the HPDPS. Also 18-bit 2 mega sample per second ADC board is adopted for the developed digital controller. Also, hardware structure of the developed digital controller and experimental results of the first prototype board for HPDPS is described.

  • PDF

A Study on Design and Implementation of Scalable Angle Estimator Based on ESPRIT Algorithm (ESPRIT 알고리즘 기반 재구성 가능한 각도 추정기 설계에 관한 연구)

  • Dohyun Lee;Byunghyun Kim;Jongwha Chong;Sungjin Lee;Kyeongyuk Min
    • Journal of IKEEE
    • /
    • 제27권4호
    • /
    • pp.624-629
    • /
    • 2023
  • Estimation of signal parameters via rotational invariance techniques (ESPRIT) is an algorithm that estimates the angle of a signal arriving at an array antenna using the shift invariance property of an array antenna. ESPRIT offers the good trade-off between performance and complexity. However, the ESPRIT algorithm still requires high-complexity operations such as covariance matrix and eigenvalue decomposition, so implementation with a hardware processor is essential to estimate the angle of arrival in real time. In addition, ESPRIT processors should have high performance. The performance is related to the number of antennas, and the number of antennas required for each application are different. Therefore, we proposed an ESPRIT processor that provides 2 to 8 variable antenna configurations to meet the performance and complexity requirements according to the applied field. The proposed ESPRIT processor was designed using the Verilog-HDL and implemented on a field programmable gate array (FPGA).

A Study on the Design of Format Converter for Pixel-Parallel Image Processing (픽셀-병렬 영상처리에 있어서 포맷 컨버터 설계에 관한 연구)

  • 김현기;김현호;하기종;최영규;류기환;이천희
    • Proceedings of the IEEK Conference
    • /
    • 대한전자공학회 2001년도 하계종합학술대회 논문집(2)
    • /
    • pp.269-272
    • /
    • 2001
  • In this paper we proposed the format converter design and implementation for real time image processing. This design method is based on realized the large processor-per-pixel array by integrated circuit technology in which this two types of integrated structure is can be classify associative parallel processor and parallel process with DRAM cell. Layout pitch of one-bit-wide logic is identical memory cell pitch to array high density PEs in integrate structure. This format converter design has control path implementation efficiently, and can be utilized the high technology without complicated controller hardware. Sequence of array instruction are generated by host computer before process start, and instructions are saved on unit controller. Host computer is executed the pixel-parallel operation starting at saved instructions after processing start

  • PDF