• Title/Summary/Keyword: special-purpose processor

Search Result 22, Processing Time 0.024 seconds

A Parallel-Architecture Processor Design for the Fast Multiplication of Homogeneous Transformation Matrices (Homogeneous Transformation Matrix의 곱셈을 위한 병렬구조 프로세서의 설계)

  • Kwon Do-All;Chung Tae-Sang
    • The Transactions of the Korean Institute of Electrical Engineers D
    • /
    • v.54 no.12
    • /
    • pp.723-731
    • /
    • 2005
  • The $4{\times}4$ homogeneous transformation matrix is a compact representation of orientation and position of an object in robotics and computer graphics. A coordinate transformation is accomplished through the successive multiplications of homogeneous matrices, each of which represents the orientation and position of each corresponding link. Thus, for real time control applications in robotics or animation in computer graphics, the fast multiplication of homogeneous matrices is quite demanding. In this paper, a parallel-architecture vector processor is designed for this purpose. The processor has several key features. For the accuracy of computation for real application, the operands of the processors are floating point numbers based on the IEEE Standard 754. For the parallelism and reduction of hardware redundancy, the processor takes column vectors of homogeneous matrices as multiplication unit. To further improve the throughput, the processor structure and its control is based on a pipe-lined structure. Since the designed processor can be used as a special purpose coprocessor in robotics and computer graphics, additionally to special matrix/matrix or matrix/vector multiplication, several other useful instructions for various transformation algorithms are included for wide application of the new design. The suggested instruction set will serve as standard in future processor design for Robotics and Computer Graphics. The design is verified using FPGA implementation. Also a comparative performance improvement of the proposed design is studied compared to a uni-processor approach for possibilities of its real time application.

A Study On Improving the Performance of One Dimensional Systolic Array Processor for Matrix.Vector Operation using Sub-Matrix (부분행렬을 사용한 행렬.벡터 연산용 1차원 시스톨릭 어레이 프로세서 설계에 관한 연구)

  • Kim, Yong-Sung
    • The Journal of Information Technology
    • /
    • v.10 no.3
    • /
    • pp.33-45
    • /
    • 2007
  • Systolic Array Processor is used for designing the special purpose processor in Digital Signal Processing, Computer Graphics, Neural Network Applications etc., since it has the characteristic of parallelism, pipeline processing and architecture of regularity. But, in case of using general design method, it has intial waiting period as large as No. of PE-1. And if the connected system needs parallel and simultaneous outputs, processor has some problems of the performance, since it generates only one output at each clock in output state. So in this paper, one dimensional Systolic Array Processor that is designed according to the dependance of data and operations using the partitioned sub-matrix is proposed for the purpose of improving the performance. 1-D Systolic Array using 4 partitioned sub-matrix has efficient method in case of considering those two problems.

  • PDF

Design and Implementation of a Crypto Processor and Its Application to Security System

  • Kim, Ho-Won;Park, Yong-Je;Kim, Moo-Seop
    • Proceedings of the IEEK Conference
    • /
    • 2002.07a
    • /
    • pp.313-316
    • /
    • 2002
  • This paper presents the design and implementation of a crypto processor, a special-purpose microprocessor optimized for the execution of cryptography algorithms. This crypto processor can be used fur various security applications such as storage devices, embedded systems, network routers, etc. The crypto processor consists of a 32-bit RISC processor block and a coprocessor block dedicated to the SEED and triple-DES (data encryption standard) symmetric key crypto (cryptography) algorithms. The crypto processor has been designed and fabricated as a single VLSI chip using 0.5 $\mu\textrm{m}$ CMOS technology. To test and demonstrate the capabilities of this chip, a custom board providing real-time data security for a data storage device has been developed. Testing results show that the crypto processor operates correctly at a working frequency of 30MHz and a bandwidth o1240Mbps.

  • PDF

Hardware Design of Special-Purpose Arithmetic Unit for 3-Dimensional Graphics Processor (3차원 그래픽프로세서용 특수 목적 연산장치의 하드웨어 설계)

  • Choi, Byeong-Yoon
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2011.05a
    • /
    • pp.140-142
    • /
    • 2011
  • In this paper, special purpose arithmetic unit for mobile graphics accelerator is designed. The designed processor supports six operations, such as $1/{\chi}$, $\frac{1}{{\sqrt{x}}$, $log_2x$, $2^x$, $sin(x)$, $cos(x)$. The processor adopts 2nd-order polynomial minimax approximation scheme based on IEEE floating point data format to satisfy accuracy conditions and has 5-stage pipeline structure to meet high operational rates. The SFAU processor consists of 23,000 gates and its estimated operating frequency is about 400 Mhz at operating condition of 65nm CMOS technology. Because the processor can execute all operations with 5-stage pipeline scheme, it has about 400 MOPS(million operations per second) execution rate. Thus, it can be applicable to the 3D mobile graphics processors.

  • PDF

A Study on VLSI-Oriented 2-D Systolic Array Processor Design for APP (Algebraic Path Problem) (VLSI 지향적인 APP용 2-D SYSTOLIC ARRAY PROCESSOR 설계에 관한 연구)

  • 이현수;방정희
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.30B no.7
    • /
    • pp.1-13
    • /
    • 1993
  • In this paper, the problems of the conventional special-purpose array processor such as the deficiency of flexibility have been investigated. Then, a new modified methodology has been suggested and applied to obtain the common solution of the three typical App algorithms like SP(Shortest Path), TC(Transitive Closure), and MST(Minimun Spanning Tree) among the various APP algorithms using the similar method to obtain the solution. In the newly proposed APP parallel algorithm, real-time Processing is possible, without the structure enhancement and the functional restriction. In addition, we design 2-demensional bit-parallel low-triangular systolic array processor and the 1-PE in detail. For its evaluation, we consider its computational complexity according to bit-processing method and describe relationship of total chip size and execution time. Therefore, the proposed processor obtains, on which a large data inputs in real-time, 3n-4 execution time which is optimal o(n) time complexity, o(n$^{2}$) space complexity which is the number of total gate and pipeline period rate is one.

  • PDF

A study on the efficient method of constrained iterative regular expression pattern matching (제약 반복적인 정규표현식 패턴 매칭의 효율적인 방법에 관한 연구)

  • Seo, Byung-Suk
    • Design & Manufacturing
    • /
    • v.16 no.3
    • /
    • pp.34-38
    • /
    • 2022
  • Regular expression pattern matching is widely used in applications such as computer virus vaccine, NIDS and DNA sequencing analysis. Hardware-based pattern matching is used when high-performance processing is required due to time constraints. ReCPU, SMPU, and REMP, which are processor-based regular expression matching processors, have been proposed to solve the problem of the hardware-based method that requires resynthesis whenever a pattern is updated. However, these processor-based regular expression matching processors inefficiently handle repetitive operations of regular expressions. In this paper, we propose a new instruction set to improve the inefficient repetitive operations of ReCPU and SMPU. We propose REMPi, a regular expression matching processor that enables efficient iterative operations based on the REMP instruction set. REMPi improves the inefficient method of processing a particularly short sub-pattern as a repeat operation OR, and enables processing with a single instruction. In addition, by using a down counter and a counter stack, nested iterative operations are also efficiently processed. REMPi was described with Verilog and synthesized on Intel Stratix IV FPGA.

Low-Complexity Block Diagonalization Precoder Hardware Implementation for MU-MIMO 4×4

  • Khai, Lam Duc
    • Journal of information and communication convergence engineering
    • /
    • v.17 no.1
    • /
    • pp.1-7
    • /
    • 2019
  • In this paper, we present the block diagonalization (BD) algorithm for the multiple-user multiple input multiple output (MU-MIMO) $4{\times}4$ system using specific purpose processor (SPP) hardware. Our objective is to improve the single-user MIMO (SU-MIMO) system using the MU-MIMO technology, which is remarkably fast and allows more users to connect simultaneously. To that end, our MU-MIMO precoder uses the BD algorithm to ensure signal integrity when connecting multiple users; but remains accurate and stable. However, a precoder that uses the BD algorithm is computationally complex; therefore, we use an SPP with special functions designed to compute the BD algorithm. The implementation test results show that our SPP computes the BD algorithm faster than the software solution.

A Study On the Design of Mixed Radix Converter using Partitioned Residues. (분할 잉여수를 사용한 혼합기수변환기 설계에 관한 연구)

  • 김용성
    • The Journal of Information Technology
    • /
    • v.4 no.4
    • /
    • pp.51-63
    • /
    • 2001
  • Residue Number System has carry free operation and parallelism each modulus, So it is used for special purpose processor such as Digital Signal Processing and Neuron Processor. Magnitude comparison and sign detection are in need of Mixed Radix Conversion, and these operations are impediment to improve the operation speed. So in this Paper, MRC(Mixed Radix Converter) is designed using modified partitioned residue to speed up the operation of MRC, so it has progressed maximum twice operation time but increased the size of converter comparison to other converter.

  • PDF

System Design and Realization for Real Time DVR System with Robust Video Watermarking (강인한 비디오 워터마킹을 적용한 실시간 DVR 시스템의 설계 구현)

  • Ryu Kwang-Ryol;Kim Ja-Hwan
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.10 no.6
    • /
    • pp.1019-1024
    • /
    • 2006
  • A system design and realization for real time DVR system with robust video watermarking algorithm for contents security is presented in this paper. The robust video watermarking is used the intraframe space region and interframe insertion simultaneously, and to be processed at real time on image data and algorithm is used the 64bits special purpose quad DSP processor with assembly and soft pipeline codes. The experimental result shows that the processing time takes about 2.5ms in the D1 image per frame for 60% moving image.

Study on the Real Time Medical Image Processing (실시간 의학 영상 처리에 관한 연구)

  • 유선국;이건기
    • Journal of Biomedical Engineering Research
    • /
    • v.8 no.2
    • /
    • pp.118-122
    • /
    • 1987
  • The medical image processing system is intended for a diverse set of users in the medical Imaging Parts. This system consists of a 640 Kbyte IBM-PC/AT with 30 Mbyte hard disk, special purpose image processor with video input devices and display monitor. Image may be recorded and processed in real time at sampling rate up to 10 MHz. This system provides a wide range of image enhancement processing facilities via a menu-driven software packages. These facilities include point by point processing, image averaging, convolution filter and subtraction.

  • PDF