• Title/Summary/Keyword: FPGA processor

Search Result 382, Processing Time 0.022 seconds

A 3D graphic pipelines with an efficient clipping algorithm (효율적인 클리핑 기능을 갖는 3차원 그래픽 파이프라인 구조)

  • Lee, Chan-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.45 no.8
    • /
    • pp.61-66
    • /
    • 2008
  • Recently, portable devices which require small area and low power consumption employ applications using 3D graphics such as 3D games and 3D graphical user interfaces. We propose an efficient clipping engine algorithm which is suitable in 3D graphics pipeline. The clipping operation is divided into two steps: one is the selection process in the transformation engine and the other is the pixel clipping process in the scan conversion unit. The clipping operation is possible with addition of simple comparator. The clipping for the Y-axis is achieved in the edge walk stage and that for the X and Z-axis is performed in the span processing. The proposed clipping algorithm reduces the operation cycles and the area of of 3D graphics pipelines. We designed a 3D graphics pipeline with the proposed clipping algorithm using Verilog-HDL and verifies the operation using an FPGA.

Real-Time Object Detection System Based on Background Modeling in Infrared Images (적외선영상에서 배경모델링 기반의 실시간 객체 탐지 시스템)

  • Park, Chang-Han;Lee, Jae-Ik
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.46 no.4
    • /
    • pp.102-110
    • /
    • 2009
  • In this paper, we propose an object detection method for real-time in infrared (IR) images and PowerPC (PPC) and H/W design based on field programmable gate array (FPGA). An open H/W architecture has the advantages, such as easy transplantation of HW and S/W, support of compatibility and scalability for specification of current and previous versions, common module design using standardized design, and convenience of management and maintenance. Proposed background modeling for an open H/W architecture design decreases size of search area to construct a sparse block template of search area in IR images. We also apply to compensate for motion compensation when image moves in previous and current frames of IR sensor. Separation method of background and objects apply to adaptive values through time analysis of pixel intensity. Method of clutter reduction to appear near separated objects applies to median filter. Methods of background modeling, object detection, median filter, labeling, merge in the design embedded system execute in PFC processor. Based on experimental results, proposed method showed real-time object detection through global motion compensation and background modeling in the proposed embedded system.

An Efficient Hardware Implementation of Square Root Computation over GF(p) (GF(p) 상의 제곱근 연산의 효율적인 하드웨어 구현)

  • Choe, Jun-Yeong;Shin, Kyung-Wook
    • Journal of IKEEE
    • /
    • v.23 no.4
    • /
    • pp.1321-1327
    • /
    • 2019
  • This paper describes an efficient hardware implementation of modular square root (MSQR) computation over GF(p), which is the operation needed to map plaintext messages to points on elliptic curves for elliptic curve (EC)-ElGamal public-key encryption. Our method supports five sizes of elliptic curves over GF(p) defined by the National Institute of Standards and Technology (NIST) standard. For the Koblitz curves and the pseudorandom curves with 192-bit, 256-bit, 384-bit and 521-bit, the Euler's Criterion based on the characteristic of the modulo values was applied. For the elliptic curves with 224-bit, the Tonelli-Shanks algorithm was simplified and applied to compute MSQR. The proposed method was implemented using the finite field arithmetic circuit with 32-bit datapath and memory block of elliptic curve cryptography (ECC) processor, and its hardware operation was verified by implementing it on the Virtex-5 field programmable gate array (FPGA) device. When the implemented circuit operates with a 50 MHz clock, the computation of MSQR takes about 18 ms for 224-bit pseudorandom curves and about 4 ms for 256-bit Koblitz curves.

Design of Square Root and Inverse Square Root Arithmetic Units for Mobile 3D Graphic Processing (모바일 3차원 그래픽 연산을 위한 제곱근 및 역제곱근 연산기 구조 및 설계)

  • Lee, Chan-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.46 no.3
    • /
    • pp.20-25
    • /
    • 2009
  • We propose hardware architecture of floating-point square root and inverse square root arithmetic units using lookup tables. They are used for lighting engines and shader processor for 3D graphic processing. The architecture is based on Taylor series expansion and consists of lookup tables and correction units so that the size of look-up tables are reduced. It can be applied to 32 bit floating point formats of IEEE-754 and reduced 24 bit floating point formats. The square root and inverse square root arithmetic units for 32 bit and 24 bit floating format number are designed as the proposed architecture. They can operation in a single cycle, and satisfy the precision of $10^{-5}$ required by OpenGL 1.x ES. They are designed using Verilog-HDL and the RTL codes are verified using an FPGA.

VHDL Implementation of GEN2 Protocol for UHF RFID Tag (RFID GEN2 태그 표준의 VHDL 설계)

  • Jang, Il-Su;Yang, Hoon-Gee
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.32 no.12A
    • /
    • pp.1311-1319
    • /
    • 2007
  • This paper presents the VHDL implementation procedure of the passive RFID tag operating in Ultra High Frequency. The operation of the tag compatible with the EPCglobal Class1 Generation2(GEN2) protocol is verified by timing simulation after synthesis and implementation. Due to the reading range with relatively large distance, a passive tag needs digital processor which facilitates faster decoding, encoding and state transition for enhancement of an interrogation rate. In order to satisfy linking time, the pipe-line structure is used, which can minimize latency to serial input data stream. We also propose the sampling strategy to decode the Preamble, the Frame-sync and PIE symbols in reader commands. The simulation results with the fastest data rate and multi tags environment scenario show that the VHDL implemented tag performs faster operation than GEN2 proposed.

A Cryptoprocessor for AES-128/192/256 Rijndael Block Cipher Algorithm (AES-128/192/256 Rijndael 블록암호 알고리듬용 암호 프로세서)

  • 안하기;박광호;신경욱
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2002.05a
    • /
    • pp.257-260
    • /
    • 2002
  • This paper describes a design of cryptographic processor that implements the AES (Advanced Encryption Standard) block cipher algorithm“Rijndael”. To achieve high throughput rate, a sub-pipeline stage is inserted into the round transformation block, resulting that the second half of current round function and the first half of next round function are being simultaneously operated. For area-efficient and low-power implementation the round transformation block is designed to share the hardware resources in encryption and decryption. An efficient scheme for on-the-fly key scheduling, which supports the three master-key lengths of 128-b/192-b/256-b, is devised to generate round keys in the first sub-pipeline stage of each round processing. The cryptoprocessor designed in Verilog-HDL was verified using Xilinx FPGA board and test system. The core synthesized using 0.35-${\mu}{\textrm}{m}$ CMOS cell library consists of about 25,000 gates. Simulation results show that it has a throughput of about 520-Mbits/sec with 220-MHz clock frequency at 2.5-V supply.

  • PDF

AES-128/192/256 Rijndael Cryptoprocessor with On-the-fly Key Scheduler (On-the-fly 키 스케줄러를 갖는 AED-128/192/256 Rijndael 암호 프로세서)

  • Ahn, Ha-Kee;Shin, Kyung-Wook
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.39 no.11
    • /
    • pp.33-43
    • /
    • 2002
  • This paper describes a design of cryptographic processor that implements the AES (Advanced Encryption Standard) block cipher algorithm "Rijndael". To achieve high throughput rate, a sub-pipeline stage is inserted into a round transformation block, resulting that two consecutive round functions are simultaneously operated. For area-efficient and low-power implementation, the round transformation block is designed to share the hardware resources for encryption and decryption. An efficient on-the-fly key scheduler is devised to supports the three master-key lengths of 128-b/192-b/256-b, and it generates round keys in the first sub-pipeline stage of each round processing. The Verilog-HDL model of the cryptoprocessor was verified using Xilinx FPGA board and test system. The core synthesized using 0.35-${\mu}m$ CMOS cell library consists of about 25,000 gates. Simulation results show that it has a throughput of about 520-Mbits/sec with 220-MHz clock frequency at 2.5-V supply.

Measuring ultrasonic TOF using Zynq baremetal Multiprocessing (Zynq 기반 baremetal 멀티프로세싱에 의한 초음파 TOF 측정)

  • Kang, Moon ho
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.54 no.6
    • /
    • pp.93-99
    • /
    • 2017
  • In this research the TOF (time of flight) of ultrasonic signal is measured using Xilinx's Zynq SoC (system on chip). The TOF is calculated from the difference between periods during which RF (radio frequency) and ultrasonic signals come across a distance, and then travelling distance is obtained by multiplying the TOF by the ultrasonic speed in the air. For this purpose, a ultrasonic pulse is generated from a Zynq's internal ADC, a FIR (finite impulse response) filter, and a Kalman filter. And a RF reference pulse is generated from a RF interface. Based on baremetal multiprocessing, the Kalman filter and the RF interface are c-programmed on Zynq's dual processor cores, with other components fabricated on Zynq's FPGA. With this HW/SW co-design, both lower resource utilization and much smaller designing period were obtained than the HW design. As a design tool, Vivado IDE(integrated design environment) is used to design the whole signal processing system in hierarchical block diagrams.

Design of Caption-processing ASIC for On Screen Display (On Screen Display용 자막처리 ASIC 설계)

  • Jeong, Geun-Yeong;U, Jong-Sik;Park, Jong-In;Park, Ju-Seong;Park, Jong-Seok
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.37 no.5
    • /
    • pp.66-76
    • /
    • 2000
  • This paper describes design and implementation of caption-processing ASIC(Application Specific Integrated Circuits) for OSD(On Screen Display) of karaoke system. The OSD of conventional karaoke system was implemented by a general purpose DSP, however this paper suggest a design to save hardware resources. The ASIC receives commands and data of graphic and caption from host processor, and then modifies the data to have various graphic effects. The design has been done by schematic and VHDL coding. The design was verified by logic simulation and FPGA emulation on the real system. The chip was fabricated with 0.8${\mu}{\textrm}{m}$ CMOS SOG, and worked properly at the karaoke system.

  • PDF

A Study on 16/32 bit Bi-length Instruction Set Computer 32 bit Micro Processor (16/32비트 길이 명령어를 갖는 32비트 마이크로 프로세서에 관한 연구)

  • Cho, Gyoung-Youn
    • The Transactions of the Korea Information Processing Society
    • /
    • v.7 no.2
    • /
    • pp.520-528
    • /
    • 2000
  • he speed of microprocessor getting faster, the data transfer width between the microprocessor and the memory becomes a critical part to limit the system performance. So the study of the computer architecture with the high code density is cmerged. In this paper, a tentative Bi-Length Instruction Set Computer(BISC) that consists of 16 bit and 32 bit length instructions is proposed as the high code density 32 bit microprocessor architecture. The 32 bit BISC has 16 general purpose registers and two kinds of instructions due to the length of offset and the size of immediate operand. The proposed 32 bit BISC is implemented by FPGA, and all of its functions are tested and verified at 1.8432MHz. And the cross assembler, the cross C/C++ compiler and the instruction simulator of the 32 bit BISC are designed and verified. This paper also proves that the code density of 32 bit BISC is much higher than the one of traditional architecture, it accounts for 130~220% of RISC and 130~140% of CISC. As a consequence, the BISC is suitable for the next generation computer architecture because it needs less data transfer width. And its small memory requirement offers that it could be useful for the embedded microprocessor.

  • PDF