• Title/Summary/Keyword: Verilog-A


Design of High-performance Pedestrian and Vehicle Detection Circuit using Haar-like Features (Haar-like 특징을 이용한 고성능 보행자 및 차량 인식 회로 설계)

  • Kim, Soo-Jin;Park, Sang-Kyun;Lee, Seon-Young;Cho, Kyeong-Soon
    • The KIPS Transactions:PartA / v.19A no.4 / pp.175-180 / 2012
  • This paper describes the design of a high-performance pedestrian and vehicle detection circuit using Haar-like features. The proposed circuit applies a sliding window to every image frame to extract Haar-like features and detect pedestrians and vehicles. The Haar-like feature extraction circuit extracts a total of 200 Haar-like features per sliding window and passes them to the AdaBoost classifier circuit. To increase processing speed, the proposed circuit adopts a parallel architecture that processes two sliding windows at the same time. We described the circuit in Verilog HDL and synthesized the gate-level circuit using a 130nm standard cell library. The synthesized circuit consists of 1,388,260 gates, and its maximum operating frequency is 203MHz. Since the proposed circuit processes about 47.8 $640 \times 480$ image frames per second, it can provide real-time detection of pedestrians and vehicles.
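
The abstract above does not say how the Haar-like features are computed. A common software approach, assumed here purely for illustration, is the integral image (summed-area table), which returns any rectangle sum from four lookups. All function names and the window scan order are hypothetical and do not describe the paper's circuit.

```python
def integral_image(img):
    """Return a summed-area table padded with one leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left corner (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Horizontal two-rectangle Haar-like feature: left half minus right half."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


def sliding_windows(width, height, win, step):
    """Yield the top-left corner of every win x win window (assumed raster order)."""
    for y in range(0, height - win + 1, step):
        for x in range(0, width - win + 1, step):
            yield x, y
```

In a detector of this kind, each window position would be evaluated against a fixed set of such features (200 per window in the abstract) before classification.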

Implementation of Neural Network Accelerator for Rendering Noise Reduction on OpenCL (OpenCL을 이용한 랜더링 노이즈 제거를 위한 뉴럴 네트워크 가속기 구현)

  • Nam, Kihun
    • The Journal of the Convergence on Culture Technology / v.4 no.4 / pp.373-377 / 2018
  • In this paper, we propose an implementation of a neural network accelerator for reducing rendering noise using OpenCL. Among rendering algorithms, we select ray tracing to ensure high-quality graphics. Ray tracing renders with rays: using fewer rays results in noise, while using more rays produces a higher-quality image but takes longer. To reduce operation time while using fewer rays, the Learning Based Filtering algorithm using a neural network was applied. It does not always produce an optimized result. In this paper, we present a new approach to matrix multiplication, based on General Matrix Multiplication, for improved performance. For the development environment, we used OpenCL, which is specialized for high-speed parallel processing. The proposed architecture was verified using a Kintex UltraScale XKU6909T-2FDFG1157C FPGA board. The time it takes to calculate the parameters is about 1.12 times faster than that of the Verilog-HDL structure.
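
For reference, General Matrix Multiplication (GEMM) computes $C \leftarrow \alpha AB + \beta C$. The sketch below is a plain tiled version over Python lists. The tile size and loop order are assumptions chosen for illustration and say nothing about the paper's OpenCL kernel.

```python
def gemm(alpha, a, b, beta, c, tile=2):
    """Return alpha * (a @ b) + beta * c, computed tile by tile."""
    m, k, n = len(a), len(b), len(b[0])
    out = [[beta * c[i][j] for j in range(n)] for i in range(m)]
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, k, tile):
                # Accumulate the contribution of one tile of a and one tile of b.
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = 0
                        for p in range(k0, min(k0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        out[i][j] += alpha * acc
    return out
```

Tiling keeps a small working set of each operand in use at a time, which is the usual reason this form maps well to parallel hardware.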

Design of Image Extraction Hardware for Hand Gesture Vision Recognition

  • Lee, Chang-Yong;Kwon, So-Young;Kim, Young-Hyung;Lee, Yong-Hwan
    • Journal of Advanced Information Technology and Convergence / v.10 no.1 / pp.71-83 / 2020
  • In this paper, we propose a system that can detect the shape of a hand at high speed using an FPGA. Because real-time processing is important, the hand-shape detection system is designed in Verilog HDL, a hardware description language that can process in parallel, instead of sequentially executed C++. Of the several methods for hand gesture recognition, the image processing method is used. Since the human eye is sensitive to brightness, the YCbCr color model was selected among various color representations to obtain a result that is less affected by lighting. For the CbCr elements, only the components corresponding to skin color are extracted from the input image by applying restriction conditions. To increase the speed of object recognition, a median filter that removes noise present in the input image is used, and this filter is designed to compare values and extract intermediate values at the same time to reduce the amount of computation. For parallel processing, it is designed to locate the centerline of the hand while scanning and sorting the stored data. The line with the highest count is selected as the centerline of the hand, the size of the hand is determined based on the count, and the hand and arm parts are separated. The designed hardware circuit met the targets for operating frequency and number of gates.
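
The skin-color filtering and median filtering steps described above can be sketched in software as follows. The conversion uses the full-range BT.601 (JPEG-style) coefficients. The Cb/Cr thresholds are a commonly quoted skin range assumed for illustration, not values from the paper, and the sort-based median is a software reference rather than the paper's parallel compare-and-select structure.

```python
# Assumed skin-tone ranges for Cb and Cr (illustrative, not from the paper).
CB_RANGE = (77, 127)
CR_RANGE = (133, 173)


def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range YCbCr (BT.601 coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def is_skin(r, g, b):
    """Classify a pixel using only Cb and Cr, ignoring brightness (Y)."""
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return CB_RANGE[0] <= cb <= CB_RANGE[1] and CR_RANGE[0] <= cr <= CR_RANGE[1]


def median3x3(img, x, y):
    """Median of the 3x3 neighborhood centered on an interior pixel (x, y)."""
    window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    return sorted(window)[4]
```

Because the decision ignores Y, a brighter or darker version of the same skin tone tends to stay inside the same Cb/Cr box, which is the motivation the abstract gives for choosing YCbCr.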

Design of Scalable Intra-prediction Architecture for H.264 Decoders (H.264 복호기를 위한 스케일러블 인트라 예측기 구조 설계)

  • Lee, Chan-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD / v.45 no.11 / pp.77-82 / 2008
  • H.264 is a video coding standard of ITU-T and ISO/IEC, and its applications are spreading widely owing to its high image quality and a compression ratio more than twice that of MPEG-2. Its architecture differs depending on demands, since it is applied to images ranging from small QVGA to large HD. In this paper, we propose a scalable architecture for intra-prediction in H.264 decoders. The proposed scheme can accommodate up to 4 processing elements depending on performance demands, and it reduces the number of memory accesses through efficient memory management so as to be energy-efficient. We design the intra-prediction unit using Verilog-HDL and verify it by prototyping on an FPGA. The performance is analyzed using the design results.

A Cryptoprocessor for AES-128/192/256 Rijndael Block Cipher Algorithm (AES-128/192/256 Rijndael 블록암호 알고리듬용 암호 프로세서)

  • 안하기;박광호;신경욱
    • Journal of the Korea Institute of Information and Communication Engineering / v.6 no.3 / pp.427-433 / 2002
  • This paper describes the design of a cryptographic processor that implements the AES (Advanced Encryption Standard) block cipher algorithm "Rijndael". To achieve a high throughput rate, a sub-pipeline stage is inserted into the round transformation block, so that the second half of the current round function and the first half of the next round function operate simultaneously. For an area-efficient and low-power implementation, the round block is designed to share hardware resources between encryption and decryption. An efficient scheme for on-the-fly key scheduling, which supports the three master-key lengths of 128-b/192-b/256-b, is devised to generate round keys in the first sub-pipeline stage of each round's processing. The cryptoprocessor, designed in Verilog-HDL, was verified using a Xilinx FPGA board and test system. The core, synthesized using a 0.35-$\mu$m CMOS cell library, consists of about 25,000 gates. Simulation results show that it has a throughput of about 520 Mbits/sec with a 220-MHz clock frequency at a 2.5-V supply.
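
The reported throughput and clock frequency can be related through the usual block-cipher relation, throughput = block size x clock frequency / cycles per block. The sketch below applies it to the figures quoted above. The resulting cycle count is an inference from those rounded figures, not a number stated in the abstract, and the function names are hypothetical.

```python
AES_BLOCK_BITS = 128  # AES always processes 128-bit blocks, whatever the key length


def cycles_per_block(clock_hz, throughput_bps, block_bits=AES_BLOCK_BITS):
    """Average clock cycles spent per block, implied by throughput and clock."""
    return block_bits * clock_hz / throughput_bps


def throughput_bps(clock_hz, cycles, block_bits=AES_BLOCK_BITS):
    """Throughput in bits per second for a given cycles-per-block figure."""
    return block_bits * clock_hz / cycles


# Figures from the abstract: about 520 Mbits/sec at 220 MHz.
implied = cycles_per_block(220e6, 520e6)  # roughly 54 cycles per 128-bit block
```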

Region of Interest Extraction Method and Hardware Implementation of Matrix Pattern Image (매트릭스 패턴 영상의 관심 영역 추출 방법 및 하드웨어 구현)

  • Cho, Hosang;Kim, Geun-Jun;Kang, Bongsoon
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.4 / pp.940-947 / 2015
  • This paper presents a method for extracting the region of interest from a pattern image on a display printed with a matrix pattern. The proposed method does not use conventional means such as lasers, ultrasonic waves, or touch sensors. It searches for feature points and the rotation angle using the luminance and reliable pattern feature points of the input image, and then extracts the region of interest. To extract the region of interest, we simulated the proposed method using pattern images written at various angles on a display panel. The proposed method was developed using OpenCV and a window program, designed using Verilog-HDL, and verified on a Xilinx FPGA board (xc6vlx760).

A Design of Pipeline Chain Algorithm Based on Circuit Switching for MPI Broadcast Communication System (MPI 브로드캐스트 통신을 위한 서킷 스위칭 기반의 파이프라인 체인 알고리즘 설계)

  • Yun, Heejun;Chung, Wonyoung;Lee, Yong-Surk
    • The Journal of Korean Institute of Communications and Information Sciences / v.37B no.9 / pp.795-805 / 2012
  • This paper proposes an algorithm and a hardware architecture for broadcast communication, which is the worst bottleneck in multiprocessors using distributed memory architectures. In conventional systems, the pipelined broadcast algorithm takes advantage of the maximum bandwidth of the communication bus. However, because pipelined broadcast sends the data divided into many parts, unnecessary synchronization processes are repeated. In this paper, an MPI unit was designed for a pipeline chain algorithm based on circuit switching that removes the redundant synchronization, and the proposed architecture was evaluated by modeling it in SystemC. As a result, broadcast performance improved by up to 3.3 times over systems using the conventional pipelined broadcast algorithm, and the architecture can use almost the maximum bandwidth of the transmission bus. It was then implemented in Verilog HDL, synthesized with a TSMC 0.18um library, and implemented as a chip. The synthesized design occupies 4,700 gates (2-input NAND gates), and its utilization of the total area is 2.4%. The proposed architecture improves the overall performance of the MPSoC while occupying a relatively small area.

Design of DUC/DDC for the Underwater Basestation Based on Underwater Acoustic Communication (수중기지국 수중 음향 통신을 위한 DUC/DDC 설계)

  • Kim, Sunhee
    • Journal of the Korea Academia-Industrial cooperation Society / v.18 no.5 / pp.336-342 / 2017
  • Recently, there has been an increasing need for underwater communication systems to monitor ocean environments and prevent marine disasters, as well as to secure ocean resources. Most underwater communication systems have adopted acoustic communication in consideration of attenuation, absorption, and scattering in conductive sea water, and have developed fully digital modems based on processors. In this study, a digital up converter (DUC) and a digital down converter (DDC) were developed for an underwater base station based on underwater acoustic communication systems. Because low power consumption is one of the most important issues in underwater acoustic communication systems due to environmental problems, this study developed a dedicated hardware module for the DUC and DDC. It supports four links of underwater acoustic communication systems and converts the sampling rate and frequency. The system was designed and verified using Verilog-HDL in the ModelSim environment with test data generated from the baseband layer parts for an underwater base station.
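
The frequency and sampling-rate conversion performed by a DUC/DDC pair can be sketched in a few lines. This is a minimal floating-point model under stated assumptions: the sample rate, carrier, and rate-change factor are hypothetical parameters, the low-pass filter is a simple block average, and the upsampler is a zero-order hold. None of these choices come from the paper.

```python
import cmath
import math


def ddc(samples, fs, f_carrier, decim):
    """Mix real samples down by f_carrier, then average and decimate by decim."""
    out = []
    for start in range(0, len(samples) - decim + 1, decim):
        acc = 0j
        for n in range(start, start + decim):
            # Multiply by a complex exponential to shift the carrier to 0 Hz.
            acc += samples[n] * cmath.exp(-2j * math.pi * f_carrier * n / fs)
        out.append(acc / decim)  # block average acts as a crude low-pass filter
    return out


def duc(baseband, fs, f_carrier, interp):
    """Hold each baseband sample interp times, mix up, and keep the real part."""
    out = []
    for n in range(len(baseband) * interp):
        carrier = cmath.exp(2j * math.pi * f_carrier * n / fs)
        out.append((baseband[n // interp] * carrier).real)
    return out
```

With these definitions, passing a constant baseband value of 1 through `duc` and then `ddc` returns 0.5 per output sample when each block covers a whole number of carrier periods, because taking the real part discards half of the signal energy.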

Design of A Reed-Solomon Code Decoder for Compact Disc Player using Microprogramming Method (마이크로프로그래밍 방식을 이용한 CDP용 Reed-Solomon 부호의 복호기 설계)

  • 김태용;김재균
    • The Journal of Korean Institute of Communications and Information Sciences / v.18 no.10 / pp.1495-1507 / 1993
  • In this paper, an implementation of an RS (Reed-Solomon) code decoder for a CDP (Compact Disc Player) using the microprogramming method is presented. In this decoding strategy, equations composed of Newton's identities are used for computing the coefficients of the error locator polynomial and for checking the number of erasures in C2 (outer code). Also, in C2 decoding, the values of erasures are computed from the syndromes and the results of C1 (inner code) decoding. We improved the error-correction capability by correcting up to 4 erasures. The decoder contains an arithmetic logic unit over GF($2^8$) for error correction and a decoding controller with a programming ROM and microinstructions. The microinstructions are used to implement the decoding algorithm for the RS code. As a result, the decoder can be easily modified for upgrades or other applications by changing only the programming ROM. The decoder is implemented by Logic Level Modeling in Verilog HDL. In the decoder, each microinstruction has 14 bits (= 1 word), and the size of the programming ROM is 360 words. The maximum number of clock cycles for decoding both C1 and C2 is 424.

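
The arithmetic over GF($2^8$) and the syndrome computation mentioned in the abstract above can be sketched as follows. The field polynomial $x^8 + x^4 + x^3 + x^2 + 1$ (0x11D), the choice of $\alpha = 2$, four syndromes at roots $\alpha^0$ to $\alpha^3$, and the symbol ordering are assumptions made for illustration. The function names are hypothetical and the loops are a software reference, not the paper's microprogrammed ALU.

```python
PRIM_POLY = 0x11D  # assumed field polynomial: x^8 + x^4 + x^3 + x^2 + 1


def gf_mul(a, b):
    """Multiply two elements of GF(2^8) by shift-and-add with reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= PRIM_POLY   # reduce when the degree reaches 8
        b >>= 1
    return result


def gf_pow(a, n):
    """Raise a to the n-th power by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def syndromes(received, count=4, alpha=2):
    """S_j = sum_i r_i * alpha^(j*i), with received[i] the coefficient of x^i."""
    out = []
    for j in range(count):
        s = 0
        for i, symbol in enumerate(received):
            s ^= gf_mul(symbol, gf_pow(alpha, j * i))
        out.append(s)
    return out
```

For a single error of value e at position p in an otherwise valid codeword, this gives S_0 = e and S_1 = e * alpha^p, which is the kind of relation a decoder exploits to locate and correct errors.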

Design of a High-Performance Information Security System-On-a-Chip using Software/Hardware Optimized Elliptic Curve Finite Field Computational Algorithms (소프트웨어/하드웨어 최적화된 타원곡선 유한체 연산 알고리즘의 개발과 이를 이용한 고성능 정보보호 SoC 설계)

  • Moon, San-Gook
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.2 / pp.293-298 / 2009
  • In this contribution, a 193-bit elliptic curve cryptography coprocessor was implemented on an FPGA board. Optimized algorithms and numerical expressions that had been verified through C program simulation had to be analyzed again in an HDL (hardware description language) such as Verilog, so that the verified ones could be modified for direct application to hardware implementation. The reason is that the design characteristics of the C programming language are intrinsically different from the hardware design structure. The hardware IP, which was double-checked from the viewpoint of hardware structure together with algorithmic verification, was implemented on an Altera CycloneII FPGA device equipped with an ARM9 microprocessor core, as a real chip prototype, using the Altera embedded system development tool kit. The implemented finite field calculation IPs can be used as library modules for Elliptic Curve Cryptography finite field operations with key lengths of more than 193 bits.