• Title/Summary/Keyword: verilog HDL

Search Result 416, Processing Time 0.024 seconds

VLSI Design of H.264/AVC CAVLC encoder for HDTV Application (실시간 HD급 영상 처리를 위한 H.264/AVC CAVLC 부호화기의 하드웨어 구조 설계)

  • Woo, Jang-Uk;Lee, Won-Jae;Kim, Jae-Seok
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.44 no.7 s.361
    • /
    • pp.45-53
    • /
    • 2007
  • In this paper, we propose an efficient hardware architecture for H.264/AVC CAVLC (Context-based Adaptive Variable Length Coding) encoding. Previous CAVLC architectures search all of the coefficients to find statistic characteristics in a block. However, it is unnecessary information that zero coefficients following the last position of a non-zero coefficient when CAVLC encodes residual coefficients. In order to reduce this unnecessary operation, we propose two techniques, which detect the first and last position of non-zero coefficients and arrange non-zero coefficients sequentially. By adopting these two techniques, the required processing time was reduced about 23% compared with previous architecture. It was designed in a hardware description language and total logic gate count is 16.3k using 0.18um standard cell library Simulation results show that our design is capable of real-time processing for $1920{\times}1088\;30fps$ videos at 81MHz.

Design of Degree-Computationless Modified Euclidean Algorithm using Polynomial Expression (다항식 표현을 이용한 DCME 알고리즘 설계)

  • Kang, Sung-Jin;Kim, Nam-Yong
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.36 no.10A
    • /
    • pp.809-815
    • /
    • 2011
  • In this paper, we have proposed and implemented a novel architecture which can be used to effectively design the modified Euclidean (ME) algorithm for key equation solver (KES) block in high-speed Reed-Solomon (RS) decoder. With polynomial expressions of newly-defined state variables for controlling each processing element (PE), the proposed architecture has simple input/output signals and requires less hardware complexity because no degree computation circuits are needed. In addition, since each PE circuit is independent of the error correcting capability t of RS codes, it has the advantage of linearly increase of the hardware complexity of KES block as t increases. For comparisons, KES block for RS(255,239,8) decoder is implemented using Verilog HDL and synthesized with 0.13um CMOS cell library. From the results, we can see that the proposed architecture can be used for a high-speed RS decoder with less gate count.

Study on Structure and Principle of Linear Block Error Correction Code (선형 블록 오류정정코드의 구조와 원리에 대한 연구)

  • Moon, Hyun-Chan;Kal, Hong-Ju;Lee, Won-Young
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.13 no.4
    • /
    • pp.721-728
    • /
    • 2018
  • This paper introduces various linear block error correction code and compares performances of the correction circuits. As the risk of errors due to power noise has increased, ECC(: Error Correction Code) has been introduced to prevent the bit error. There are two representatives of ECC structures which are SEC-DED(: Single Error Correction Double Error Detection) and SEC-DED-DAEC(: Double Adjacent Error Correction). According to simulation results, the SEC-DED circuit has advantages of small area and short delay time compared to SEC-DED-DAEC circuits. In case of SED-DED-DAEC, there is no big difference between Dutta's and Pedro's from performance point of view. Therefore, Pedro's code is more efficient than Dutta' code since the correction rate of Pedro's code is higher than that of Dutta's code.

A Design of AES-based CCMP Core for IEEE 802.11i Wireless LAN Security (IEEE 802.11i 무선 랜 보안을 위한 AES 기반 CCMP Core 설계)

  • Hwang, Seok-Ki;Lee, Jin-Woo;Kim, Chay-Hyeun;Song, You-Soo;Shin, Kyung-Wook
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • v.9 no.1
    • /
    • pp.367-370
    • /
    • 2005
  • This paper describes a design of AES(Advanced Encryption Standard)-based CCMP core for IEEE 802.11i wireless LAN security. To maximize its performance, two AES cores are used, one is for counter mode for data confidentiality and the other is for CBC(Cipher Block Chaining)mode for authentication and data integrity. The S-box that requires the largest hardware in AES core is implemented using composite field arithmetic, and the gate count is reduced by about 25% compared with conventional LUT(Lookup Table)-based design. The CCMP core designed in Verilog-HDL has 15,450 gates, and the estimated throughput is about 128 Mbps at 50-MHz clock frequency). The functionality of the CCMP core is verified by Excalibur SoC implementation.

  • PDF

A Design of the IP Lookup Architecture for High-Speed Internet Router (고속의 인터넷 라우터를 위한 IP 룩업구조 설계)

  • 서해준;안희일;조태원
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.28 no.7B
    • /
    • pp.647-659
    • /
    • 2003
  • LPM(Longest Prefix Matching)searching in If address lookup is a major bottleneck of IP packet processing in the high speed router. In the conventional lookup table for the LPM searching in CAM(Content Addressable Memory) the complexity of fast update take 0(1). In this paper, we designed pipeline architecture for fast update of 0(1) cycle of lookup table and high throughput and low area complexity on LPM searching. Lookup-table architecture was designed by CAM(Content Addressable Memory)away that uses 1bit RAM(Random Access Memory)cell. It has three pipeline stages. Its LPM searching rate is affected by both the number of key field blocks in stage 1 and stage 2, and distribution of matching Point. The RTL(Register Transistor Level) design is carried out using Verilog-HDL. The functional verification is thoroughly done at the gate level using 0.35${\mu}{\textrm}{m}$ CMOS SEC standard cell library.

Design of High Performance Multi-mode 2D Transform Block for HEVC (HEVC를 위한 고성능 다중 모드 2D 변환 블록의 설계)

  • Kim, Ki-Hyun;Ryoo, Kwang-Ki
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.18 no.2
    • /
    • pp.329-334
    • /
    • 2014
  • This paper proposes the hardware architecture of high performance multi-mode 2D forward transform for HEVC which has same number of cycles for processing any type of four TUs and yield high throughput. In order to make the original image which has high pixel and high resolution into highly compressed image effectively, the transform technique of HEVC supports 4 kinds of pixel units, TUs and it finds the optimal mode after performs each transform computation. As the proposed transform engine uses the common computation operator which is produced by analyzing the relationship among transform matrix coefficients, it can process every 4 kinds of TU mode matrix operation with 35cycles equally. The proposed transform block was designed by Verilog HDL and synthesized by using TSMC 0.18um CMOS processing technology. From the results of logic synthesis, the maximum operating frequency was 400MHz and total gate count was 214k gates which has the throughput of 10-Gpels/cycle with the $4k(3840{\times}2160)@30fps$ image.

Design of Multi-Mode Radar Signal Processor for UAV Detection (무인기 탐지를 위한 멀티모드 레이다 신호처리 프로세서 설계)

  • Lee, Seunghyeok;Jung, Yongchul;Jung, Yunho
    • Journal of Advanced Navigation Technology
    • /
    • v.23 no.2
    • /
    • pp.134-141
    • /
    • 2019
  • Radar systems are divided into the pulse Doppler (PD) radar and the frequency modulated continuous wave (FMCW) radar depending on the transmission waveform. In particular, the PD radar is advantageous for long-range target detection, and the FMCW radar is suitable for short-range target detection. In this paper, we present design and implementation results for a multi-mode radar signal processor (RSP) that can support both PD and FMCW radar systems to detect unmanned aerial vehicles (UAVs) at short distances as well as long distances. The proposed radar signal processor can be implemented based on Altera Cyclone-IV FPGA with 19,623 logic elements, 9,759 registers, and 25,190,400 memory bits. The logic elements and registers of the proposed radar signal processor are reduced by approximately 43% and 30%, respectively, compared to the sum of logic elements and registers of the conventional PD radar and FMCW radar signal processor.

Design of Multipliers Optimized for CNN Inference Accelerators (CNN 추론 연산 가속기를 위한 곱셈기 최적화 설계)

  • Lee, Jae-Woo;Lee, Jaesung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.10
    • /
    • pp.1403-1408
    • /
    • 2021
  • Recently, FPGA-based AI processors are being studied actively. Deep convolutional neural networks (CNN) are basic computational structures performed by AI processors and require a very large amount of multiplication. Considering that the multiplication coefficients used in CNN inference operation are all constants and that an FPGA is easy to design a multiplier tailored to a specific coefficient, this paper proposes a methodology to optimize the multiplier. The method utilizes 2's complement and distributive law to minimize the number of bits with a value of 1 in a multiplication coefficient, and thereby reduces the number of required stacked adders. As a result of applying this method to the actual example of implementing CNN in FPGA, the logic usage is reduced by up to 30.2% and the propagation delay is also reduced by up to 22%. Even when implemented with an ASIC chip, the hardware area is reduced by up to 35% and the delay is reduced by up to 19.2%.

Vehicle ECU Design Incorporating LIN/CAN Vehicle Interface with Kalman Filter Function (LIN/CAN 차량용 인터페이스와 칼만 필터 기능을 통합한 차량용 ECU 설계)

  • Jeong, Seonwoo;Kim, Yongbin;Lee, Seongsoo
    • Journal of IKEEE
    • /
    • v.25 no.4
    • /
    • pp.762-765
    • /
    • 2021
  • In this paper, an automotive ECU (electronic control unit) with Kalman filter accelerator is designed and implemented. RISC-V is exploited as a processor core. Accelerator for Kalman filter matrix operation, CAN (controller area network) controller for in-vehicle network, and LIN (local interconnect network) controller are designed and embedded. Kalman filter operation consists of time update process and measurement update process. Current state variable and its error covariance are estimated in time update process. Final values are corrected from input measurement data and Kalman gain in measurement update process. Usually floating-point multiplication is exploited in software implementation, but fixed-point multiplier considering accuracy analysis is exploited in this paper to reduce hardware area. In 28nm silicon fabrication, its operating frequency, area, and gate counts are 100MHz, 0.37mm2, and 760k gates, respectively.

Optimization of FPGA-based DDR Memory Interface for better Compatibility and Speed (호환성 및 속도 향상을 위한 FPGA 기반 DDR 메모리 인터페이스의 최적화)

  • Kim, Dae-Woon;Kang, Bong-Soon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.12
    • /
    • pp.1914-1919
    • /
    • 2021
  • With the development of advanced industries, research on image processing hardware is essential, and timing verification at the gate level is required for actual chip operation. For FPGA-based verification, DDR3 memory interface was previously applied. But recently, as the FPGA specification has improved, DDR4 memory is used. In this case, when a previously used memory interface is applied, the timing mismatch of signals may occur and thus cannot be used. This is due to the difference in performance between CPU and memory. In this paper, the problem is solved through state optimization of the existing interface system FSM. In this process, data read speed is doubled through AXI Data Width modification. For actual case analysis, ZC706 using DDR3 memory and ZCU106 using DDR4 memory among Xilinx's SoC boards are used.