• Title/Summary/Keyword: verilog HDL

Search Result 417, Processing Time 0.023 seconds

Efficient Intra Predictor Design for H.264/AVC Decoder (H.264/AVC 복호기를 위한 효율적인 인트라 예측기 설계)

  • Kim, Ok;Ryoo, Kwangki
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2009.10a
    • /
    • pp.175-178
    • /
    • 2009
  • H.264/AVC is a video coding standard of ITU-T and ISO/IEC, and widely spreads its application due to its high compression ratio more than twice that of MPEG-2 and high image quality. In this paper, we explained Intra Prediction in H.264/AVC, which is able to achieve higher compressing efficiency from correlation removal of adjacent samples in spatial domain, and proposed efficient Intra Predictor architecture design for H.264/AVC decoder. The proposed system reduced computation cycle using processing element and precomputation processing element and also reduced the number of access to external memory using efficient register. We designed the proposed system with Verilog-HDL and verified with suitable test vector. The proposed Intra Predictor achieved about 60% cycle reduction comparing with existing Intra Predictors.

  • PDF

Design of AES Cryptographic Processor with Modular Round Key Generator (모듈화된 라운드 키 생성회로를 갖는 AES 암호 프로세서의 설계)

  • 최병윤;박영수;전성익
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.12 no.5
    • /
    • pp.15-25
    • /
    • 2002
  • In this paper a design of high performance cryptographic processor which implements AES Rijndael algorithm is described. To eliminate performance degradation due to round-key computation delay of conventional processor, the on-the-fly precomputation of round key based on modified round structure is adopted. And on-the-fly round key generator which supports 128, 192, and 256-bit key has modular structure. The designed processor has iterative structure which uses 1 clock cycle per round and supports three operation modes, such as ECB, CBC, and CTR mode which is a candidate for new AES modes of operation. The cryptographic processor designed in Verilog-HDL and synthesized using 0.251$\mu\textrm{m}$ CMOS cell library consists of about 51,000 gates. Simulation results show that the critical path delay is about 7.5ns and it can operate up to 125Mhz clock frequency at 2.5V supply. Its peak performance is about 1.45Gbps encryption or decryption rate under 128-bit key ECB mode.

Optimized hardware implementation of CIE1931 color gamut control algorithms for FPGA-based performance improvement (FPGA 기반 성능 개선을 위한 CIE1931 색역 변환 알고리즘의 최적화된 하드웨어 구현)

  • Kim, Dae-Woon;Kang, Bong-Soon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.6
    • /
    • pp.813-818
    • /
    • 2021
  • This paper proposes an optimized hardware implementation method for existing CIE1931 color gamut control algorithm. Among the post-processing methods of dehazing algorithms, existing algorithm with relatively low computations have the disadvantage of consuming many hardware resources by calculating large bits using Split multiplier in the computation process. The proposed algorithm achieves computational reduction and hardware miniaturization by reducing the predefined two matrix multiplication operations of the existing algorithm to one. And by optimizing the Split multiplier computation, it is implemented more efficient hardware to mount. The hardware was designed in the Verilog HDL language, and the results of logical synthesis using the Xilinx Vivado program were compared to verify real-time processing performance in 4K environments. Furthermore, this paper verifies the performance of the proposed hardware with mounting results on two FPGAs.

Hardware Architecture for Entropy Filter Implementation (엔트로피 필터 구현에 대한 Hardware Architecture)

  • Sim, Hwi-Bo;Kang, Bong-Soon
    • Journal of IKEEE
    • /
    • v.26 no.2
    • /
    • pp.226-231
    • /
    • 2022
  • The concept of information entropy has been widely applied in various fields. Recently, in the field of image processing, many technologies applying the concept of information entropy have been developed. As the importance and demand of computer vision technologies increase in modern industry, real-time processing must be possible in order for image processing technologies to be efficiently applied to modern industries. Extracting the entropy value of an image is difficult to process in real-time due to the complexity of computation in software, and a hardware structure of an image entropy filter capable of real-time processing has never been proposed. In this paper, we propose for the first time a hardware structure of a histogram-based entropy filter that can be processed in real time using a barrel shifter. The proposed hardware was designed using Verilog HDL, and Xilinx's xczu7ev-2ffvc1156 was set as the target device and FPGA was implemented. As a result of logic synthesis using the Xilinx Vivado program, it has a maximum operating frequency of 750.751 MHz in a 4K UHD high-resolution environment, and it processes more than 30 images per second and satisfies the real-time processing standard.

Hardware implementation of automated haze removal method capable of real-time processing based on Hazy Particle Map (Hazy Particle Map 기반 실시간 처리 가능한 자동화 안개 제거방법의 하드웨어 구현)

  • Sim, Hwi-Bo;Kang, Bong-Soon
    • Journal of IKEEE
    • /
    • v.26 no.3
    • /
    • pp.401-407
    • /
    • 2022
  • Recently, image processing technology for autonomous driving by recognizing objects and lanes through camera images to realize autonomous vehicles is being studied. Haze reduces the visibility of images captured by the camera and causes malfunctions of autonomous vehicles. To solve this, it is necessary to apply the haze removal function that can be processed in real time to the camera. Therefore, in this paper, the fog removal method of Sim with excellent performance is implemented with hardware capable of real-time processing. The proposed hardware was designed using Verilog HDL, and FPGA was implemented by setting Xilinx's xc7z045-2ffg900 as the target device. As a result of logic synthesis using Xilinx Vivado program, it has a maximum operating frequency of 276.932MHz and a maximum processing speed of 31.279fps in a 4K (4096×2160) high-resolution environment, thus satisfying the real-time processing standard.

Approximate Multiplier with High Density, Low Power and High Speed using Efficient Partial Product Reduction (효율적인 부분 곱 감소를 이용한 고집적·저전력·고속 근사 곱셈기)

  • Seo, Ho-Sung;Kim, Dae-Ik
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.17 no.4
    • /
    • pp.671-678
    • /
    • 2022
  • Approximate computing is an computational technique that is acceptable degree of inaccurate results of accurate results. Approximate multiplication is one of the approximate computing methods for high-performance and low-power computing. In this paper, we propose a high-density, low-power, and high-speed approximate multiplier using approximate 4-2 compressor and improved full adder. The approximate multiplier with approximate 4-2 compressor consists of three regions of the exact, approximate and constant correction regions, and we compared them by adjusting the size of region by applying an efficient partial product reduction. The proposed approximate multiplier was designed with Verilog HDL and was analyzed for area, power and delay time using Synopsys Design Compiler (DC) on a 25nm CMOS process. As a result of the experiment, the proposed multiplier reduced area by 10.47%, power by 26.11%, and delay time by 13% compared to the conventional approximate multiplier.

Implementation of Exchange Rate Forecasting Neural Network Using Heterogeneous Computing (이기종 컴퓨팅을 활용한 환율 예측 뉴럴 네트워크 구현)

  • Han, Seong Hyeon;Lee, Kwang Yeob
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology
    • /
    • v.7 no.11
    • /
    • pp.71-79
    • /
    • 2017
  • In this paper, we implemented the exchange rate forecasting neural network using heterogeneous computing. Exchange rate forecasting requires a large amount of data. We used a neural network that could leverage this data accordingly. Neural networks are largely divided into two processes: learning and verification. Learning took advantage of the CPU. For verification, RTL written in Verilog HDL was run on FPGA. The structure of the neural network has four input neurons, four hidden neurons, and one output neuron. The input neurons used the US $ 1, Japanese 100 Yen, EU 1 Euro, and UK £ 1. The input neurons predicted a Canadian dollar value of $ 1. The order of predicting the exchange rate is input, normalization, fixed-point conversion, neural network forward, floating-point conversion, denormalization, and outputting. As a result of forecasting the exchange rate in November 2016, there was an error amount between 0.9 won and 9.13 won. If we increase the number of neurons by adding data other than the exchange rate, it is expected that more precise exchange rate prediction will be possible.

Protocol Design and Controller Implementation of Automotive LED Matrix Headlamp Control (차량용 LED 매트릭스 헤드램프 제어를 위한 LED 제어 프로토콜 설계 및 제어기 구현)

  • Changmin Lee;Wonchae Kim;Seonghyun Yang;Seongsoo Lee
    • Journal of IKEEE
    • /
    • v.27 no.4
    • /
    • pp.368-378
    • /
    • 2023
  • Automotive headlamp with LED matrix exploits low-cost low-speed serial buses such as I2C and SPI for digital LED control. When headlamp resolution increases, LED control data significantly increases to exceed capacity of control bus. This paper proposes HLCP (Headlamp LED Control Protocol), a novel LED maxtrix headlamp protocol. The proposed protocol exploits dedicated instructions to control many LEDs simultaneously, so it can control much more LEDs than conventional control buses although it is basically based on I2C bus. It is designed and verified in Verilog HDL. Simulation results show that HLCP can control LED matrix headlamp more efficiently than I2C and SPI.

Design of a Delayed Dual-Core Lock-Step Processor with Automatic Recovery in Soft Errors (소프트 에러 발생 시 자동 복구하는 이중 코어 지연 락스텝 프로세서의 설계)

  • Juho Kim;Seonghyun Yang;Seongsoo Lee
    • Journal of IKEEE
    • /
    • v.27 no.4
    • /
    • pp.683-686
    • /
    • 2023
  • In this paper, we designed a Delayed Dual Core Lock-Step (D-DCLS) processor where two cores operate same instructions with delay and the result is compared to mitigate soft errors and common mode failures in automotive electronic systems. Because D-DCLS does not know which core an error occurred in, each core must be recovered to the point before the error occurred, but complex hardware modifications are required to return all intermediate values on the pipeline stage. In this paper, in order for easy hardware implementation, all register values are saved to a buffer whenever a branch instruction is executed. When an error is detected, the saved register values are automatically restored, and then 'BX LR' instruction is executed to return to the last branch point. The proposed D-DCLS processor was designed using Verilog HDL and was confirmed to continue normal operation after automatically recovering error.

Design of 32-bit Floating Point Multiplier for FPGA (FPGA를 위한 32비트 부동소수점 곱셈기 설계)

  • Xuhao Zhang;Dae-Ik Kim
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.19 no.2
    • /
    • pp.409-416
    • /
    • 2024
  • With the expansion of floating-point operation requirements for fast high-speed data signal processing and logic operations, the speed of the floating-point operation unit is the key to affect system operation. This paper studies the performance characteristics of different floating-point multiplier schemes, completes partial product compression in the form of carry and sum, and then uses a carry look-ahead adder to obtain the result. Intel Quartus II CAD tool is used for describing Verilog HDL and evaluating performance results of the floating point multipliers. Floating point multipliers are analyzed and compared based on area, speed, and power consumption. The FMAX of modified Booth encoding with Wallace tree is 33.96 Mhz, which is 2.04 times faster than the booth encoding, 1.62 times faster than the modified booth encoding, 1.04 times faster than the booth encoding with wallace tree. Furthermore, compared to modified booth encoding, the area of modified booth encoding with wallace tree is reduced by 24.88%, and power consumption of that is reduced by 2.5%.