• Title/Summary/Keyword: 하드웨어구조

Search Result 1,786, Processing Time 0.03 seconds

Effective hardware design for DCT-based Intra prediction encoder (DCT 기반 인트라 예측 인코더를 위한 효율적인 하드웨어 설계)

  • Cha, Ki-Jong;Ryoo, Kwang-Ki
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.16 no.4
    • /
    • pp.765-770
    • /
    • 2012
  • In this paper, we proposed an effective hardware structure using DCT-based inra-prediction mode selection to reduce computational complexity caused by intra mode decision. In this hardware structure, the input block is transformed at first and then analyzed to determine its texture directional tendency. the complexity has solved by performing intra prediction in only predicted edge direction. $4{\times}4$ DCT is calculated in one cycle using Multitransform_PE and Inta_pred_PE calculates one prediction mode in two cycles. Experimental results show that the proposed Intra prediction encoding needs only 517 cycles for one macroblock encoding. This architecture improves the performance by about 17% than previous designs. For hardware implementation, the proposed intra prediction encoder is implemented using Verilog HDL and synthesized with Megnachip $0.18{\mu}m$ standard cell library. The synthesis results show that the proposed architecture can run at 125MHz.

Area Efficient Hardware Design for Performance Improvement of SAO (SAO의 성능개선을 위한 저면적 하드웨어 설계)

  • Choi, Jisoo;Ryoo, Kwangki
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.17 no.2
    • /
    • pp.391-396
    • /
    • 2013
  • In this paper, for HEVC decoding, an SAO hardware design with less processing time and reduced area is proposed. The proposed SAO hardware architecture introduces the design processing $8{\times}8$ CU to reduce the hardware area and uses internal registers to support $64{\times}64$ CU processing. Instead of previous top-down block partitioning, it uses bottom-up block partitioning to minimize the amount of calculation and processing time. As a result of synthesizing the proposed architecture with TSMC $0.18{\mu}m$ library, the gate area is 30.7k and the maximum frequency is 250MHz. The proposed SAO hardware architecture can process the decode of a macroblock in 64 cycles.

Hardware Implementation of Genetic Algorithm and Its Analysis (유전알고리즘의 하드웨어 구현 및 실험과 분석)

  • Dong, Sung-Soo;Lee, Chong-Ho
    • 전자공학회논문지 IE
    • /
    • v.46 no.2
    • /
    • pp.7-10
    • /
    • 2009
  • This paper presents the implementation of libraries of hardware modules for genetic algorithm using VHDL. Evolvable hardware refers to hardware that can change its architecture and behavior dynamically and autonomously by interacting with its environment. So, it is especially suited to applications where no hardware specifications can be given in advance. Evolvable hardware is based on the idea of combining reconfigurable hardware device with evolutionary computation, such as genetic algorithm. Because of parallel, no function call overhead and pipelining, a hardware genetic algorithm give speedup over a software genetic algorithm. This paper suggests the hardware genetic algorithm for evolvable embedded system chip. That includes simulation results and analysis for several fitness functions. It can be seen that our design works well for the three examples.

Parallel 2D-DWT Hardware Architecture for Image Compression Using the Lifting Scheme (이미지 압축을 위한 Lifting Scheme을 이용한 병렬 2D-DWT 하드웨어 구조)

  • Kim, Jong-Woog;Chong, Jong-Wha
    • Journal of IKEEE
    • /
    • v.6 no.1 s.10
    • /
    • pp.80-86
    • /
    • 2002
  • This paper presents a fast hardware architecture to implement a 2-D DWT(Discrete Wavelet Transform) computed by lifting scheme framework. The conventional 2-D DWT hardware architecture has problem in internal memory, hardware resource, and latency. The proposed architecture was based on the 4-way partitioned data set. This architecture is configured with a pipelining parallel architecture for 4-way partitioning method. Due to the use of this architecture, total latency was improved by 50%, and memory size was reduced by using lifting scheme.

  • PDF

A Hardware Architecture of Hough Transform Using an Improved Voting Scheme (개선된 보팅 정책을 적용한 허프 변환 하드웨어 구조)

  • Lee, Jeong-Rok;Bae, Kyeong-Ryeol;Moon, Byungin
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.38A no.9
    • /
    • pp.773-781
    • /
    • 2013
  • The Hough transform for line detection is widely used in many machine vision applications due to its robustness against data loss and distortion. However, it is not appropriate for real-time embedded vision systems, because it has inefficient computation structure and demands a large number of memory accesses. Thus, this paper proposes an improved voting scheme of the Hough transform, and then applies this scheme to a Hough transform hardware architecture so that it can provide real-time performance with less hardware resource. The proposed voting scheme reduces computation overhead of the voting procedure using correlation between adjacent pixels, and improves computational efficiency by increasing reusability of vote values. The proposed hardware architecture, which adopts this improved scheme, maximizes its throughput by computing and storing vote values for many adjacent pixels in parallel. This parallelization for throughput improvement is accomplished with little hardware overhead compared with sequential computation.

FEC design with low complexity and efficient structure for DAB system (DAB 시스템에서 낮은 복잡도와 효율적인 구조를 갖는 FEC 설계)

  • 김주병;임영진;이문호;이광재
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.26 no.8A
    • /
    • pp.1348-1357
    • /
    • 2001
  • 본 논문에서는 DAB 시스템에서 사용하는 FEC(Forward Error Correction) 블록을 하드웨어 크기를 고려하여 효율적인 구조를 갖도록 설계하였다. DAB 시스템의 FEC 블록은 크게 스크램블러(에너지분산), 리드-솔로몬 코더, 길쌈 인터리버로 구성된다. RS 디코더 블록 중 키 방정식을 계산해 내는 블록과 길쌈 인터리버가 차지하는 하드웨어 비중은 굉장히 크다. 본 논문에서는 스크램블러 부분에서 데이터의 시작을 알려주는 신호의 효율적인 검출기법을 제안하고, 리드-솔로몬 디코더 블록의 수정 유클리드 알고리즘을 효율적인 하드웨어로 구현하기 위한 새로운 구조와 길쌈 인터리버에서 최적의 메모리 구조를 효과적인 구조를 제안한다. 제안한 구조에서는 단지 8개의 GF 곱셈기와 4개의 덧셈기만을 가지고 RS 디코더의 수정 유클리드 알고리즘을 구현하였으며, 2 RAM(128)과 4 RAM(256)을 가지고 컨벌루셔널 인터리버를 구현하였다. 제안한 구조로 설계했을 경우 디코더 블록이 Altera-FPGA 칩(FLEX10K)에 모두 들어갈 수 있었다.

  • PDF

VLSI Design for Folded Wavelet Transform Processor using Multiple Constant Multiplication (MCM과 폴딩 방식을 적용한 웨이블릿 변환 장치의 VLSI 설계)

  • Kim, Ji-Won;Son, Chang-Hoon;Kim, Song-Ju;Lee, Bae-Ho;Kim, Young-Min
    • Journal of Korea Multimedia Society
    • /
    • v.15 no.1
    • /
    • pp.81-86
    • /
    • 2012
  • This paper presents a VLSI design for lifting-based discrete wavelet transform (DWT) 9/7 filter using multiplierless multiple constant multiplication (MCM) architecture. This proposed design is based on the lifting scheme using pattern search for folded architecture. Shift-add operation is adopted to optimize the multiplication process. The conventional serial operations of the lifting data flow can be optimized into parallel ones by employing paralleling and pipelining techniques. This optimized design has simple hardware architecture and requires less computation without performance degradation. Furthermore, hardware utilization reaches 100%, and the number of registers required is significantly reduced. To compare our work with previous methods, we implemented the architecture using Verilog HDL. We also executed simulation based on the logic synthesis using $0.18{\mu}m$ CMOS standard cells. The proposed architecture shows hardware reduction of up to 60.1% and 44.1% respectively at 200 MHz clock compared to previous works. This implementation results indicate that the proposed design performs efficiently in hardware cost, area, and power consumption.

Hardware Implementation of Integer Transform and Quantization for H.264 (하드웨어 기반의 H.264 정수 변환 및 양자화 구현)

  • 임영훈;정용진
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.28 no.12C
    • /
    • pp.1182-1191
    • /
    • 2003
  • In this paper, we propose a new hardware architecture for integer transform, quantizer, inverse quantizer, and inverse integer transform of a new video coding standard H.264/JVT. We describe the algorithm and derive hardware architecture emphasizing the importance of area for low cost and low power consumption. The proposed architecture has been verified by PCI-interfaced emulation board using APEX-II Alters FPGA and also by ASIC synthesis using Samsung 0.18 um CMOS cell library. The ASIC synthesis result shows that the proposed hardware can operate at 100 MHz, processing more than 1,300 QCIF video frames per second. The hardware is going to be used as a core module when implementing a complete H.264 video encoder/decoder ASIC for real-time multimedia application.

The Hardware Design of Real-time Image Processing System-on-chip for Visual Auxiliary Equipment (시각보조기기를 위한 실시간 영상처리 SoC 하드웨어 설계)

  • Jo, Heungsun;Kim, Jiho;Shin, Hyuntaek;Im, Junseong;Ryoo, Kwangki
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2013.11a
    • /
    • pp.1525-1527
    • /
    • 2013
  • 본 논문에서는 저시력자의 개선된 독서 환경을 제공하는 시각보조기기를 위한 실시간 영상처리 SoC(System on Chip) 하드웨어 구조 설계에 대해서 기술한다. 기존의 시각보조기기는 화면 영상이 실제 움직임보다 늦게 출력되는 잔상 현상이 발생하며, 색 변환 기능도 제한적이다. 따라서 본 논문에서 제안하는 실시간 영상처리 SoC 하드웨어 구조는 데이터 연산을 최소화함으로써 잔상 현상이 감소되며, 저시력자를 위한 다양한 색상 모드를 지원한다. 제안하는 영상처리 SoC 하드웨어 구조는 Core-A 모듈, Memory Controller 모듈, AMBA AHB bus 모듈, ISP(Image Signal Processing) 모듈, TFT-LCD Controller 모듈, VGA Controller 모듈, CIS Controller 모듈, UART 모듈, Block Memory 모듈로 구성된다. 시각보조기기를 위한 실시간 영상처리 SoC 하드웨어 구조는 Virtex4 XC4VLX80 FPGA 디바이스를 이용하여 검증하였으며, TSMC 180nm 셀 라이브러리로 합성한 결과 동작주파수는 54MHz, 게이트 수 197k이다.

High-Performance VLSI Architecture for Stereo Vision (스테레오 비전을 위한 고성능 VLSI 구조)

  • Seo, Youngho;Kim, Dong-Wook
    • Journal of Broadcast Engineering
    • /
    • v.18 no.5
    • /
    • pp.669-679
    • /
    • 2013
  • This paper proposed a new VLSI (Very Large Scale Integrated Circuit) architecture for stereo matching in real time. We minimized the amount of calculation and the number of memory accesses through analyzing calculation of stereo matching. From this, we proposed a new stereo matching calculating cell and a new hardware architecture by expanding it in parallel, which concurrently calculates cost function for all pixels in a search range. After expanding it, we proposed a new hardware architecture to calculate cost function for 2-dimensional region. The implemented hardware can be operated with minimum 250Mhz clock frequence in FPGA (Field Programmable Gate Array) environment, and has the performance of 805fps in case of the search range of 64 pixels and the image size of $640{\times}480$.