• Title/Summary/Keyword: Architecture Description Language

Search Result 115, Processing Time 0.028 seconds

Hardware Architecture and its Design of Real-Time Video Compression Processor for Motion JPEG2000 (Motion JPEG2000을 위한 실시간 비디오 압축 프로세서의 하드웨어 구조 및 설계)

  • 서영호;김동욱
    • The Transactions of the Korean Institute of Electrical Engineers D
    • /
    • v.53 no.1
    • /
    • pp.1-9
    • /
    • 2004
  • In this paper, we proposed a hardware(H/W) structure which can compress and recontruct the input image in real time operation and implemented it into a FPGA platform using VHDL(VHSIC Hardware Description Language). All the image processing element to process both compression and reconstruction in a FPGA were considered each of them was mapped into a H/W with the efficient structure for FPGA. We used the DWT(discrete wavelet transform) which transforms the data from spatial domain to the frequency domain, because use considered the motion JPEG2000 as the application. The implemented H/W is separated to both the data path part and the control part. The data path part consisted of the image processing blocks and the data processing blocks. The image processing blocks consisted of the DWT Kernel for the filtering by DWT, Quantizer/Huffman Encoder, Inverse Adder/Buffer for adding the low frequency coefficient to the high frequency one in the inverse DWT operation, and Huffman Decoder. Also there existed the interface blocks for communicating with the external application environments and the timing blocks for buffering between the internal blocks. The global operations of the designed H/W are the image compression and the reconstruction, and it is operated by the unit or a field synchronized with the A/D converter. The implemented H/W used the 54%(12943) LAB(Logic Array Block) and 9%(28352) ESB(Embedded System Block) in the APEX20KC EP20K600CB652-7 FPGA chip of ALTERA, and stably operated in the 70MHz clock frequency. So we verified the real time operation. that is. processing 60 fields/sec(30 frames/sec).

Efficient VLSI Architectures for the Two-Dimensional Discrete Wavelet Transform (2차원 이산 웨이브렛 변환을 위한 효율적인 VLSI 구조)

  • Pan, Sung-Bum;Park, Rae-Hong;Jee, Yong
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.37 no.1
    • /
    • pp.59-68
    • /
    • 2000
  • This paper proposes efficient VLSI architectures for computation of the 2- D discrete wavelet transform (DWT). The two proposed VLSI architectures for the 2- D DWT are constructed based on block-based computation Each $M{\times}N$ ($N{\times}M$) block DWT is performed along the row (column) direction simultaneously, where M and N denote the number of filter taps and the number of columns (rows), respectively The proposed architectures compute the lowpass and highpass output sequences of the 1 - DWT along the row and column directions using a single architecture In alternate clock cycles Therefore the extra processing units required for the proposed architectures are much smaller than those of the conventional architectures They are modeled In very high speed Integrated circuit hardware description language (HIDL) and Simulated to show their functional validity.

  • PDF

VLSI Design of Parallel Scheme for Comparison of Multiple Digital Signals (다중 디지털 신호의 비교를 위한 병렬 기법의 VLSI 설계)

  • Seo, Young-Ho;Lee, Yong-Seok;Kim, Dong-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.21 no.4
    • /
    • pp.781-788
    • /
    • 2017
  • This paper proposes a new algorithm for comparing amplitude between multiple digital input signals and its digital logic architecture. After simultaneously comparing multiple inputs, the proposed algorithm can provide the information of the largest (or smallest) value among them by using a simple digital logic function. The drawback of the method is to increase hardware resource. To overcome this we propose a reuse method of the overlapped logic operation. The proposed method focuses on enhancing the operational clock frequency, in other words decreasing combinational delay time. After implementing the comparing method with HDL (hardware description language), we experiment on it with environment of Cyclone III EP3C40F324A7 FPGA of Altera Inc. In case of 4 input signals, it can increase the operational speed as mush as 1.66 times with 1.20 times the hardware resource. In case of 8, it can also have 2.29 times the clock frequency and 2.15 times the hardware resource.

Design of Intra Prediction Circuit for HEVC and H.264 Multi-decoder Supporting UHD Images (UHD 영상을 지원하는 HEVC 및 H.264 멀티 디코더 용 인트라 예측 회로 설계)

  • Yu, Sanghyun;Cho, Kyeongsoon
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.53 no.12
    • /
    • pp.50-56
    • /
    • 2016
  • This paper proposes the architecture and design of intra prediction circuit for a multi-decoder supporting UHD images. The proposed circuit supports not only the latest video compression standard HEVC but also H.264. In addition to the basic function of performing intra prediction, this circuit has the capability of performing the reference sample filter operation defined in the H.264 standard, and the smoothing and strong sample filter operations defined in the HEVC standard. We reduced the circuit size by sharing the circuit blocks for common operations and internal storage, and improved the circuit performance by parallel processing. The proposed circuit was described at RTL using Verilog HDL and its functionality was verified by using NC-Verilog of Cadence. The RTL circuit was synthesized by using Design Compiler of Synopsys and 130nm standard cell library. The synthesized gate-level circuit consists of 69,694 gates and processes 100 ~ 280 frames per second for 4K-UHD HEVC images at the maximum operation frequency of 157MHz.

VLSI Design of Reed-Solomon Decoder over GF($2^8$) with Extreme Use of Resource Sharing (하드웨어 공유 극대화에 의한 GF($2^8$) Reed-Solomon Decoder의 VLSI설계)

  • 이주태;이승우;조중휘
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.36C no.3
    • /
    • pp.8-16
    • /
    • 1999
  • This paper describes a VLSI design of Reed-Solomon(RS) decoder using the modified Euclid algorithm, with the main theme focused on the $\textit{GF}(2^8)$. To get area-efficient design, a number of new architectures have been devised with maximal register and Euclidean ALU unit sharing. One ALU is shared to replace 18 ALUs which computes an error locator polynomial and an error evaluation polynomial. Also, 18 registers are shared to replace 24 registers which stores coefficients of those polynomials. The validity and efficiency of the proposed architecture have been verified by simulation and by FLEX$^TM$ FPGA implementation in hardware description language VHDL. The proposed Reed-Solomon decoder, which has the capability of decoding RS(208,192,17) and RS(182,172,11) for Digital Versatile Disc(DVD), has been designed by using O.6$\mu\textrm{m}$ CMOS TLM Compass$^TM$ technology library, which contains totally 17k gates with a core area of 2.299$\times$2.284 (5.25$\textrm{mm}^2$). The chip can run at 20MHz while the DVD requirement is 3.74MHz.

  • PDF

Efficient CAVLC Decoder VLSI Design for HD Images (HD급 영상을 효율적으로 복호하기 위한 CAVLC 복호화기 VLSI 설계)

  • Oh, Myung-Seok;Lee, Won-Jae;Kim, Jae-Seok
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.44 no.4 s.316
    • /
    • pp.51-59
    • /
    • 2007
  • In this paper, we propose an efficient hardware architecture for H.264/AVC CAVLC (Context-based Adaptive Variable Length Coding) decoding which used for baseline profile and extended profile. Previous CAVLC architectures are consisted of five step block and each block gets effective bits from Controller block and Accumulator. If large number of non-zero coefficients exist, process for getting effective bits has to iterates many times. In order to reduce this unnecessary process, we propose two techniques, which combine five steps into four steps and reduce process to get efficiency bit by skipping addition step. By adopting these two techniques, the required processing time was reduced about 26% compared with previous architectures. It was designed in a hardware description language and total logic gate count was 16.83k using 0.18um standard cell library.

Construction of a Compiled-code Simulator Generation System for Efficient Design Exploration in Embedded Core Design (임베디드 코어 설계시 효율적인 설계 공간 탐색을 위한 컴파일드 코드 방식 시뮬레이터 생성 시스템 구축)

  • Kim, Sang-Woo;Hwang, Sun-Young
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.36 no.1B
    • /
    • pp.71-79
    • /
    • 2011
  • This paper proposes a compiled-code simulator generation system based-on machine description language for efficient design space exploration in designing an embedded system optimized for a specific application. The proposed system generates a compiled-code simulator which maintains the functional accuracy of an event-driven simulator by determining instruction fetch and decoding processes statically. Generated simulator takes instruction-level and cycle-level simulation for estimating performances in embedded core. To show the efficiency of the constructed compiled-code simulator generator, architecture exploration had been performed for the JPEG encoder application. Starting with MIPS R3000 processor for one embedded core, the proposed system can produce the core showing optimized execution time for the application programming. In this process, a huge amount of simulation time has been used. Cycle-level compiled-code simulator has the functional accuracy and shows performance improvement by 21.7% in terms of simulation speed on the average when compared with an event-driven simulator.

Design and Implementation of a Scalable Framework for Parallel Program Performance Visualization (병렬 프로그램 성능가시화를 위한 확장성 있는 프레임워크 설계 및 구현)

  • Moon, Sang-Su;Moon, Young-Shik;Kim, Jung-Sun
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.7 no.2
    • /
    • pp.109-120
    • /
    • 2001
  • In this paper, we propose the design and implementation of a portable, extensible, and efficient performance visualization framework for high performance parallel program development. The framework adopts a layered architecture:consists of three independent layers instrumentation layer, trace interface layer and visualization layer. The instrumentation layer was constructed as an ECL which captures generated events, and the EDL/JPAL constitutes the trace interface layer to provide problem-oriented interfaces between visualization layer and instrumentation layer. Finally, the visualization layer was designed as plug-and-play style for easy elimination, addition and composition of various filters, views and view groups, The proposed performance visualization framework is expected to be used as an independent performance debugging and analysis tool and as a core component in an integrated parallel programming environment.

  • PDF

Design of Iterative Divider in GF(2163) Based on Improved Binary Extended GCD Algorithm (개선된 이진 확장 GCD 알고리듬 기반 GF(2163)상에서 Iterative 나눗셈기 설계)

  • Kang, Min-Sup;Jeon, Byong-Chan
    • The KIPS Transactions:PartC
    • /
    • v.17C no.2
    • /
    • pp.145-152
    • /
    • 2010
  • In this paper, we first propose a fast division algorithm in GF($2^{163}$) using standard basis representation, and then it is mapped into divider for GF($2^{163}$) with iterative hardware structure. The proposed algorithm is based on the binary ExtendedGCD algorithm, and the arithmetic operations for modular reduction are performed within only one "while-statement" unlike conventional approach which uses two "while-statement". In this paper, we use reduction polynomial $f(x)=x^{163}+x^7+x^6+x^3+1$ that is recommended in SEC2(Standards for Efficient Cryptography) using standard basis representation, where degree m = 163. We also have implemented the proposed iterative architecture in FPGA using Verilog HDL, and it operates at a clock frequency of 85 MHz on Xilinx-VirtexII XC2V8000 FPGA device. From implementation results, we will show that computation speed of the proposed scheme is significantly improved than the existing two approaches.

Development of a Translator for Automatic Generation of Ubiquitous Metaservice Ontology (유비쿼터스 메타서비스 온톨로지 자동 생성을 위한 번역기 개발)

  • Lee, Mee-Yeon;Lee, Jung-Won;Park, Seung-Soo;Cho, We-Duke
    • Journal of the Korea Society of Computer and Information
    • /
    • v.14 no.1
    • /
    • pp.191-203
    • /
    • 2009
  • To provide dynamic services for users in ubiquitous computing environments by considering context in real-time, in our previous work we proposed Metaservice concept, the description specification and the process for building a Metaservice library. However, our previous process generates separated models - UML, OWL, OWL-S based models - from each step, so it did not provide the established method for translation between models. Moreover, it premises aid of experts in various ontology languages, ontology editing tools and the proposed Metaservice specification. In this paper, we design the translation process from domain ontology in OWL to Metaservice Library in OWL-S and develop a visual tool in order to enable non-experts to generate consistent models and to construct a Metaservice library. The purpose of the Metaservice Library translation process is to maintain consistency in all models and to automatically generate OWL-S code for Metaservice library by integrating existing OWL model and Metaservice model.