• Title/Summary/Keyword: Fully Pipelined

Search Result 21, Processing Time 0.026 seconds

Pipelined Implementation of JPEG Baseline Encoder IP

  • Kim, Kyung-Hyun;Sonh, Seung-Il
    • Journal of information and communication convergence engineering
    • /
    • v.6 no.1
    • /
    • pp.29-33
    • /
    • 2008
  • This paper presents the proposal and hardware design of JPEG baseline encoder. The JPEG encoder system consists of line buffer, 2-D DCT, quantization, entropy encoding, and packer. A fully pipelined scheme for JPEG encoder is adopted to speed-up an image compression. The proposed architecture was described in VHDL and synthesized in Xilinx ISE 7.1i and simulated by modelsim 6.1i. The results showed that the performance of the designed JPEG baseline encoder is higher than that demanded by real-time applications for $1024{\times}768$ image size. The designed JPEG encoder IP can be easily integrated into various application systems, such as scanner, PC camera, color FAX, and network camera, etc.

A design of synchronous nonlinear and parallel for pipeline stage on IP-based H.264 decoder implementation (IP기반 H.264 디코더 설계를 위한 동기식 비선형 및 병렬화 파이프라인 설계)

  • Ko, Byung-Soo;Kong, Jin-Hyeung
    • Proceedings of the IEEK Conference
    • /
    • 2008.06a
    • /
    • pp.409-410
    • /
    • 2008
  • This paper presents nonlinear and parallel design for synchronous pipelining in IP-based H.264 decoder implementation. Since H.264 decoder includes the dataflow of feedback loop, the data dependency requires one NOP stage per pipelining latency to drop the throughput into 1/2. Further, it is found that, in execution time, the stage scheduled for MC is more occupied than that for CAVLD/ITQ/DF. The less efficient stage would be improved by nonlinear scheduling, while the fully-utilized stage could be accelerated by parallel scheduling of IP. The optimization yields 3 nonlinear {CAVLD&ITQ}|3 parallel (MC/IP&Rec.)| 3 nonlinear {DF} pipelined architecture for IP-based H.264 decoder. In experiments, the nonlinear and parallel pipelined H.264 decoder, including existing IPs, could deal with full HD video at 41.86MHz, in real time processing.

  • PDF

Wafer-Level Three-Dimensional Monolithic Integration for Intelligent Wireless Terminals

  • Gutmann, R.J.;Zeng, A.Y.;Devarajan, S.;Lu, J.Q.;Rose, K.
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.4 no.3
    • /
    • pp.196-203
    • /
    • 2004
  • A three-dimensional (3D) IC technology platform is presented for high-performance, low-cost heterogeneous integration of silicon ICs. The platform uses dielectric adhesive bonding of fully-processed wafer-to-wafer aligned ICs, followed by a three-step thinning process and copper damascene patterning to form inter-wafer interconnects. Daisy-chain inter-wafer via test structures and compatibility of the process steps with 130 nm CMOS sal devices and circuits indicate the viability of the process flow. Such 3D integration with through-die vias enables high functionality in intelligent wireless terminals, as vertical integration of processor, large memory, image sensors and RF/microwave transceivers can be achieved with silicon-based ICs (Si CMOS and/or SiGe BiCMOS). Two examples of such capability are highlighted: memory-intensive Si CMOS digital processors with large L2 caches and SiGe BiCMOS pipelined A/D converters. A comparison of wafer-level 3D integration 'lith system-on-a-chip (SoC) and system-in-a-package (SiP) implementations is presented.

A Low-Voltage Low-Power Opamp-Less 8-bit 1-MS/s Pipelined ADC in 90-nm CMOS Technology

  • Abbasizadeh, Hamed;Rikan, Behnam Samadpoor;Lee, Dong-Soo;Hayder, Abbas Syed;Lee, Kang-Yoon
    • IEIE Transactions on Smart Processing and Computing
    • /
    • v.3 no.6
    • /
    • pp.416-424
    • /
    • 2014
  • This paper presents an 8-bit pipelined analog-to-digital converter. The supply voltage applied for comparators and other sub-blocks of the ADC were 0.7V and 0.5V, respectively. This low power ADC utilizes the capacitive charge pump technique combined with a source-follower and calibration to resolve the need for the opamp. The differential charge pump technique does not require any common mode feedback circuit. The entire structure of the ADC is based on fully dynamic circuits that enable the design of a very low power ADC. The ADC was designed to operate at 1MS/s in 90nm CMOS process, where simulated results using ADS2011 show the peak SNDR and SFDR of the ADC to be 47.8 dB (7.64 ENOB) and 59 dB respectively. The ADC consumes less than 1mW for all active dynamic and digital circuitries.

Design of a Scalable Systolic Synchronous Memory

  • Jeong, Gab-Joong;Kwon, Kyoung-Hwan;Lee, Moon-Key
    • Journal of Electrical Engineering and information Science
    • /
    • v.2 no.4
    • /
    • pp.8-13
    • /
    • 1997
  • This paper describes a scalable systolic synchronous memory for digital signal processing and packet switching. The systolic synchronous memory consists of the 2-D array of small memory blocks which are fully pipelined and communicated in three directions with adjacent blocks. The maximum delay of a small memory block becomes the operation speed of the chip. The array configuration is scalable for the entire memory size requested by an application. it has the initial latency of N+3 cycles with NxN array configuration. We designed an experimental 200 MHz 4Kb static RAM chip with the 4x4 array configuration of 256 SRAM blocks. It was fabricated is 0.8$\mu\textrm{m}$ twin-well single-poly double-metal CMOS technology.

  • PDF

A Multi-Level Accumulation-Based Rectification Method and Its Circuit Implementation

  • Son, Hyeon-Sik;Moon, Byungin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.11 no.6
    • /
    • pp.3208-3229
    • /
    • 2017
  • Rectification is an essential procedure for simplifying the disparity extraction of stereo matching algorithms by removing vertical mismatches between left and right images. To support real-time stereo matching, studies have introduced several look-up table (LUT)- and computational logic (CL)-based rectification approaches. However, to support high-resolution images, the LUT-based approach requires considerable memory resources, and the CL-based approach requires numerous hardware resources for its circuit implementation. Thus, this paper proposes a multi-level accumulation-based rectification method as a simple CL-based method and its circuit implementation. The proposed method, which includes distortion correction, reduces addition operations by 29%, and removes multiplication operations by replacing the complex matrix computations and high-degree polynomial calculations of the conventional rectification with simple multi-level accumulations. The proposed rectification circuit can rectify $1,280{\times}720$ stereo images at a frame rate of 135 fps at a clock frequency of 125 MHz. Because the circuit is fully pipelined, it continuously generates a pair of left and right rectified pixels every cycle after 13-cycle latency plus initial image buffering time. Experimental results show that the proposed method requires significantly fewer hardware resources than the conventional method while the differences between the results of the proposed and conventional full rectifications are negligible.

Design Automation of High-Performance Operational Amplifiers (고성능 연산 증폭기의 설계 자동화)

  • Yu, Sang-Dae
    • Journal of Sensor Science and Technology
    • /
    • v.6 no.2
    • /
    • pp.145-154
    • /
    • 1997
  • Based on a new search strategy using circuit simulation and simulated annealing with local search, a technique for design automation of high-performance operational amplifiers is proposed. For arbitrary circuit topology and performance specifications, through discrete optimization of a cost function with discrete design variables the design of operational amplifiers is performed. A special-purpose circuit simulator and some heuristics are used to reduce the design time. Through the design of a low-power high-speed fully differential CMOS operational amplifier usable in smart sensors and 10-b 25-MS/s pipelined A/D converters, it has been demonstrated that a design tool developed using the proposed technique can be used for designing high-performance operational amplifiers with less design knowledge and less design effort.

  • PDF

Energy Efficient and Low-Cost Server Architecture for Hadoop Storage Appliance

  • Choi, Do Young;Oh, Jung Hwan;Kim, Ji Kwang;Lee, Seung Eun
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.12
    • /
    • pp.4648-4663
    • /
    • 2020
  • This paper proposes the Lempel-Ziv 4(LZ4) compression accelerator optimized for scale-out servers in data centers. In order to reduce CPU loads caused by compression, we propose an accelerator solution and implement the accelerator on an Field Programmable Gate Array(FPGA) as heterogeneous computing. The LZ4 compression hardware accelerator is a fully pipelined architecture and applies 16 dictionaries to enhance the parallelism for high throughput compressor. Our hardware accelerator is based on the 20-stage pipeline and dictionary architecture, highly customized to LZ4 compression algorithm and parallel hardware implementation. Proposing dictionary architecture allows achieving high throughput by comparing input sequences in multiple dictionaries simultaneously compared to a single dictionary. The experimental results provide the high throughput with intensively optimized in the FPGA. Additionally, we compare our implementation to CPU implementation results of LZ4 to provide insights on FPGA-based data centers. The proposed accelerator achieves the compression throughput of 639MB/s with fine parallelism to be deployed into scale-out servers. This approach enables the low power Intel Atom processor to realize the Hadoop storage along with the compression accelerator.

A Development of JPEG-LS Platform for Mirco Display Environment in AR/VR Device. (AR/VR 마이크로 디스플레이 환경을 고려한 JPEG-LS 플랫폼 개발)

  • Park, Hyun-Moon;Jang, Young-Jong;Kim, Byung-Soo;Hwang, Tae-Ho
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.14 no.2
    • /
    • pp.417-424
    • /
    • 2019
  • This paper presents the design of a JPEG-LS codec for lossless image compression from AR/VR device. The proposed JPEG-LS(: LosSless) codec is mainly composed of a context modeling block, a context update block, a pixel prediction block, a prediction error coding block, a data packetizer block, and a memory block. All operations are organized in a fully pipelined architecture for real time image processing and the LOCO-I compression algorithm using improved 2D approach to compliant with the SBT coding. Compared with a similar study in JPEG-LS, the Block-RAM size of proposed STB-FLC architecture is reduced to 1/3 compact and the parallel design of the predication block could improved the processing speed.

Pipelining Semantically-operated Services Using Ontology-based User Constraints (온톨로지 기반 사용자 제시 조건을 이용한 시맨틱 서비스 조합)

  • Jung, Han-Min;Lee, Mi-Kyoung;You, Beom-Jong
    • The Journal of the Korea Contents Association
    • /
    • v.9 no.10
    • /
    • pp.32-39
    • /
    • 2009
  • Semantically-operated services, which is different from Web services or semantic Web services with semantic markup, can be defined as the services providing search function or reasoning function using ontologies. It performs a pre-defined task by exploiting URI, ontology classes, and ontology properties. This study introduces a method for pipelining semantically-operated services based on a semantic broker which refers to ontologies and service description stored in a service manager and invokes by user constraints. The constraints consist of input instances, an output class, a visualization type, service names, and properties. This method provides automatically-generated service pipelines including composit services and a simple workflow to the user. The pipelines provided by the semantic broker can be executed in a fully-automatic manner to find a set of meaningful semantic pipelines. After all, this study would epochally contribute to develop a portal service by ways of supporting human service planners who want to find specific composit services pipelined from distributed semantically-operated services.