• Title/Summary/Keyword: pipelined architecture

Search Result 176, Processing Time 0.04 seconds

High-Performance Low-Power FFT Cores

  • Han, Wei;Erdogan, Ahmet T.;Arslan, Tughrul;Hasan, Mohd.
    • ETRI Journal
    • /
    • v.30 no.3
    • /
    • pp.451-460
    • /
    • 2008
  • Recently, the power consumption of integrated circuits has been attracting increasing attention. Many techniques have been studied to improve the power efficiency of digital signal processing units such as fast Fourier transform (FFT) processors, which are popularly employed in both traditional research fields, such as satellite communications, and thriving consumer electronics, such as wireless communications. This paper presents solutions based on parallel architectures for high throughput and power efficient FFT cores. Different combinations of hybrid low-power techniques are exploited to reduce power consumption, such as multiplierless units which replace the complex multipliers in FFTs, low-power commutators based on an advanced interconnection, and parallel-pipelined architectures. A number of FFT cores are implemented and evaluated for their power/area performance. The results show that up to 38% and 55% power savings can be achieved by the proposed pipelined FFTs and parallel-pipelined FFTs respectively, compared to the conventional pipelined FFT processor architectures.

  • PDF

High-Speed Low-Complexity Reed-Solomon Decoder using Pipelined Berlekamp-Massey Algorithm and Its Folded Architecture

  • Park, Jeong-In;Lee, Ki-Hoon;Choi, Chang-Seok;Lee, Han-Ho
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.10 no.3
    • /
    • pp.193-202
    • /
    • 2010
  • This paper presents a high-speed low-complexity pipelined Reed-Solomon (RS) (255,239) decoder using pipelined reformulated inversionless Berlekamp-Massey (pRiBM) algorithm and its folded version (PF-RiBM). Also, this paper offers efficient pipelining and folding technique of the RS decoders. This architecture uses pipelined Galois-Field (GF) multipliers in the syndrome computation block, key equation solver (KES) block, Forney block, Chien search block and error correction block to enhance the clock frequency. A high-speed pipelined RS decoder based on the pRiBM algorithm and its folded version have been designed and implemented with 90-nm CMOS technology in a supply voltage of 1.1 V. The proposed RS(255,239) decoder operates at a clock frequency of 700 MHz using the pRiBM architecture and also operates at a clock frequency of 750 MHz using the PF-RiBM, respectively. The proposed architectures feature high clock frequency and low-complexity.

Pipelined Scheduling of Functional HW/SW Modules for Platform-Based SoC Design

  • Kim, Won-Jong;Chang, June-Young;Cho, Han-Jin
    • ETRI Journal
    • /
    • v.27 no.5
    • /
    • pp.533-538
    • /
    • 2005
  • We developed a pipelined scheduling technique of functional hardware and software modules for platform-based system-on-a-chip (SoC) designs. It is based on a modified list scheduling algorithm. We used the pipelined scheduling technique for a performance analysis of an MPEG4 video encoder application. Then, we applied it for architecture exploration to achieve a better performance. In our experiments, the modified SoC platform with 6 pipelines for the 32-bit dual layer architecture shows a 118% improvement in performance compared to the given basic SoC platform with 4 pipelines for the 16-bit single-layer architecture.

  • PDF

An Optimum Architecture for Implementing SEED Cipher Algorithm with Efficiency (효율적인 SEED 암호알고리즘 구현을 위한 최적화 회로구조)

  • Shin Kwang-Cheul;Lee Haeng-Woo
    • Journal of Internet Computing and Services
    • /
    • v.7 no.1
    • /
    • pp.49-57
    • /
    • 2006
  • This paper describes the architecture for reducing its size and increasing the computation rate in implementing the SEED algorithm of a 12B-bit block cipher, and the result of the circuit design. In order to increase the computation rate, it is used the architecture of the pipelined systolic array, This architecture is a simple thing without involving any buffer at the input and output part. By this circuit, it can be recorded 320 Mbps encryption rate at 10 MHz clock. We have designed the circuit with the VHDL coding, implemented with a FPGA of 50,000 gates.

  • PDF

Design of an Area-Efficient Reed-Solomon Decoder using Pipelined Recursive Technique (파이프라인 재귀적인 기술을 이용한 면적 효율적인 Reed-Solomon 복호기의 설계)

  • Lee, Han-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.42 no.7 s.337
    • /
    • pp.27-36
    • /
    • 2005
  • This paper presents an area-efficient architecture to implement the high-speed Reed-Solomon(RS) decoder, which is used in a variety of communication systems such as wireless and very high-speed optical communications. We present the new pipelined-recursive Modified Euclidean(PrME) architecture to achieve high-throughput rate and reducing hardware-complexity using folding technique. The proposed pipelined recursive architecture can reduce the hardware complexity about 80$\%$ compared to the conventional systolic-array and fully-parallel architecture. The proposed RS decoder has been designed and implemented with the 0.13um CMOS technology in a supply voltage of 1.2 V. The result show that total number of gate is 393 K and it has a data processing rate of S Gbits/s at clock frequency of 625 MHz. The proposed area-efficient architecture can be readily applied to the next generation FEC devices for high-speed optical communications as well as wireless communications.

A Design of Giga-bit security module Using Fully pipelined CTR-AES (Full-pipelined CTR-AES를 이용한 Giga-bit 보안모듈 설계)

  • Vinh, T.Q.;Park, Ju-Hyun;Kim, Young-Chul
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2008.05a
    • /
    • pp.225-228
    • /
    • 2008
  • In this paper, we presented our implementation of a counter mode AES based on Virtex4 FPGA. Our design exploits three advanced features: composite field arithmetic SubByte, efficient MixColumn transformation, and On-the-Fly Key-Scheduling for fully pipelined architecture. By pipelining the composite field implementation of the S-box, the area cost is reduced to average 17 percent. By designing the On-the-Fly key scheduling, we implemented an efficient key-expander module which is specialized for a pipelined architecture.

  • PDF

Design of Pipelined LMS Filter for Noise Cancelling of High speed Communication Receivers System (고속통신시스템 수신기의 잡음소거를 위한 파이프라인 LMS 필터설계)

  • Cho Sam-Ho;Kwon Seung-Tag;Kim Young-Suk
    • Proceedings of the IEEK Conference
    • /
    • 2004.06a
    • /
    • pp.7-10
    • /
    • 2004
  • This paper describes techniques to implement low-cost adapt ive Pipelined LMS filter for ASIC implement ions of high communication receivers. Power consumpiton can be reduced using a careful selection of architectural, algorithmic, and VLSI circuit techlifue A Pipelined architecture for the strength-reduced algorithm is then developed via the relaxed look-ahead transformation. This technique, which is an approximation of the conventional look-ahead compution, maintains the functionality of the algorithm rather than the input-output behavior Convergence maiysis of the Proposed architecture has been presented and support via simulation results. The resulting pipelined adaptive filter achives a higher though put requires lower power as compared to the filter using the serial algorithm.

  • PDF

A Design of 3D Graphics Lighting Processor for Mobile Applications (휴대 단말기용 3D Graphics Lighting Processor 설계)

  • Yang, Joon-Seok;Kim, Ki-Chul
    • Proceedings of the IEEK Conference
    • /
    • 2005.11a
    • /
    • pp.837-840
    • /
    • 2005
  • This paper presents 3D graphics lighting processor based on vector processing using pipeline chaining. The lighting process of 3D graphics rendering contains many arithmetic operations and its complexity is very high. For high throughput, proposed processor uses pipelined functional units. To implement fully pipelined architecture, we have to use many functional units. Hence, the number of functional units is restricted. However, with the restricted number of pipelined functional units, the utilization of the units is reduced and a resource reservation problem is caused. To resolve these problems, the proposed architecture uses vector processing using pipeline chaining. Due to its pipeline chaining based architecture, it can perform 4.09M vertices per 1 second with 100MHz frequency. The proposed 3D graphics lighting processor is compatible with OpenGL ES API and the design is implemented and verified on FPGA.

  • PDF

A Pipelined Hardware Architecture of an H.264 Deblocking Filter with an Efficient Data Distribution

  • Lee, Sang-Heon;Lee, Hyuk-Jae
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.6 no.4
    • /
    • pp.227-233
    • /
    • 2006
  • In order to reduce blocking artifacts and improve compression efficiency, H.264/AVC standard employs an adaptive in-loop deblocking filter. This paper proposes a new hardware architecture of the deblocking filter that employs a four-stage pipelined structure with an efficient data distribution. The proposed architecture allows a simultaneous supply of eight data samples to fully utilize the pipelined filter in both horizontal and vertical filterings. This paper also presents a new filtering order and data reuse scheme between consecutive macroblock filterings to reduce the communication for external memory access. The number of required cycles for filtering one macroblock (MB) is 357 cycles when the proposed filter uses dual port SRAMs. This execution speed is only 41.3% of that of the fastest previous work.

Design of the Charge-Shared Switching MDAC for a Pipelined A/D Converter (Pipelined A/D 변환기 용 Charge-Shared Switching MDAC의 설계)

  • 박만규;이종훈;김상호;김상민;손영철;김대정;김동명
    • Proceedings of the IEEK Conference
    • /
    • 2002.06b
    • /
    • pp.69-72
    • /
    • 2002
  • This paper proposed a new charge-shared switching MDAC for a pipelined A/D converter The proposed architecture accomplishes the same function of a conventional multiplying-digital-to-analog converter (MDAC). By adopting the proposed scheme, about 40% of the total capacitances could be reduced and the speed of the MDAC increases. The performance of the charge-shared switching MDAC has been Proved by HSPICE simulations.

  • PDF