• Title/Summary/Keyword: pipelined architecture

Search Result 176, Processing Time 0.028 seconds

Design of a Low Power MictoController Core for Intellectual Property applications (IP활용에 적합한 저전력 MCU CORE 설계)

  • Lee, Kwang-Youb;Lee, Dong-Yup
    • The Transactions of the Korea Information Processing Society
    • /
    • v.7 no.2
    • /
    • pp.470-476
    • /
    • 2000
  • This paper describes an IP design of a low-power microcontroller using an architecture level design methodology instead of a transistor level. To reduce switching capacitance, the register-toregister data transfer is adopted to frequently used register transfer micro-operations. Also, distributed buffers are proposed to reduce a input data rising edge time. To reduce power consumption without any loss of performance, pipeline processing should be used. In this paper, a 4-stage pipelined datapath being able to process CISC instructions is designed. Designed microcontroller lessens power consumption by 20%. To measure a power consumption, the SYNOPSYS EPIC powermill is used.

  • PDF

Hardware Design with Efficient Pipelining for High-throughput AES (높은 처리량을 가지는 AES를 위한 효율적인 파이프라인을 적용한 하드웨어 설계)

  • Antwi, Alexander O.A;Ryoo, Kwangki
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2017.10a
    • /
    • pp.578-580
    • /
    • 2017
  • IoT technology poses a lot of security threats. Various algorithms are thus employed in ensuring security of transactions between IoT devices. Advanced Encryption Standard (AES) has gained huge popularity among many other symmetric key algorithms due to its robustness till date. This paper presents a hardware based implementation of the AES algorithm. We present a four-stage pipelined architecture of the encryption and key generation. This method allowed a total plain text size of 512 bits to be encrypted in 46 cycles. The proposed hardware design achieved a maximum frequency of 1.18GHz yielding a throughput of 13Gbps and 800MHz yielding a throughput of 8.9Gbps on the 65nm and 180nm processes respectively.

  • PDF

Design of Floating-Point Multiplier for Mobile Graphics Application (모바일 그래픽스 응용을 위한 부동소수점 승산기의 설계)

  • Choi, Byeong-Yoon;Salcic, Zoran
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.12 no.3
    • /
    • pp.547-554
    • /
    • 2008
  • In this paper, two-stage pipelined floating-point multiplier (FP-MUL) is designed. The FP-MUL processor supports single precision multiplication for 3D graphic APIs, such as OpenGL and Direct3D and has area-efficient and low-latency architecture via saturated arithmetic, area-efficient sticky-bit generator, and flagged prefix adder. The FP-MUL has about 4-ns delay time under $0.13{\mu}m$ CMOS standard cell library and consists of about 7,500 gates. Because its maximum performance is about 250 MFLOPS, it can be applicable to mobile 3D graphics application.

High Throughput Parallel Design of 2-D $8{\times}8$ Integer Transforms for H.264/AVC (H.264/AVC 를 위한 높은 처리량의 2-D $8{\times}8$ integer transforms 병렬 구조 설계)

  • Sharma, Meeturani;Tiwari, Honey;Cho, Yong-Beom
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.49 no.8
    • /
    • pp.27-34
    • /
    • 2012
  • In this paper, the implementation of high throughput two-dimensional (2-D) $8{\times}8$ forward and inverse integer DCT transform for H.264 is presented. The forward and inverse transforms are represented using simple shift and addition operations. Matrix decomposition and matrix operation such as the Kronecker product and direct sum are used to reduce the computation complexity. The proposed design uses integer computations and does not use transpose memory and hence, the resource consumption is also reduced. The maximum operating frequency of the proposed pipelined architecture is 1.184 GHz, which achieves 25.27 Gpixels/sec throughput rate with the hardware cost of 44864 gates. High throughput and low hardware makes the proposed design useful for real time H.264/AVC high definition processing.

A Study on Pipelined Transform Coding and Quantization Core for H.264/AVC Encoder (H.264/AVC 인코더용 파이프라인 방식의 변환 코딩 및 양자화 코어 연구)

  • Sonh, Seung-Il
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.16 no.1
    • /
    • pp.119-126
    • /
    • 2012
  • H.264/AVC can use three transforms depending on types of residual data which are to be coded. H.264/AVC always executes $4{\times}4$ DCT transform. In $16{\times}16$ intra mode only, $4{\times}4$ Hadamard transform for luma DC coefficients and $2{\times}2$ Hadamard transform for chroma DC coefficients are performed additionally. Quantization is carried out to achieve further data compression after transform coding is completed. In this paper, the hardware implementation for DCT transform, Hadamard transform and quantization is studied. Especially, the proposed architecture adopting the pipeline technique can output a quantized result per clock cycle after 33-clock cycle latency. The proposed architecture is coded in Verilog-HDL and synthesized using Xilinx 7.1i ISE tool. The operating frequency is 106MHz at SPARTAN3S-1000. The designed IP can process maximum 33-frame at $1920{\times}1080$ HD resolution.

Design and Implementation of a Low-Complexity Real-Time Barrel Distortion Corrector for Wide-Angle Cameras (광각 카메라를 위한 저 복잡도 실시간 베럴 왜곡 보정 프로세서의 설계 및 구현)

  • Jeong, Hui-Seong;Kim, Won-Tae;Lee, Gwang-Ho;Kim, Tae-Hwan
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.50 no.6
    • /
    • pp.131-137
    • /
    • 2013
  • The barrel distortion makes serious problems in a wide-angle camera employing a lens of a short focal length. This paper presents a low-complexity hardware architecture for a real-time barrel distortion corrector and its implementation. In the proposed barrel distortion corrector, the conventional algorithm is modified so that the correction is performed incrementally, which results in the reduction of the number of required hardware modules for the distortion correction. The proposed barrel distortion corrector has a pipelined architecture so as to achieve a high-throughput correction. The correction rate is 74.86 frames per sec at the operating frequency of 314MHz in a $0.11{\mu}m$ CMOS process, where the frame size is $2048{\times}2048$. The proposed barrel distortion corrector is implemented with 14.3K logic gates.

Design and Implementation of Interference-Immune Architecture for Digital Transponder of Military Satellite (군통신위성 디지털 중계기의 간섭 회피 처리 구조 설계 및 구현)

  • Sirl, Young-Wook;Yoo, Jae-Sun;Jeong, Gun-Jin;Lee, Dae-Il;Lim, Cheol-Min
    • Journal of the Korean Society for Aeronautical & Space Sciences
    • /
    • v.42 no.7
    • /
    • pp.594-600
    • /
    • 2014
  • In modern warfare, securing communication channel by combatting opponents' electromagnetic attack is a crucial factor to win the war. Military satellite digital transponder is a communication payload of the next generation military satellite that maintains warfare networks operational in the presence of interfering signals by securely relaying signals between ground terminals. The transponder in this paper is classified as a partial processing transponder which performs cost effective secure relaying in satellite communication links. The control functions of transmission security achieve immunity to hostile interferences which may cause malicious effects on the link. In this paper, we present an efficient architecture for implementing the control mechanism. Two major ideas of pipelined processing in per-group control and software processing of blocked band information dramatically reduce the complexity of the hardware. A control code sequence showing its randomness with uniform distribution is exemplified and qualification test results are briefly presented.

Investigation of Small MPU Design and its Pipelining by Research CAD Tools (연구용 CAD툴에 의한 소형 MPU의 설계 및 파이프라인화의 고찰)

  • Lee, Su-Jeong;Park, Do-Sun;Song, Nak-Yun
    • The Transactions of the Korea Information Processing Society
    • /
    • v.1 no.4
    • /
    • pp.517-530
    • /
    • 1994
  • In this paper, design of small microprocessor unit is implemented using research purpose VHDL and CAD tools by top-down design method. For this, original basic MPU and its pipelining architectures are suggested. Once, design target, instruction sets, architecture are decided, the operation is confirmed by C language simulation, and then the operation is confirmed by checking internal register contents for given inputs in the case of VHDL simulation. Then, design layouts are made by full/semi-custom design methods by research CAD tools and related simulation is implemented. The feasibility of suggested pipelined structure for performance improvement is confirmed by simulation, and related problems and future research directions are discussed. In conclusion, the MPU design methodology is set up and the design change of architecture is possible by this paper.

  • PDF

Open-Loop Pipeline ADC Design Techniques for High Speed & Low Power Consumption (고속 저전력 동작을 위한 개방형 파이프라인 ADC 설계 기법)

  • Kim Shinhoo;Kim Yunjeong;Youn Jaeyoun;Lim Shin-ll;Kang Sung-Mo;Kim Suki
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.30 no.1A
    • /
    • pp.104-112
    • /
    • 2005
  • Some design techniques for high speed and low power pipelined 8-bit ADC are described. To perform high-speed operation with relatively low power consumption, open loop architecture is adopted, while closed loop architecture (with MDAC) is used in conventional pipeline ADC. A distributed track and hold amplifier and a cascading structure are also adopted to increase the sampling rate. To reduce the power consumption and the die area, the number of amplifiers in each stage are optimized and reduced with proposed zero-crossing point generation method. At 500-MHz sampling rate, simulation results show that the power consumption is 210mW including digital logic with 1.8V power supply. And the targeted ADC achieves ENOB of about 8-bit with input frequency up to 200-MHz and input range of 1.2Vpp (Differential). The ADC is designed using a $0.18{\mu}m$ 6-Metal 1-Poly CMOS process and occupies an area of $900{\mu}m{\times}500{\mu}m$

Enhancing Instruction Queue Efficiency with Return Address Stack in Shallow-Pipelined EISC Architecture (복귀주소 스택을 활용한 얕은 파이프라인 EISC 아키텍처의 명령어 큐 효율성 향상연구)

  • Kim, Han-Yee;Lee, SeungEun;Kim, Kwan-Young;Suh, Taeweon
    • The Journal of Korean Association of Computer Education
    • /
    • v.18 no.2
    • /
    • pp.71-81
    • /
    • 2015
  • In the EISC processor, the Instruction Queue (IQ) supporting LERI folding and loop buffering occupies roughly 20% of real estate, and its efficient utilization is a key for performance. This paper presents an architectural enhancement for the IQ utilization with return address stack (RAS) in the EISC processor. The proposed architecture eliminates the RAS corruption from the wrong-path, taking advantage of shallow pipeline. In experiments, a 4-entry RAS reduces the number of IQ flushes by up to 58.90% over baseline, and an 8-entry RAS by up to 61.28%. The experiments show up to 3.47% performance improvement with 8-entry RAS and up to 3.15% performance improvement with 4-entry RAS.