• Title/Summary/Keyword: 덧셈기

Search Result 164, Processing Time 0.031 seconds

Sign-Extension Reduction Method in Common Subexpression Elimination Circuit (Common Subexpression Elimination 회로의 부호 확장 제거)

  • Kim, Yong-Eun;Chung, Jin-Gyun;Lee, Moon-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.45 no.9
    • /
    • pp.65-70
    • /
    • 2008
  • In FIR filter design, multipliers occupy most of the area. To efficiently reduce the area occupied by multipliers, Common Subexpression Elimination (CSE) algorithm can be used instead of separate multipliers. However, the filter computation time can be increased due to the long carry propagation in CSE circuits. More specifically, when the difference of weights between the two inputs to an adder in CSE circuits is large, long carry propagation time is required due to large sign extension. In this paper, we propose a sign-extension reduction method in common subexpression elimination circuit. By Synopsys simulation using Samsung 0.35um library, it is shown that the proposed method leads to 17%, 31% and 12% reduction in the area, time delay and power consumption, respectively, compared with conventional method.

Design of a Booth's Multiplier Suitable for Embedded Systems (임베디드 시스템에 적용이 용이한 Booth 알고리즘 방식의 곱셈기 설계)

  • Moon, San-Gook
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2007.10a
    • /
    • pp.838-841
    • /
    • 2007
  • In this study, we implemented a $17^*17b$ binary digital multiplier using radix-4 Booth's algorithm. Two stage pipeline architecture was applied to achieve higher throughput and 4:2 adders were used for regular layout structure in the Wallace tree partition. To evaluate the circuit, several MPW chips were fabricated using Hynix 0.6-um 3M N-well CMOS technology. Also we proposed an efficient test methodology and did fault simulations. The chip contains 9115 transistors and the core area occupies about $1135^*1545$ mm2. The functional tests using ATS-2 tester showed that it can operate with 24 MHz clock at 5.0 V at room temperature.

  • PDF

Post-Quantum Security Evaluation Through SPECK Quantum Circuit Optimization (SPECK 양자 회로 최적화를 통한 양자 후 보안 강도 평가)

  • Jang, Kyung-Bae;Eum, Si-Woo;Song, Gyeong-Ju;Yang, Yu-jin;Seo, Hwa-Jeong
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2021.11a
    • /
    • pp.243-246
    • /
    • 2021
  • 양자 알고리즘이 수행 가능한 양자 컴퓨터는 기존 암호 시스템의 보안성을 낮추거나 깨뜨릴 수 있다. 이에 양자 컴퓨터의 공격 관점에서 기존 암호 시스템의 보안성을 재평가하는 연구들이 활발히 수행되고 있다. NIST는 대칭키 암호 시스템에 대한 양자 후 보안 강도에 평가에 Grover 알고리즘의 적용 비용을 채택하고 있다. Grover 알고리즘이 대칭키 암호 시스템의 보안성을 절반으로 줄일 수 있는 시점에서 중요한 건 공격 비용이다. 본 논문에서는 경량블록암호 SPECK 양자 회로 최적화 구현을 제시한다. ARX 구조의 SPECK에 대해 최적의 양자 덧셈기를 채택하고 병렬 덧셈을 수행한다. 그 결과, 최신 구현물과 비교하여 depth 측면에서 56%의 성능향상을 제공한다. 최종적으로, 제시하는 SPECK 양자 회로를 기반으로 Grover 알고리즘 적용 비용을 추정하고 양자 후 보안 강도를 평가한다.

Design of High Performance Dual Channel Pipelined Interpolators for H.264 Decoder (이중 채널 파이프라인 구조의 H.264용 고성능 보간 연산기 설계)

  • Lee, Chan-Ho
    • Journal of IKEEE
    • /
    • v.13 no.4
    • /
    • pp.110-115
    • /
    • 2009
  • The motion compensation is the most time-consuming and complex unit in the H.264 decoder. The performance of the motion compensation is determined by the calculation of pixel interpolation. The quarter-pixel interpolation is achieved using 6-tap horizontal or vertical FIR filters for luminance data and bilinear FIR filters for chroma data. We propose the architecture for interpolation of luminance and chroma data in H.264 decoders. It is composed of dual-channel pipelined processing elements and can interpolate integer-, half- and quarter-pixel data. The number of the processing cycles is different depending on the position. The processing elements are composed of adders and shifters to reduce the complexity while the accuracy of the pixel data are maintained. We design interpolators for luminance and chroma data using Verilog-HDL and verify the function and performance by implementing using an FPGA.

  • PDF

An Implementation of Low Power MAC using Improvement of Multiply/Subtract Operation Method and PTL Circuit Design Methodology (승/감산 연산방법의 개선 및 PTL회로설계 기법을 이용한 저전력 MAC의 구현)

  • Sim, Gi-Hak;O, Ik-Gyun;Hong, Sang-Min;Yu, Beom-Seon;Lee, Gi-Yeong;Jo, Tae-Won
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.37 no.4
    • /
    • pp.60-70
    • /
    • 2000
  • An 8$\times$8+20-bit MAC is designed with low power design methodologies at each of the system design levels. At algorithm level, a new method for multipl $y_tract operation is proposed, and it saves the transistor counts over conventional methods in hardware realization. A new Booth selector circuit using NMOS pass-transistor logic is also proposed at circuit level. It is superior to other circuits designed by CMOS in power-delay-product. And at architecture level, we adopted an ELM adder that is known to be the most efficient in power consumption, operating frequency, area and design regularity as the final adder. For registers, dynamic CMOS single-edge triggered flip-flops are used because they need less transistors per bit. To increase the operating frequency 2-stage pipeline architecture is adopted, and fast 4:2 compressors are applied in Wallace tree block. As a simulation result, the designed MAC in 0.6${\mu}{\textrm}{m}$ 1-poly 3-metal CMOS process is operated at 200MHz, 3.3V and consumed 35㎽ of power in multiply operation, and operated at 100MHz consuming 29㎽ in MAC operations, respectively.ly.

  • PDF

Low-Power FFT Design for NC-OFDM in Cognitive Radio Systems (Cognitive Radio 시스템의 NC-OFDM을 위한 저전력 FFT 설계)

  • Jang, In-Gul;Chung, Jin-Gyun
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.48 no.6
    • /
    • pp.28-33
    • /
    • 2011
  • Recently, the investigation of the cognitive radio (CR) system is actively progressed as one of the methods for using the frequency resources more efficiently. In CR systems, when the frequency band allocated to the incumbent user is not used, the unused frequency band is assigned to the secondary user. Thus, the FFT input signals corresponding to the actually used frequency band by the incumbent user are assigned as '0'. In this paper, based on the fact that there are many '0' input signals in CR systems, a low-power FFT design method for NC-OFDM is proposed. An efficient zero flag generation technique for each stage is first presented. Then, to increase the utility of the zero flag signals, modified architectures for memory and arithmetic circuits are presented. To verify the performance of the proposed algorithm, 2048 point FFT with radix-24SDFstructureisdesignedusingVerilog HDL. The simulation results show that the power consumption of FFT is reduced considerably by the proposed algorithm.

A Study on the Hardware Implementation of Competitive Learning Neural Network with Constant Adaptaion Gain and Binary Reinforcement Function (일정 적응이득과 이진 강화함수를 가진 경쟁학습 신경회로망의 디지탈 칩 개발과 응용에 관한 연구)

  • 조성원;석진욱;홍성룡
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.7 no.5
    • /
    • pp.34-45
    • /
    • 1997
  • In this paper, we present hardware implemcntation of self-organizing feature map (SOFM) neural networkwith constant adaptation gain and binary reinforcement function on FPGA. Whereas a tnme-varyingadaptation gain is used in the conventional SOFM, the proposed SOFM has a time-invariant adaptationgain and adds a binary reinforcement function in order to compensate for the lowered abilityof SOFM due to the constant adaptation gain. Since the proposed algorithm has no multiplication operation.it is much easier to implement than the original SOFM. Since a unit neuron is composed of 1adde $r_tracter and 2 adders, its structure is simple, and thus the number of neurons fabricated onFPGA is expected to he large. In addition, a few control signal: ;:rp sufficient for controlling !he neurons.Experimental results show that each componeni ot thi inipiemented neural network operates correctlyand the whole system also works well.stem also works well.

  • PDF

Floating Point Unit Design for the IEEE754-2008 (IEEE754-2008을 위한 고속 부동소수점 연산기 설계)

  • Hwang, Jin-Ha;Kim, Hyun-Pil;Park, Sang-Su;Lee, Yong-Surk
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.48 no.10
    • /
    • pp.82-90
    • /
    • 2011
  • Because of the development of Smart phone devices, the demands of high performance FPU(Floating-point Unit) becomes increasing. Therefore, we propose the high-speed single-/double-precision FPU design that includes an elementary add/sub unit and improved multiplier and compare and convert units. The most commonly used add/sub unit is optimized by the parallel rounding unit. The matrix operation is used in complex calculation something like a graphic calculation. We designed the Multiply-Add Fused(MAF) instead of multiplier to calculate the matrix more quickly. The branch instruction that is decided by the compare operation is very frequently used in various programs. We bypassed the result of the compare operation before all the pipeline processes ended to decrease the total execution time. And we included additional convert operations that are added in IEEE754-2008 standard. To verify our RTL designs, we chose four hundred thousand test vectors by weighted random method and simulated each unit. The FPU that was synthesized by Samsung's 45-nm low-power process satisfied the 600-MHz operation frequency. And we confirm a reduction in area by comparing the improved FPU with the existing FPU.

Low-area FFT Processor Structure using Common Sub-expression Sharing (Common Sub-expression Sharing을 사용한 저면적 FFT 프로세서 구조)

  • Jang, Young-Beom;Lee, Dong-Hoon
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.12 no.4
    • /
    • pp.1867-1875
    • /
    • 2011
  • In this paper, a low-area 256-point FFT structure is proposed. For low-area implementation CSD(Canonic Signed Digit) multiplier method is chosen. Because multiplication type should be less for efficient CSD multiplier application to the FFT structure, the Radix-$4^2$ algorithm is chosen for those purposes. After, in the proposed structure, the number of multiplication type is minimized in each multiplication block, the CSD multipliers are applied for implementation of multiplication. Furthermore, in CSD multiplier implementation, cell-area is more reduced through common sub-expression sharing(CSS). The Verilog-HDL coding result shows 29.9% cell area reduction in the complex multiplication part and 12.54% cell area reduction in overall 256-point FFT structure comparison with those of the conventional structure.

Hardware Module for Real-time Integer Pel Motion Estimation of H.264 (H.264 의 실시간 부호화를 위한 정수 단위 화소 움직임 예측 모듈 구조)

  • Shin, Ji-Yong;Lee, In-Jik;Kim, Shin-Dug
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2007.05a
    • /
    • pp.330-332
    • /
    • 2007
  • 본 논문에서는 정수 단위 화소 움직임 예측(ME: Motion Estimation)을 위한 Unsymmetrical-cross Multi-Hexagon-grid Search (UMHexagonS) 알고리즘을 기반으로, CIF 크기의 영상을 실시간으로 부호화 하기 위한 정수 단위 화소 움직임 예측 모듈을 제안한다. 제안하는 정수 단위 화소 움직임 예측 모듈은 32 개의 1 차원 연산유닛(PE: Processing Element) 배열, 데이터 선택/재배열 유닛, 내부버퍼, 그리고 트리 구조의 덧셈기로 구성되며, CIF 크기의 영상 100 프레임을 부호화 하기 위한 클럭 사이클을 계산하여 실험결과로 제시하였다. 그 결과 제안하는 구조는 400MHz 의 클럭 속도에서 CIF 크기의 영상을 실시간으로 부호화 할 수 있다.

  • PDF