• Title/Summary/Keyword: Adders

Search Result 129, Processing Time 0.029 seconds

Implementation and Performance Enhancement of Arithmetic Adder for Fully Homomorphic Encrypted Data (완전동형암호로 암호화된 데이터에 적합한 산술 가산기의 구현 및 성능향상에 관한 연구)

  • Seo, Kyongjin;Kim, Pyong;Lee, Younho
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.27 no.3
    • /
    • pp.413-426
    • /
    • 2017
  • In this paper, we propose an adder that can be applied to data encrypted with a fully homomorphic encryption scheme and an addition method with improved performance that can be applied when adding multiple data. The proposed arithmetic adder is based on the Kogge-Stone Adder method with the optimal circuit level among the existing hardware-based arithmetic adders and suitable to apply the cryptographic SIMD (Single Instruction for Multiple Data) function on encrypted data. The proposed multiple addition method does not add a large number of data by repeatedly using Kogge-Stone Adder which guarantees perfect addition result. Instead, when three or more numbers are to be added, three numbers are added to C (Carry-out) and S (Sum) using the full-adder circuit implementation. Adding with Kogge-Stone Adder is only when two numbers are finally left to be added. The performance of the proposed method improves dramatically as the number of data increases.

frequency Domain processor nor ADSL G.LITE Modem (ADSL G.LITE모뎀을 위한 주파수 영역 프로세서의 설계)

  • 고우석;기준석;고태호;윤대희
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.26 no.12C
    • /
    • pp.233-239
    • /
    • 2001
  • Among the operations in frequency domain for ADSL G.LITE Modem to perform, FFT and FEQ are most computation-intensive part, of which many researches have been focused on the efficient implementation. Previous papers suggested hardwares suitable for ADSL G.DMT system, which is not feasible for simple G.LITE system. The analysis of frequency domain operations and computational efficiency according to the allocation of hardware resources is performed in this paper. The suggested processor has the structure of one real multiplier and two real adders connected in parallel, which can perform the operations efficiently through the pipeline- and/or parallel-type job scheduling. The suggested processor uses less hardware resources than Kiss\`s ALU structure or FFT/IFFT processor suggested by Wang, so the suggested one is more suitable for G.LITE system than previous works.

  • PDF

Design and Implementation of Depolarized FOG based on Digital Signal Processing (All DSP 기반의 비편광 FOG 설계 및 제작)

  • Yoon, Yeong-Gyoo;Kim, Jae-Hyung;Lee, Sang-Hyuk
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.14 no.8
    • /
    • pp.1776-1782
    • /
    • 2010
  • The interferometric fiber optic gyroscopes (FOGs) are well known as sensors of rotation, which are based on Sagnac effect, and have been under development for a number of years to meet a wide range of performance requirements. This paper describes the development of open-loop FOG and digital signal processing techniques implemented on FPGA. Our primary goal was to obtain intermediate accuracy (pointing grade) with a good bias stability (0.22deg) and scale factor stability, extremely low angle random walk (0.07deg) and significant cost savings by using a single mode fiber. A secondary goal is to design all digital FOG signal processing algorithms with which the SNR at the digital demodulator output is enhanced substantially due to processing gain. The Cascaded integrator bomb(CIC) type of decimation filter only requires adders and shift registers, low cost processors which has low computing power still can used in this all digital FOG processor.

Design of DSP based Depolarized Fiber Optic Gyroscope (DSP 기반의 비편광 광자이로스코프 설계)

  • Yoon, Yeong-gyoo;Joo, Min-sik;Kim, Yeong-jin;Kim, Jae-hyoung
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2009.10a
    • /
    • pp.153-156
    • /
    • 2009
  • The interferometric fiber optic gyroscopes (FOGs) are well known as sensors of rotation, which are based on Sagnac effect, and have been under development for a number of years to meet a wide range of performance requirements. This paper describes the development of open-loop FOG and digital signal processing techniques implemented on FPGA. Our primary goal was to obtain intermediate accuracy (pointing grade) with a good bias stability ($0.22^{\circ}/hr$) and scale factor stability, extremely low angle random walk ($0.07^{\circ}/\sqrt{hr}$) and significant cost savings by using a single mode fiber. A secondary goal is to design all digital FOG signal processing algorithms with which the SNR at the digital demodulator output is enhanced substantially due to processing gain. The CIC type of decimation block only requires adders and shift registers, low cost processors which has low computing power still can used in this all digital FOG processor.

  • PDF

Compact CNN Accelerator Chip Design with Optimized MAC And Pooling Layers (MAC과 Pooling Layer을 최적화시킨 소형 CNN 가속기 칩)

  • Son, Hyun-Wook;Lee, Dong-Yeong;Kim, HyungWon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.9
    • /
    • pp.1158-1165
    • /
    • 2021
  • This paper proposes a CNN accelerator which is optimized Pooling layer operation incorporated in Multiplication And Accumulation(MAC) to reduce the memory size. For optimizing memory and data path circuit, the quantized 8bit integer weights are used instead of 32bit floating-point weights for pre-training of MNIST data set. To reduce chip area, the proposed CNN model is reduced by a convolutional layer, a 4*4 Max Pooling, and two fully connected layers. And all the operations use specific MAC with approximation adders and multipliers. 94% of internal memory size reduction is achieved by simultaneously performing the convolution and the pooling operation in the proposed architecture. The proposed accelerator chip is designed by using TSMC65nmGP CMOS process. That has about half size of our previous paper, 0.8*0.9 = 0.72mm2. The presented CNN accelerator chip achieves 94% accuracy and 77us inference time per an MNIST image.

Design of Multipliers Optimized for CNN Inference Accelerators (CNN 추론 연산 가속기를 위한 곱셈기 최적화 설계)

  • Lee, Jae-Woo;Lee, Jaesung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.10
    • /
    • pp.1403-1408
    • /
    • 2021
  • Recently, FPGA-based AI processors are being studied actively. Deep convolutional neural networks (CNN) are basic computational structures performed by AI processors and require a very large amount of multiplication. Considering that the multiplication coefficients used in CNN inference operation are all constants and that an FPGA is easy to design a multiplier tailored to a specific coefficient, this paper proposes a methodology to optimize the multiplier. The method utilizes 2's complement and distributive law to minimize the number of bits with a value of 1 in a multiplication coefficient, and thereby reduces the number of required stacked adders. As a result of applying this method to the actual example of implementing CNN in FPGA, the logic usage is reduced by up to 30.2% and the propagation delay is also reduced by up to 22%. Even when implemented with an ASIC chip, the hardware area is reduced by up to 35% and the delay is reduced by up to 19.2%.

System Development and IC Implementation of High-quality and High-performance Image Downscaler Using 2-D Phase-correction Digital Filters (2차원 위상 교정 디지털 필터를 이용한 고성능/고화질의 영상 축소기 시스템 개발 및 IC 구현)

  • 강봉순;이영호;이봉근
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.2 no.3
    • /
    • pp.93-101
    • /
    • 2001
  • In this paper, we propose an image downscaler used in multimedia video applications, such as DTV, TV-PIP, PC-video, camcorder, videophone and so on. The proposed image downscaler provides a scaled image of high-quality and high-performance. This paper will explain the scaling theory using two-dimensional digital filters. It is the method that removes an aliasing noise and decreases the hardware complexity, compared with Pixel-drop and Upsamling. Also, this paper will prove it improves scaling precisians and decreases the loss of data, compared with the Scaler32, the Bt829 of Brooktree, and the SAA7114H of Philips. The proposed downscaler consists of the following four blocks: line memory, vertical scaler, horizontal scaler, and FIFO memory. In order to reduce the hardware complexity, the using digital filters are implemented by the multiplexer-adder type scheme and their all the coefficients can be simply implemented by using shifters and adders. It also decreases the loss of high frequency data because it provides the wider BW of 6MHz as adding the compensation filter. The proposed downscaler is modeled by using the Verilog-HDL and the model is verified by using the Cadence simulator. After the verification is done, the model is synthesized into gates by using the Synopsys. The synthesized downscaler is Placed and routed by the Mentor with the IDEC-C632 0.65${\mu}{\textrm}{m}$ library for further IC implementation. The IC master is fixed in size by 4,500${\mu}{\textrm}{m}$$\times$4,500${\mu}{\textrm}{m}$. The active layout size of the proposed downscaler is 2,528${\mu}{\textrm}{m}$$\times$3,237${\mu}{\textrm}{m}$.

  • PDF

A New Structural Carry-out Circuit in Full Adder (새로운 구조의 전가산기 캐리 출력 생성회로)

  • Kim, Young-Woon;Seo, Hae-Jun;Han, Se-Hwan;Cho, Tae-Won
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.46 no.12
    • /
    • pp.1-9
    • /
    • 2009
  • A full adders is an important component in applications of digital signal processors and microprocessors. Thus it is imperative to improve the power dissipation and operating speed for designing a full adder. We propose a new adder with modified version of conventional static CMOS and pass transistor logic. The carry-out generation circuit of the proposed full adder is different from the conventional XOR-XNOR structure. The output Cout of module III is generated from input A, B and Cin directly without passing through module I as in conventional structure. Thus output Cout is faster by reducing operation step. The proposed module III uses the static CMOS logic style, which results full-swing operation and good driving capability. The proposed 1bit full adder has the advantages over the conventional static CMOS, CPL, TGA, TFA, HPSC, 14T, and TSAC logic. The delay time is improved by 4.3% comparing to the best value known. PDP(power delay product) is improved by 9.8% comparing to the best value. Simulation has been carried out using a $0.18{\mu}m$ CMOS design rule for simulation purposes. The physical design has been verified using HSPICE.

Design of digital decimation filter for sigma-delta A/D converters (시그마-델타 A/D 컨버터용 디지털 데시메이션 필터 설계)

  • Byun, San-Ho;Ryu, Seong-Young;Choi, Young-Kil;Roh, Hyung-Dong;Nam, Hyun-Seok;Roh, Jeong-Jin
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.44 no.2
    • /
    • pp.34-45
    • /
    • 2007
  • Digital decimation filter is inevitable in oversampled sigma-delta A/D converters for the sake of reducing the oversampled rate to Nyquist rate. This paper presented a Verilog-HDL design and implementation of an area-efficient digital decimation filter that provides time-to-market advantage for sigma-delta analog-to-digital converters. The digital decimation filter consists of CIC(cascaded integrator-comb) filter and two cascaded half-band FIR filters. A CSD(canonical signed digit) representation of filter coefficients is used to minimize area and reduce in hardware complexity of multiplication arithmetic. Coefficient multiplications are implemented by using shifters and adders. This three-stage decimation filter is fabricated in $0.25-{\mu}m$ CMOS technology and incorporates $1.36mm^2$ of active area, shows 4.4 mW power consumption at clock rate of 2.8224 MHz. Measured results show that this digital decimation filter is suitable for digital audio decimation filters.

Design of a Low Power Reconfigurable DSP with Fine-Grained Clock Gating (정교한 클럭 게이팅을 이용한 저전력 재구성 가능한 DSP 설계)

  • Jung, Chan-Min;Lee, Young-Geun;Chung, Ki-Seok
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.45 no.2
    • /
    • pp.82-92
    • /
    • 2008
  • Recently, many digital signal processing(DSP) applications such as H.264, CDMA and MP3 are predominant tasks for modern high-performance portable devices. These applications are generally computation-intensive, and therefore, require quite complicated accelerator units to improve performance. Designing such specialized, yet fixed DSP accelerators takes lots of effort. Therefore, DSPs with multiple accelerators often have a very poor time-to-market and an unacceptable area overhead. To avoid such long time-to-market and high-area overhead, dynamically reconfigurable DSP architectures have attracted a lot of attention lately. Dynamically reconfigurable DSPs typically employ a multi-functional DSP accelerator which executes similar, yet different multiple kinds of computations for DSP applications. With this type of dynamically reconfigurable DSP accelerators, the time to market reduces significantly. However, integrating multiple functionalities into a single IP often results in excessive control and area overhead. Therefore, delay and power consumption often turn out to be quite excessive. In this thesis, to reduce power consumption of dynamically reconfigurable IPs, we propose a novel fine-grained clock gating scheme, and to reduce size of dynamically reconfigurable IPs, we propose a compact multiplier-less multiplication unit where shifters and adders carry out constant multiplications.