• Title/Summary/Keyword: arithmetic circuit

Search Result 115, Processing Time 0.025 seconds

An Efficient Hardware Implementation of Square Root Computation over GF(p) (GF(p) 상의 제곱근 연산의 효율적인 하드웨어 구현)

  • Choe, Jun-Yeong;Shin, Kyung-Wook
    • Journal of IKEEE
    • /
    • v.23 no.4
    • /
    • pp.1321-1327
    • /
    • 2019
  • This paper describes an efficient hardware implementation of modular square root (MSQR) computation over GF(p), which is the operation needed to map plaintext messages to points on elliptic curves for elliptic curve (EC)-ElGamal public-key encryption. Our method supports five sizes of elliptic curves over GF(p) defined by the National Institute of Standards and Technology (NIST) standard. For the Koblitz curves and the pseudorandom curves with 192-bit, 256-bit, 384-bit and 521-bit, the Euler's Criterion based on the characteristic of the modulo values was applied. For the elliptic curves with 224-bit, the Tonelli-Shanks algorithm was simplified and applied to compute MSQR. The proposed method was implemented using the finite field arithmetic circuit with 32-bit datapath and memory block of elliptic curve cryptography (ECC) processor, and its hardware operation was verified by implementing it on the Virtex-5 field programmable gate array (FPGA) device. When the implemented circuit operates with a 50 MHz clock, the computation of MSQR takes about 18 ms for 224-bit pseudorandom curves and about 4 ms for 256-bit Koblitz curves.

A Design and Verification of an Efficient Control Unit for Optical Processor (광프로세서를 위한 효율적인 제어회로 설계 및 검증)

  • Lee Won-Joo
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.43 no.4 s.310
    • /
    • pp.23-30
    • /
    • 2006
  • This paper presents design andd verification of a circuit that improves the control-operation problems of Stored Program Optical Computer (SPOC), which is an optical computer using $LiNbO_3$ optical switching element. Since the memory of SPOC takes the Delay Line Memory (DLM) architecture and instructions that are needless of operands should go though memory access stages, SPOC memory have problems; it takes immoderate access time and unnecessary operations are executed in Arithmetic Logical Unit (ALU) because desired operations can't be selectively executed. In this paper, improvement on circuit has been achieved by removing the memory access of instructions that are needless of operands by decoding instructions before locating operand. Unnecessary operations have been reduced by sending operands to some specific operational units, not to all the operational units in ALD. We show that total execution time of a program is minimized by using the Dual Instruction Register(DIR) architecture.

A Study on the Information Reversibility of Quantum Logic Circuits (양자 논리회로의 정보 가역성에 대한 고찰)

  • Park, Dong-Young
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.12 no.1
    • /
    • pp.189-194
    • /
    • 2017
  • The reversibility of a quantum logic circuit can be realized when two reversible conditions of information reversible and energy reversible circuits are satisfied. In this paper, we have modeled the computation cycle required to recover the information reversibility from the multivalued quantum logic to the original state. For modeling, we used a function embedding method that uses a unitary switch as an arithmetic exponentiation switch. In the quantum logic circuit, if the adjoint gate pair is symmetric, the unitary switch function shows the balance function characteristic, and it takes 1 cycle operation to recover the original information reversibility. Conversely, if it is an asymmetric structure, it takes two cycle operations by the constant function. In this paper, we show that the problem of 2-cycle restoration according to the asymmetric structure when the hybrid MCT gate is realized with the ternary M-S gate can be solved by equivalent conversion of the asymmetric gate to the gate of the symmetric structure.

Design of Radix-4 FFT Processor Using Twice Perfect Shuffle (이중 완전 Shuffle을 이용한 Radix-4 FFT 프로세서의 설계)

  • Hwang, Myoung-Ha;Hwang, Ho-Jung
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.27 no.2
    • /
    • pp.144-150
    • /
    • 1990
  • This paper describes radix-4 Fast Fourier Transform (FFT) Processor designed with the new twice perfect shuffle developed from a perfect shuffle used in radix-2 FFT algorithm. The FFT Processor consists of a butterfly arithmetic circuit, address generators for input, output and coefficient, input and output registers and controller. Also, it requires the external ROM for storage of coefficient and RAM for input and output. The butterfly circuit includes 12 bit-serial ($16{\times}8$) multipliers, adders, subtractors and delay shift registers. Operating on 25 MHz two phase clock, this processor can compute 256 point FFT in 6168 clocks, i.e. 247 us and provides flexibility by allowing the user to select any size among 4,16,64,and256points. Being fabricated with 2-um double metal CMOS process, it includes about 28000 transistors and 55 pads in $8.0{\times}8.2mm^2$area.

  • PDF

Low Power ADC Design for Mixed Signal Convolutional Neural Network Accelerator (혼성신호 컨볼루션 뉴럴 네트워크 가속기를 위한 저전력 ADC설계)

  • Lee, Jung Yeon;Asghar, Malik Summair;Arslan, Saad;Kim, HyungWon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.11
    • /
    • pp.1627-1634
    • /
    • 2021
  • This paper introduces a low-power compact ADC circuit for analog Convolutional filter for low-power neural network accelerator SOC. While convolutional neural network accelerators can speed up the learning and inference process, they have drawback of consuming excessive power and occupying large chip area due to large number of multiply-and-accumulate operators when implemented in complex digital circuits. To overcome these drawbacks, we implemented an analog convolutional filter that consists of an analog multiply-and-accumulate arithmetic circuit along with an ADC. This paper is focused on the design optimization of a low-power 8bit SAR ADC for the analog convolutional filter accelerator We demonstrate how to minimize the capacitor-array DAC, an important component of SAR ADC, which is three times smaller than the conventional circuit. The proposed ADC has been fabricated in CMOS 65nm process. It achieves an overall size of 1355.7㎛2, power consumption of 2.6㎼ at a frequency of 100MHz, SNDR of 44.19 dB, and ENOB of 7.04bit.

A Dual band CMOS Voltage Controlled Oscillator of an arithmetic functionality with a 50% duty cycle buffer (50%듀티 싸이클 버퍼를 가진 산술 연산 구조의 이중 대역 CMOS 전압 제어 발진기)

  • 한윤철;김광일;이상철;변기영;윤광섭
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.41 no.10
    • /
    • pp.79-86
    • /
    • 2004
  • This paper proposes a dual band Voltage Controlled Oscillator(VCO) with a standard 0.3${\mu}{\textrm}{m}$ CMOS process to generate 1.07GHz and 2.07GHz. The proposed VCO architecture with 50% duty cycle circuit and a half adder(HA) was capable of producing a frequency two times higher than that of the conventional VCOs. The measurement results demonstrate that the gain of VCO and power dissipation are 561MHz/V and 14.6mW, respectively. The phase noises of the dual band VCO are measured to be -102.55dBc/Hz and -95.88dBc/Hz at 2MHz offset from 1.07GHz and 2.07GHz, respectively.

MPEG-4 Audio Decoding Technique using Integer Operations for Real-time Playback on Embedded Processor (휴대용 임베디드 프로세서에서의 MPEG-4 오디오의 실시간 재생을 위한 정수 디코딩 기법)

  • Cha, Kyung-Ae
    • Journal of Broadcast Engineering
    • /
    • v.13 no.3
    • /
    • pp.415-418
    • /
    • 2008
  • Some embedded microprocessors do not have an FPU(Floating Point Unit) due to a circuit complexity and power consumption. The performance speed of MPEG-4 AAC decoder on this hardware environment would be slower than corresponding speed for playing back of the decoded results. Therefore, irritating and high-pitched noises are interleaved in the original the audio data. So, in order to play MPEG-4 AAC file on such PDA, a new algorithm that transforms floating-point arithmetic to one with integers, is needed. We have developed a transformation algorithm from floating-point operation to integer operation and implemented the PDA's AAC Player. We also show the efficiency of our proposed method with the experimental results.

음성인식용 DTW PE의 IC화를 위한 ADD 및 ABS 회로의 설계

  • 정광재;문홍진;최규훈;김종교
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.15 no.8
    • /
    • pp.648-658
    • /
    • 1990
  • There are many methods for speed up counting in speech recongition. A multiple processing method is the one way to achieve the aim using systolic array. This arithmetic operation by the array is achieved pipelining skill. And the operation is multiprocessing by processing element(PE) that is incresing counting efficiencies. The DTW PE cell is seperated into three large blocks. "MIN" is the one block for counting accumulated minimum distance, "ADD" block calculated these minimum distances, and "ABS" seeks for the absolut values to the total sum of local distances. We have accomplished circuit design and verification about the "ADD" and "ABS" blocks, and performed total layout '||'&'||' DRC(design rule check) using 3um CMOS N-Well rule base.le check) using 3$\mu$m CMOS N-Well rule base.

  • PDF

High Precision Logarithm Converters for Binary Floating Point Approximation Operations (고속 부동소수점 근사연산용 로그변환 회로)

  • Moon, Sang-Ook
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2010.05a
    • /
    • pp.809-811
    • /
    • 2010
  • In most floating-point operations related with 3D graphic applications for mobile devices, properly approximated data calculations with reduced complexity and low power are preferable to exactly rounded floating-point operations with unnecessary preciseness with cost. Among all the sophisticated floating-point arithmetic operations, multiplication and division are the most complicated and time-consuming, and they can be transformed into addition and subtraction repectively by adopting the logarithmic conversion. In this process, the most important factor for performance is how high we can make an approximation of the logarithm conversion. In this paper, we cover the trends in studying the logarithm conversion circuit designs. We also discuss the important factor in design issues and the applicable fields in detail.

  • PDF

Iris Ciphertext Authentication System Based on Fully Homomorphic Encryption

  • Song, Xinxia;Chen, Zhigang;Sun, Dechao
    • Journal of Information Processing Systems
    • /
    • v.16 no.3
    • /
    • pp.599-611
    • /
    • 2020
  • With the application and promotion of biometric technology, biometrics has become more and more important to identity authentication. In order to ensure the privacy of the user, the biometrics cannot be stored or manipulated in plaintext. Aiming at this problem, this paper analyzes and summarizes the scheme and performance of the existing biometric authentication system, and proposes an iris-based ciphertext authentication system based on fully homomorphic encryption using the FV scheme. The implementation of the system is partly powered by Microsoft's SEAL (Simple Encrypted Arithmetic Library). The entire system can complete iris authentication without decrypting the iris feature template, and the database stores the homomorphic ciphertext of the iris feature template. Thus, there is no need to worry about the leakage of the iris feature template. At the same time, the system does not require a trusted center for authentication, and the authentication is completed on the server side directly using the one-time MAC authentication method. Tests have shown that when the system adopts an iris algorithm with a low depth of calculation circuit such as the Hamming distance comparison algorithm, it has good performance, which basically meets the requirements of real application scenarios.