• Title/Summary/Keyword: arithmetic unit

Search Result 167, Processing Time 0.026 seconds

Study of the Superconductive Pipelined Multi-Bit ALU (초전도 Pipelined Multi-Bit ALU에 대한 연구)

  • Kim, Jin-Young;Ko, Ji-Hoon;Kang, Joon-Hee
    • Progress in Superconductivity
    • /
    • v.7 no.2
    • /
    • pp.109-113
    • /
    • 2006
  • The Arithmetic Logic Unit (ALU) is a core element of a computer processor that performs arithmetic and logic operations on the operands in computer instruction words. We have developed and tested an RSFQ multi-bit ALU constructed with half adder unit cells. To reduce the complexity of the ALU, We used half adder unit cells. The unit cells were constructed of one half adder and three de switches. The timing problem in the complex circuits has been a very important issue. We have calculated the delay time of all components in the circuit by using Josephson circuit simulation tools of XIC, $WRspice^{TM}$, and Julia. To make the circuit work faster, we used a forward clocking scheme. This required a careful design of timing between clock and data pulses in ALU. The designed ALU had limited operation functions of OR, AND, XOR, and ADD. It had a pipeline structure. The fabricated 1-bit, 2-bit, and 4-bit ALU circuits were tested at a few kilo-hertz clock frequency as well as a few tens giga-hertz clock frequency, respectively. For high-speed tests, we used an eye-diagram technique. Our 4-bit ALU operated correctly at up to 5 GHz clock frequency.

  • PDF

Hardware Design of Pipelined Special Function Arithmetic Unit for Mobile Graphics Application (모바일 그래픽 응용을 위한 파이프라인 구조 특수 목적 연산회로의 하드웨어 설계)

  • Choi, Byeong-Yoon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.17 no.8
    • /
    • pp.1891-1898
    • /
    • 2013
  • To efficiently execute 3D graphic APIs, such as OpenGL and Direct3D, special purpose arithmetic unit(SFU) which supports floating-point sine, cosine, reciprocal, inverse square root, base-two exponential, and logarithmic operations is designed. The SFU uses second order minimax approximation method and lookup table method to satisfy both error less than 2 ulp(unit in the last place) and high speed operation. The designed circuit has about 2.3-ns delay time under 65nm CMOS standard cell library and consists of about 23,300 gates. Due to its maximum performance of 400 MFLOPS and high accuracy, it can be efficiently applicable to mobile 3D graphics application.

UNITARY ANALOGUES OF A GENERALIZED NUMBER-THEORETIC SUM

  • Traiwat Intarawong;Boonrod Yuttanan
    • Communications of the Korean Mathematical Society
    • /
    • v.38 no.2
    • /
    • pp.355-364
    • /
    • 2023
  • In this paper, we investigate the sums of the elements in the finite set $\{x^k:1{\leq}x{\leq}{\frac{n}{m}},\;gcd_u(x,n)=1\}$, where k, m and n are positive integers and gcdu(x, n) is the unitary greatest common divisor of x and n. Moreover, for some cases of k and m, we can give the explicit formulae for the sums involving some well-known arithmetic functions.

The design of quantization and inverse quantization unit (Q_IQ unit) module with video encoder (비디오 인코더용 양자화 및 역양자화기(Q_IQ unit) 모듈의 설계)

  • 김은원;조원경
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.34C no.11
    • /
    • pp.20-28
    • /
    • 1997
  • In this paper, quantization and inverse quantizatio unit, a sa component of MPEG-2 moving picture compression system, ar edesigned. In the processing of quantization, this design adopted newly designed arithmetic units in which quantization matrices and scale code was expressed with SD(signed-digit) code. In the arithmetic unit of inverse quantization, quantization scale code, which has 5-bits length, is splited into two pieces; 2-bits for control code, 3-bits for quantization data, and the method to devise quantization step size is proposed. The design was coded with VHDL and synthesis results in that it consumed about 6,110 gates, and operating speed is 52MHz.

  • PDF

RADIX-2 BUTTERFLY 연산회로의 설계

  • 최병윤;신경욱;유종근;임충빈;김봉열;이문기
    • Proceedings of the Korean Institute of Communication Sciences Conference
    • /
    • 1986.04a
    • /
    • pp.177-180
    • /
    • 1986
  • A high performance Butterfly Arithmetic Unit for FFT processor using two adders is proposed in this papers, which is Based on the distributed and merged arithmetic. Due to simple and easy architecture to implement, this proposed processor is well suited to systolic FFT processor. Simulation was performance using YSLOG (Yonsei logic simulator) on IBM AT computer, to verify logic. By using 3um double Metal CMOS technology,Butterfly arithmetic have been achieved in 1.2 usec.

  • PDF

Design of a Floating Point Multiplier for IEEE 754 Single-Precision Operations (IEEE 754 단정도 부동 소수점 연산용 곱셈기 설계)

  • Lee, Ju-Hun;Chung, Tae-Sang
    • Proceedings of the KIEE Conference
    • /
    • 1999.11c
    • /
    • pp.778-780
    • /
    • 1999
  • Arithmetic unit speed depends strongly on the algorithms employed to realize the basic arithmetic operations.(add, subtract multiply, and divide) and on the logic design. Recent advances in VLSI have increased the feasibility of hardware implementation of floating point arithmetic units and microprocessors require a powerful floating-point processing unit as a standard option. This paper describes the design of floating-point multiplier for IEEE 754-1985 Single-Precision operation. Booth encoding algorithm method to reduce partial products and a Wallace tree of 4-2 CSA is adopted in fraction multiplication part to generate the $32{\times}32$ single-precision product. New scheme of rounding and sticky-bit generation is adopted to reduce area and timing. Also there is a true sign generator in this design. This multiplier have been implemented in a ALTERA FLEX EPF10K70RC240-4.

  • PDF

A VLSI Architecture of Systolic Array for FET Computation (고속 퓨리어 변환 연산용 VLSI 시스토릭 어레이 아키텍춰)

  • 신경욱;최병윤;이문기
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.25 no.9
    • /
    • pp.1115-1124
    • /
    • 1988
  • A two-dimensional systolic array for fast Fourier transform, which has a regular and recursive VLSI architecture is presented. The array is constructed with identical processing elements (PE) in mesh type, and due to its modularity, it can be expanded to an arbitrary size. A processing element consists of two data routing units, a butterfly arithmetic unit and a simple control unit. The array computes FFT through three procedures` I/O pipelining, data shuffling and butterfly arithmetic. By utilizing parallelism, pipelining and local communication geometry during data movement, the two-dimensional systolic array eliminates global and irregular commutation problems, which have been a limiting factor in VLSI implementation of FFT processor. The systolic array executes a half butterfly arithmetic based on a distributed arithmetic that can carry out multiplication with only adders. Also, the systolic array provides 100% PE activity, i.e., none of the PEs are idle at any time. A chip for half butterfly arithmetic, which consists of two BLC adders and registers, has been fabricated using a 3-um single metal P-well CMOS technology. With the half butterfly arithmetic execution time of about 500 ns which has been obtained b critical path delay simulation, totla FFT execution time for 1024 points is estimated about 16.6 us at clock frequency of 20MHz. A one-PE chip expnsible to anly size of array is being fabricated using a 2-um, double metal, P-well CMOS process. The chip was layouted using standard cell library and macrocell of BLC adder with the aid of auto-routing software. It consists of around 6000 transistors and 68 I/O pads on 3.4x2.8mm\ulcornerarea. A built-i self-testing circuit, BILBO (Built-In Logic Block Observation), was employed at the expense of 3% hardware overhead.

  • PDF

An Arithmetic System over Finite Fields

  • Park, Chun-Myoung
    • Journal of information and communication convergence engineering
    • /
    • v.9 no.4
    • /
    • pp.435-440
    • /
    • 2011
  • This paper propose the method of constructing the highly efficiency adder and multiplier systems over finite fields. The addition arithmetic operation over finite field is simple comparatively because that addition arithmetic operation is analyzed by each digit modP summation independently. But in case of multiplication arithmetic operation, we generate maximum k=2m-2 degree of ${\alpha}^k$ terms, therefore we decrease k into m-1 degree using irreducible primitive polynomial. We propose two method of control signal generation for the purpose of performing above decrease process. One method is the combinational logic expression and the other method is universal signal generation. The proposed method of constructing the highly adder/multiplier systems is as following. First of all, we obtain algorithms for addition and multiplication arithmetic operation based on the mathematical properties over finite fields, next we construct basic cell of A-cell and M-cell using T-gate and modP cyclic gate. Finally we construct adder module and multiplier module over finite fields after synthesizing ${\alpha}^k$ generation module and control signal CSt generation module with A-cell and M-cell. Next, we constructing the arithmetic operation unit over finite fields. Then, we propose the future research and prospects.

Algebraic Accuracy Verification for Division-by-Convergence based 24-bit Floating-point Divider Complying with OpenGL (Division-by-Convergence 방식을 사용하는 24-비트 부동소수점 제산기에 대한 OpenGL 정확도의 대수적 검증)

  • Yoo, Sehoon;Lee, Jungwoo;Kim, Kichul
    • Journal of IKEEE
    • /
    • v.17 no.3
    • /
    • pp.346-351
    • /
    • 2013
  • Low-cost and low-power are important requirements in mobile systems. Thus, when a floating-point arithmetic unit is needed, 24-bit floating-point format can be more useful than 32-bit floating-point format. However, a 24-bit floating-point arithmetic unit can be risky because it usually has lower accuracy than a 32-bit floating-point arithmetic unit. Consecutive floating-point operations are performed in 3D graphic processors. In this case, the verification of the floating-point operation accuracy is important. Among 3D graphic arithmetic operations, the floating-point division is one of the most difficult operations to satisfy the accuracy of $10^{-5}$ which is the required accuracy in OpenGL ES 3.0. No 24-bit floating-point divider, whose accuracy is algebraically verified, has been reported. In this paper, a 24-bit floating-point divider is analyzed and it is algebraically verified that its accuracy satisfies the OpenGL requirement.