• Title/Summary/Keyword: 가변 비트 곱셈기

Search Result 11, Processing Time 0.022 seconds

Newton-Raphson's Double Precision Reciprocal Using 32 bit multiplier (32 비트 곱셈기를 사용한 뉴톤-랍손 배정도실수 역수 계산기)

  • Cho, Gyeong-Yeon
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.18 no.6
    • /
    • pp.31-37
    • /
    • 2013
  • Modern graphic processors, multimedia processors and audio processors mostly use floating-point number. High-level language such as C and Java use both single precision and double precision floating-point number. In this paper, an algorithm which computes the reciprocal of double precision floating-point number using a 32 bit multiplier is proposed. It divides the mantissa of double precision floating-point number to upper part and lower part, and calculates the reciprocal of the upper part with Newton-Raphson algorithm. And it computes the reciprocal of double precision floating-point number with calculated upper part reciprocal as the initial value. Since the number of multiplications performed by the proposed algorithm is dependent on the mantissa of floating-point number, the average number of multiplications per an operation is derived from some reciprocal tables with varying sizes.

Scalable multiplier and inversion unit on normal basis for ECC operation (ECC 연산을 위한 가변 연산 구조를 갖는 정규기저 곱셈기와 역원기)

  • 이찬호;이종호
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.40 no.12
    • /
    • pp.80-86
    • /
    • 2003
  • Elliptic curve cryptosystem(ECC) offers the highest security per bit among the known publick key system. The benefit of smaller key size makes ECC particularly attractive for embedded applications since its implementation requires less memory and processing power. In this paper, we propose a new multiplier structure with configurable output sizes and operation cycles. The number of output bits can be freely chosen in the new architecture with the performance-area trade-off depending on the application. Using the architecture, a 193-bit normal basis multiplier and inversion unit are designed in GF(2$^{m}$ ). It is implemented using HDL and 0.35${\mu}{\textrm}{m}$ CMOS technology and the operation is verified by simulation.

Elliptic Curve Cryptography Coprocessors Using Variable Length Finite Field Arithmetic Unit (크기 가변 유한체 연산기를 이용한 타원곡선 암호 프로세서)

  • Lee Dong-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.42 no.1
    • /
    • pp.57-67
    • /
    • 2005
  • Fast scalar multiplication of points on elliptic curve is important for elliptic curve cryptography applications. In order to vary field sizes depending on security situations, the cryptography coprocessors should support variable length finite field arithmetic units. To determine the effective variable length finite field arithmetic architecture, two well-known curve scalar multiplication algorithms were implemented on FPGA. The affine coordinates algorithm must use a hardware division unit, but the projective coordinates algorithm only uses a fast multiplication unit. The former algorithm needs the division hardware. The latter only requires a multiplication hardware, but it need more space to store intermediate results. To make the division unit versatile, we need to add a feedback signal line at every bit position. We proposed a method to mitigate this problem. For multiplication in projective coordinates implementation, we use a widely used digit serial multiplication hardware, which is simpler to be made versatile. We experimented with our implemented ECC coprocessors using variable length finite field arithmetic unit which has the maximum field size 256. On the clock speed 40 MHz, the scalar multiplication time is 6.0 msec for affine implementation while it is 1.15 msec for projective implementation. As a result of the study, we found that the projective coordinates algorithm which does not use the division hardware was faster than the affine coordinate algorithm. In addition, the memory implementation effectiveness relative to logic implementation will have a large influence on the implementation space requirements of the two algorithms.

Design of ECC Scalar Multiplier based on a new Finite Field Division Algorithm (새로운 유한체 나눗셈기를 이용한 타원곡선암호(ECC) 스칼라 곱셈기의 설계)

  • 김의석;정용진
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.5C
    • /
    • pp.726-736
    • /
    • 2004
  • In this paper, we proposed a new scalar multiplier structure needed for an elliptic curve cryptosystem(ECC) over the standard basis in GF(2$^{163}$ ). It consists of a bit-serial multiplier and a divider with control logics, and the divider consumes most of the processing time. To speed up the division processing, we developed a new division algorithm based on the extended Euclid algorithm. Dynamic data dependency of the Euclid algorithm has been transformed to static and fixed data flow by a localization technique, to make it independent of the input and field polynomial. Compared to other existing scalar multipliers, the new scalar multiplier requires smaller gate counts with improved processor performance. It has been synthesized using Samsung 0.18 um CMOS technology, and the maximum operating frequency is estimated 250 MHz. The resulting performance is 148 kbps, that is, it takes 1.1 msec to process a 163-bit data frame. We assure that this performance is enough to be used for digital signature, encryption/decryption, and key exchanges in real time environments.

Hardware Design of High Performance Arithmetic Unit with Processing of Complex Data for Multimedia Processor (복소수 데이터 처리가 가능한 멀티미디어 프로세서용 고성능 연산회로의 하드웨어 설계)

  • Choi, Byeong-yoon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.20 no.1
    • /
    • pp.123-130
    • /
    • 2016
  • In this paper, a high-performance arithmetic unit which can efficiently accelerate a number of algorithms for multimedia application was designed. The 3-stage pipelined arithmetic unit can execute 38 operations for complex and fixed-point data by using efficient configuration for four 16-bit by 16-bit multipliers, new sign extension method for carry-save data, and correction constant scheme to eliminate sign-extension in compression operation of multiple partial multiplication results. The arithmetic unit has about 300-MHz operating frequency and about 37,000 gates on 45nm CMOS technology and its estimated performance is 300 MCOPS(Million Complex Operations Per Second). Because the arithmetic unit has high processing rate and supports a number of operations dedicated to various applications, it can be efficiently applicable to multimedia processors.

Variable Radix-Two Multibit Coding and Its VLSI Implementation of DCT/IDCT (가변길이 다중비트 코딩을 이용한 DCT/IDCT의 설계)

  • 김대원;최준림
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.39 no.12
    • /
    • pp.1062-1070
    • /
    • 2002
  • In this paper, variable radix-two multibit coding algorithm is presented and applied in the implementation of discrete cosine transform(DCT) and inverse discrete cosine transform(IDCT). Variable radix-two multibit coding means the 2k SD (signed digit) representation of overlapped multibit scanning with variable shift method. SD represented by 2k generates partial products, which can be easily implemented with shifters and adders. This algorithm is most powerful for the hardware implementation of DCT/IDCT with constant coefficient matrix multiplication. This paper introduces the suggested algorithm, it's proof and the implementation of DCT/IDCT The implemented IDCT chip with 8 PEs(Processing Elements) and one transpose memory runs at a tate of 400 Mpixels/sec at 54MHz frequency for high speed parallel signal processing, and it's verified in HDTV and MPEG decoder.

A Variable Latency Newton-Raphson's Floating Point Number Reciprocal Computation (가변 시간 뉴톤-랍손 부동소수점 역수 계산기)

  • Kim Sung-Gi;Cho Gyeong-Yeon
    • The KIPS Transactions:PartA
    • /
    • v.12A no.2 s.92
    • /
    • pp.95-102
    • /
    • 2005
  • The Newton-Raphson iterative algorithm for finding a floating point reciprocal which is widely used for a floating point division, calculates the reciprocal by performing a fixed number of multiplications. In this paper, a variable latency Newton-Raphson's reciprocal algorithm is proposed that performs multiplications a variable number of times until the error becomes smaller than a given value. To find the reciprocal of a floating point number F, the algorithm repeats the following operations: '$'X_{i+1}=X=X_i*(2-e_r-F*X_i),\;i\in\{0,\;1,\;2,...n-1\}'$ with the initial value $'X_0=\frac{1}{F}{\pm}e_0'$. The bits to the right of p fractional bits in intermediate multiplication results are truncated, and this truncation error is less than $'e_r=2^{-p}'$. The value of p is 27 for the single precision floating point, and 57 for the double precision floating point. Let $'X_i=\frac{1}{F}+e_i{'}$, these is $'X_{i+1}=\frac{1}{F}-e_{i+1},\;where\;{'}e_{i+1}, is less than the smallest number which is representable by floating point number. So, $X_{i+1}$ is approximate to $'\frac{1}{F}{'}$. Since the number of multiplications performed by the proposed algorithm is dependent on the input values, the average number of multiplications per an operation is derived from many reciprocal tables $(X_0=\frac{1}{F}{\pm}e_0)$ with varying sizes. The superiority of this algorithm is proved by comparing this average number with the fixed number of multiplications of the conventional algorithm. Since the proposed algorithm only performs the multiplications until the error gets smaller than a given value, it can be used to improve the performance of a reciprocal unit. Also, it can be used to construct optimized approximate reciprocal tables. The results of this paper can be applied to many areas that utilize floating point numbers, such as digital signal processing, computer graphics, multimedia scientific computing, etc.

A Variable Latency Goldschmidt's Floating Point Number Divider (가변 시간 골드스미트 부동소수점 나눗셈기)

  • Kim Sung-Gi;Song Hong-Bok;Cho Gyeong-Yeon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.9 no.2
    • /
    • pp.380-389
    • /
    • 2005
  • The Goldschmidt iterative algorithm for a floating point divide calculates it by performing a fixed number of multiplications. In this paper, a variable latency Goldschmidt's divide algorithm is proposed, that performs multiplications a variable number of times until the error becomes smaller than a given value. To calculate a floating point divide '$\frac{N}{F}$', multifly '$T=\frac{1}{F}+e_t$' to the denominator and the nominator, then it becomes ’$\frac{TN}{TF}=\frac{N_0}{F_0}$'. And the algorithm repeats the following operations: ’$R_i=(2-e_r-F_i),\;N_{i+1}=N_i{\ast}R_i,\;F_{i+1}=F_i{\ast}R_i$, i$\in${0,1,...n-1}'. The bits to the right of p fractional bits in intermediate multiplication results are truncated, and this truncation error is less than ‘$e_r=2^{-p}$'. The value of p is 29 for the single precision floating point, and 59 for the double precision floating point. Let ’$F_i=1+e_i$', there is $F_{i+1}=1-e_{i+1},\;e_{i+1}',\;where\;e_{i+1}, If '$[F_i-1]<2^{\frac{-p+3}{2}}$ is true, ’$e_{i+1}<16e_r$' is less than the smallest number which is representable by floating point number. So, ‘$N_{i+1}$ is approximate to ‘$\frac{N}{F}$'. Since the number of multiplications performed by the proposed algorithm is dependent on the input values, the average number of multiplications per an operation is derived from many reciprocal tables ($T=\frac{1}{F}+e_t$) with varying sizes. 1'he superiority of this algorithm is proved by comparing this average number with the fixed number of multiplications of the conventional algorithm. Since the proposed algorithm only performs the multiplications until the error gets smaller than a given value, it can be used to improve the performance of a divider. Also, it can be used to construct optimized approximate reciprocal tables. The results of this paper can be applied to many areas that utilize floating point numbers, such as digital signal processing, computer graphics, multimedia, scientific computing, etc

A Variable Latency Newton-Raphson's Floating Point Number Reciprocal Square Root Computation (가변 시간 뉴톤-랍손 부동소수점 역수 제곱근 계산기)

  • Kim Sung-Gi;Cho Gyeong-Yeon
    • The KIPS Transactions:PartA
    • /
    • v.12A no.5 s.95
    • /
    • pp.413-420
    • /
    • 2005
  • The Newton-Raphson iterative algorithm for finding a floating point reciprocal square mot calculates it by performing a fixed number of multiplications. In this paper, a variable latency Newton-Raphson's reciprocal square root algorithm is proposed that performs multiplications a variable number of times until the error becomes smaller than a given value. To find the rediprocal square root of a floating point number F, the algorithm repeats the following operations: '$X_{i+1}=\frac{{X_i}(3-e_r-{FX_i}^2)}{2}$, $i\in{0,1,2,{\ldots}n-1}$' with the initial value is '$X_0=\frac{1}{\sqrt{F}}{\pm}e_0$'. The bits to the right of p fractional bits in intermediate multiplication results are truncated and this truncation error is less than '$e_r=2^{-p}$'. The value of p is 28 for the single precision floating point, and 58 for the double precision floating point. Let '$X_i=\frac{1}{\sqrt{F}}{\pm}e_i$, there is '$X_{i+1}=\frac{1}{\sqrt{F}}-e_{i+1}$, where '$e_{i+1}{<}\frac{3{\sqrt{F}}{{e_i}^2}}{2}{\mp}\frac{{Fe_i}^3}{2}+2e_r$'. If '$|\frac{\sqrt{3-e_r-{FX_i}^2}}{2}-1|<2^{\frac{\sqrt{-p}{2}}}$' is true, '$e_{i+1}<8e_r$' is less than the smallest number which is representable by floating point number. So, $X_{i+1}$ is approximate to '$\frac{1}{\sqrt{F}}$. Since the number of multiplications performed by the proposed algorithm is dependent on the input values, the average number of multiplications Per an operation is derived from many reciprocal square root tables ($X_0=\frac{1}{\sqrt{F}}{\pm}e_0$) with varying sizes. The superiority of this algorithm is proved by comparing this average number with the fixed number of multiplications of the conventional algorithm. Since the proposed algorithm only performs the multiplications until the error gets smaller than a given value, it can be used to improve the performance of a reciprocal square root unit. Also, it can be used to construct optimized approximate reciprocal square root tables. The results of this paper can be applied to many areas that utilize floating point numbers, such as digital signal processing, computer graphics, multimedia, scientific computing, etc.

Design of Programmable and Configurable Elliptic Curve Cryptosystem Coprocessor (재구성 가능한 타원 곡선 암호화 프로세서 설계)

  • Lee Jee-Myong;Lee Chanho;Kwon Woo-Suk
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.42 no.6 s.336
    • /
    • pp.67-74
    • /
    • 2005
  • Crypto-systems have difficulties in designing hardware due to the various standards. We propose a programmable and configurable architecture for cryptography coprocessors to accommodate various crypto-systems. The proposed architecture has a 32 bit I/O interface and internal bus width, and consists of a programmable finite field arithmetic unit, an input/output unit, a register file, and a control unit. The crypto-system is determined by the micro-codes in memory of the control unit, and is configured by programming the micro-codes. The coprocessor has a modular structure so that the arithmetic unit can be replaced if a substitute has an appropriate 32 bit I/O interface. It can be used in many crypto-systems by re-programming the micro-codes for corresponding crypto-system or by replacing operation units. We implement an elliptic curve crypto-processor using the proposed architecture and compare it with other crypto-processors