• Title/Summary/Keyword: 곱셈기

Search Result 535, Processing Time 0.024 seconds

New High Speed Parallel Multiplier for Real Time Multimedia Systems (실시간 멀티미디어 시스템을 위한 새로운 고속 병렬곱셈기)

  • Cho, Byung-Lok;Lee, Mike-Myung-Ok
    • The KIPS Transactions:PartA
    • /
    • v.10A no.6
    • /
    • pp.671-676
    • /
    • 2003
  • In this paper, we proposed a new First Partial product Addition (FPA) architecture with new compressor (or parallel counter) to CSA tree built in the process of adding partial product for improving speed in the fast parallel multiplier to improve the speed of calculating partial product by about 20% compared with existing parallel counter using full Adder. The new circuit reduces the CLA bit finding final sum by N/2 using the novel FPA architecture. A 5.14nS of multiplication speed of the $16{\times}16$ multiplier is obtained using $0.25\mu\textrm{m}$ CMOS technology. The architecture of the multiplier is easily opted for pipeline design and demonstrates high speed performance.

Design of ENMODL CLA for Low Power High Speed Multipier (고속 저전력 곱셈기에 적합한 ENMODL CLA 설계)

  • 백한석;한석붕
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.2 no.4
    • /
    • pp.91-96
    • /
    • 2001
  • In this paper we propose a new ENMODL(Enhanced-NORA-MODL) CLA(Carry-Look Ahead Adder) for high speed and low power multiplier. To reduce transistor counts, area and power dissipation we developed new-approaches. The method makes use of a dynamic CMOS logic ENMODL CLA. The advantage of ENMODL is small area and high speed The speed of ENMODL CLA is invreased by 6.27 % as compared with conventional NMOCL CLA. The proposed method was verified by HSPICE simulation and layout througth 0.6${\mu}{\textrm}{m}$ CMOS process.

  • PDF

Efficient Bit-Parallel Multiplier for Binary Field Defind by Equally-Spaced Irreducible Polynomials (Equally Spaced 기약다항식 기반의 효율적인 이진체 비트-병렬 곱셈기)

  • Lee, Ok-Suk;Chang, Nam-Su;Kim, Chang-Han;Hong, Seok-Hie
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.18 no.2
    • /
    • pp.3-10
    • /
    • 2008
  • The choice of basis for representation of element in $GF(2^m)$ affects the efficiency of a multiplier. Among them, a multiplier using redundant representation efficiently supports trade-off between the area complexity and the time complexity since it can quickly carry out modular reduction. So time of a previous multiplier using redundant representation is faster than time of multiplier using others basis. But, the weakness of one has a upper space complexity compared to multiplier using others basis. In this paper, we propose a new efficient multiplier with consideration that polynomial exponentiation operations are frequently used in cryptographic hardwares. The proposed multiplier is suitable fer left-to-right exponentiation environment and provides efficiency between time and area complexity. And so, it has both time delay of $T_A+({\lceil}{\log}_2m{\rceil})T_x$ and area complexity of (2m-1)(m+s). As a result, the proposed multiplier reduces $2(ms+s^2)$ compared to the previous multiplier using equally-spaced polynomials in area complexity. In addition, it reduces $T_A+({\lceil}{\log}_2m+s{\rceil})T_x$ to $T_A+({\lceil}{\log}_2m{\rceil})T_x$ in the time complexity.($T_A$:Time delay of one AND gate, $T_x$:Time delay of one XOR gate, m:Degree of equally spaced irreducible polynomial, s:spacing factor)

Design of A High Performance 1-D Discrete Wavelet Transform Filter Using Pipelined Architecture (파이프라인 구조를 이용한 고성능 1 차원 이산 웨이블렛 변환 필터 설계)

  • Park, Tae-Geun;Song, Chang-Joo
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2001.10a
    • /
    • pp.711-714
    • /
    • 2001
  • 본 논문에서는 파이프라인 구조를 이용하여 고성능 1 차원 이산 웨이블렛 변환 필터를 설계하였다. 각 레벨에서 입력이 다운샘플링(downsampling, decimation)되므로 각 레벨의 하드웨어를 폴딩(folding) 기법을 이용하여 곱셈기와 덧셈기를 공유함으로써 복잡도를 개선하였다. 즉, 제안한 구조에서는 레벨 2 와 레벨 3 에서 폴딩된 구조의 C.S.R(Circular Shift Register)곱셈기와 덧셈기를 사용함으로써 하드웨어 효율(hardware utilization)을 각 레벨에서 100%로 높일 수 있다. 또한, 홀수와 짝수의 샘플을 병렬로 입력함으로써 단일 입력의 시스템과 비교할 때, 동일 시간에 병렬화 만큼의 이득을 얻을 수 있었고, 필터 계수는 미러 필터(mirror filter)의 특성을 이용하여 쳐대한 고역 필터(high pass filter)와 저역 필터(low pass filter)의 계수들을 공유함으로써 곱셈기와 덧셈기의 수를 반으로 줄였다. 그리고 임계 경로(critical path)를 줄이기 위한 파이프라인 레지스터를 삽입하여 고성능 시스템을 구현하였다.

  • PDF

Low-Cost Elliptic Curve Cryptography Processor Based On Multi-Segment Multiplication (멀티 세그먼트 곱셈 기반 저비용 타원곡선 암호 프로세서)

  • LEE Dong-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.42 no.8 s.338
    • /
    • pp.15-26
    • /
    • 2005
  • In this paper, we propose an efficient $GF(2^m)$ multi-segment multiplier architecture and study its application to elliptic curve cryptography processors. The multi-segment based ECC datapath has a very small combinational multiplier to compute partial products, most of its internal data buses are word-sized, and it has only a single m bit multiplexer and a single m bit register. Hence, the resource requirements of the proposed ECC datapath can be minimized as the segment number increases and word-size is decreased. Hence, as compared to the ECC processor based on digit-serial multiplication, the proposed ECC datapath is more efficient in resource usage. The resource requirement of ECC Processor implementation depends not only on the number of basic hardware components but also on the complexity of interconnection among them. To show the realistic area efficiency of proposed ECC processors, we implemented both the ECC processors based on the proposed multi-segment multiplication and digit serial multiplication and compared their FPGA resource usages. The experimental results show that the Proposed multi-segment multiplication method allows to implement ECC coprocessors, requiring about half of FPGA resources as compared to digit serial multiplication.

Design of a $54{\times}54$-bit Multiplier Based on a Improved Conditional Sum Adder (개선된 조건 합 가산기를 이용한 $54{\times}54$-bit 곱셈기의 설계)

  • Lee, Young-Chul;Song, Min-Kyu
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.37 no.1
    • /
    • pp.67-74
    • /
    • 2000
  • In this paper, a $54{\times}54$-bit multiplier based on a improved conditional sum adder is proposed. To reduce the multiplication time, high compression-rate compressors without Booth's Encoding, and a 108-bit conditional sum adder with separated carry generation block, are developed. Furthermore, a design technique based on pass-transistor logic is utilized for optimize the multiplication time and the power consumption by about 5% compared to that of conventional one. With $0.65{\mu}m$, single-poly, triple-metal CMOS process, its chip size is $6.60{\times}6.69\;mm^2$ and the multiplication time is 135.ns at a 3.3V power supply.

  • PDF

Design of Partitioned $AB^2$ Systolic Modular Multiplier (분할된 $AB^2$ 시스톨릭 모듈러 곱셈기 설계)

  • Lee, Jin-Ho;Kim, Hyun-Sung
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.31 no.1C
    • /
    • pp.87-92
    • /
    • 2006
  • An $AB^2$ modular operation is an efficient basic operation for the public key cryptosystems and various systolic architectures for $AB^2$ modular operation have been proposed. However, these architectures have a shortcoming for cryptographic applications due to their high area complexity. Accordingly, this paper presents an partitioned $AB^2$ systolic modular multiplier over GF($2^m$). A dependency graph from the MSB $AB^2$ modular multiplication algorithm is partitioned into 1/3 to get an partitioned $AB^2$ systolic multiplier. The multiplier reduces the area complexity about 2/3 compared with the previous multiplier. The multiplier could be used as a basic building block to implement the modular exponentiation for the public key cryptosystems based on smartcard which has a restricted hardware requirements.

Design of Multipliers Optimized for CNN Inference Accelerators (CNN 추론 연산 가속기를 위한 곱셈기 최적화 설계)

  • Lee, Jae-Woo;Lee, Jaesung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.10
    • /
    • pp.1403-1408
    • /
    • 2021
  • Recently, FPGA-based AI processors are being studied actively. Deep convolutional neural networks (CNN) are basic computational structures performed by AI processors and require a very large amount of multiplication. Considering that the multiplication coefficients used in CNN inference operation are all constants and that an FPGA is easy to design a multiplier tailored to a specific coefficient, this paper proposes a methodology to optimize the multiplier. The method utilizes 2's complement and distributive law to minimize the number of bits with a value of 1 in a multiplication coefficient, and thereby reduces the number of required stacked adders. As a result of applying this method to the actual example of implementing CNN in FPGA, the logic usage is reduced by up to 30.2% and the propagation delay is also reduced by up to 22%. Even when implemented with an ASIC chip, the hardware area is reduced by up to 35% and the delay is reduced by up to 19.2%.

Design of 32-bit Floating Point Multiplier for FPGA (FPGA를 위한 32비트 부동소수점 곱셈기 설계)

  • Xuhao Zhang;Dae-Ik Kim
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.19 no.2
    • /
    • pp.409-416
    • /
    • 2024
  • With the expansion of floating-point operation requirements for fast high-speed data signal processing and logic operations, the speed of the floating-point operation unit is the key to affect system operation. This paper studies the performance characteristics of different floating-point multiplier schemes, completes partial product compression in the form of carry and sum, and then uses a carry look-ahead adder to obtain the result. Intel Quartus II CAD tool is used for describing Verilog HDL and evaluating performance results of the floating point multipliers. Floating point multipliers are analyzed and compared based on area, speed, and power consumption. The FMAX of modified Booth encoding with Wallace tree is 33.96 Mhz, which is 2.04 times faster than the booth encoding, 1.62 times faster than the modified booth encoding, 1.04 times faster than the booth encoding with wallace tree. Furthermore, compared to modified booth encoding, the area of modified booth encoding with wallace tree is reduced by 24.88%, and power consumption of that is reduced by 2.5%.

A Novel Redundant Binary Montgomery Multiplier and Hardware Architecture (새로운 잉여 이진 Montgomery 곱셈기와 하드웨어 구조)

  • Lim Dae-Sung;Chang Nam-Su;Ji Sung-Yeon;Kim Sung-Kyoung;Lee Sang-Jin;Koo Bon-Seok
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.16 no.4
    • /
    • pp.33-41
    • /
    • 2006
  • RSA cryptosystem is of great use in systems such as IC card, mobile system, WPKI, electronic cash, SET, SSL and so on. RSA is performed through modular exponentiation. It is well known that the Montgomery multiplier is efficient in general. The critical path delay of the Montgomery multiplier depends on an addition of three operands, the problem that is taken over carry-propagation makes big influence at an efficiency of Montgomery Multiplier. Recently, the use of the Carry Save Adder(CSA) which has no carry propagation has worked McIvor et al. proposed a couple of Montgomery multiplication for an ideal exponentiation, the one and the other are made of 3 steps and 2 steps of CSA respectively. The latter one is more efficient than the first one in terms of the time complexity. In this paper, for faster operation than the latter one we use binary signed-digit(SD) number system which has no carry-propagation. We propose a new redundant binary adder(RBA) that performs the addition between two binary SD numbers and apply to Montgomery multiplier. Instead of the binary SD addition rule using in existing RBAs, we propose a new addition rule. And, we construct and simulate to the proposed adder using gates provided from SAMSUNG STD130 $0.18{\mu}m$ 1.8V CMOS Standard Cell Library. The result is faster by a minimum 12.46% in terms of the time complexity than McIvor's 2 method and existing RBAs.