• Title/Summary/Keyword: Carry Save Adder


High Speed Modular Multiplication Algorithm for RSA Cryptosystem (RSA 암호 시스템을 위한 고속 모듈라 곱셈 알고리즘)

  • 조군식;조준동
    • The Journal of Korean Institute of Communications and Information Sciences / v.27 no.3C / pp.256-262 / 2002
  • This paper presents a novel radix-4 modular multiplication algorithm based on the sign estimation technique (3). The sign estimation technique detects the sign of a number represented as a carry-sum pair, and it can be implemented with a 5-bit carry look-ahead adder. The hardware speed of the cryptosystem depends on the performance of modular multiplication of large numbers. Our algorithm requires only (n/2+3) clock cycles to perform a modular multiplication with an n-bit modulus, which halves the number of clock cycles required by existing algorithms. It is therefore efficient for modular exponentiation with the large moduli used in the RSA cryptosystem. In addition, to improve hardware performance, we use a high-speed adder (7) instead of a CPA (Carry Propagation Adder) in the final stage that follows the CSA (Carry Save Adder) output. We apply the RL (Right-and-Left) binary method for modular exponentiation because it completes the exponentiation in n modular multiplication steps. Thus, one 1024-bit RSA operation can be done in n(n/2+3) clock cycles.
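
The carry-sum pair mentioned in this abstract can be illustrated with a minimal Python sketch of a 3:2 carry-save step, together with the stated cycle count n(n/2+3). The function names `carry_save_add` and `total_cycles` are hypothetical, and the sketch does not reproduce the paper's sign estimation logic or its radix-4 datapath.

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """Reduce three operands to a (sum, carry) pair without propagating carries."""
    partial_sum = a ^ b ^ c                       # per-bit sum, carries ignored
    carry = ((a & b) | (a & c) | (b & c)) << 1    # per-bit majority, weighted by 2
    return partial_sum, carry


def total_cycles(n: int) -> int:
    """Clock cycles for one n-bit exponentiation, using the abstract's n(n/2+3)."""
    return n * (n // 2 + 3)


s, c = carry_save_add(13, 7, 9)
print(s, c, s + c)          # the pair (3, 26) still represents 13 + 7 + 9 = 29
print(total_cycles(1024))   # 527360
```

Only the last step, adding `s` and `c`, needs a carry-propagating adder, which is why the final-stage adder gets separate attention in the abstract.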

A Scalable Word-based RSA Cryptoprocessor with PCI Interface Using Pseudo Carry Look-ahead Adder (가상 캐리 예측 덧셈기와 PCI 인터페이스를 갖는 분할형 워드 기반 RSA 암호 칩의 설계)

  • Gwon, Taek-Won;Choe, Jun-Rim
    • Journal of the Institute of Electronics Engineers of Korea SD / v.39 no.8 / pp.34-41 / 2002
  • This paper describes a scalable implementation method for a word-based RSA cryptoprocessor using a pseudo carry look-ahead adder. The basic organization of the modular multiplier consists of two layers of carry-save adders (CSA) and a reduced carry generation and propagation scheme, called the pseudo carry look-ahead adder, for the high-speed final addition. The proposed modular multiplier does not need complicated shift and alignment blocks to generate the next word at each clock cycle. Therefore, the proposed architecture reduces the hardware resources and speeds up the modular computation. We implemented a single-chip 1024-bit RSA cryptoprocessor based on the word-based modular multiplier with 256 datapaths in 0.5 $\mu\textrm{m}$ SOG technology after verifying the proposed architecture on an FPGA with a PCI bus.

Implementation of RSA Exponentiator Based on Radix-$2^k$ Modular Multiplication Algorithm (Radix-$2^k$ 모듈라 곱셈 알고리즘 기반의 RSA 지수승 연산기 설계)

  • 권택원;최준림
    • Journal of the Korea Institute of Information Security & Cryptology / v.12 no.2 / pp.35-44 / 2002
  • In this paper, an implementation method for an RSA exponentiator based on the Radix-$2^k$ modular multiplication algorithm is presented and verified. We use the Booth recoding algorithm to implement Radix-$2^k$ modular multiplication, and we implement a radix-16 modular multiplier using a 2K-byte memory and a CSA (carry-save adder) array with a delay of two full adders and three half adders. For the high-speed final addition we use a reduced carry generation and propagation scheme called the pseudo carry look-ahead adder. Furthermore, the optimum value of the radix is presented through the trade-off between the operating frequency and the throughput for a given silicon technology. We have verified a 1,024-bit RSA processor using an Altera FPGA EP2K1500E device and Samsung 0.3 $\mu\textrm{m}$ technology. With the radix-16 modular multiplication algorithm, (n+4+1)/4 clock cycles are needed, and a 1,024-bit modular exponentiation is performed in 5.38 ms at 50 MHz.

The Montgomery Multiplier Using Scalable Carry Save Adder (분할형 CSA를 이용한 Montgomery 곱셈기)

  • 하재철;문상재
    • Journal of the Korea Institute of Information Security & Cryptology / v.10 no.3 / pp.77-83 / 2000
  • This paper presents a new modular multiplier for Montgomery multiplication using an iterative small carry-save adder. The proposed multiplier is flexible and well suited to long-bit multiplication because it scales with the available design area and the required computing time. We describe the word-based Montgomery algorithm and the architecture of the multiplier. Our analysis and simulation show that the proposed multiplier provides area/time trade-offs for area-limited designs such as IC cards.

Bit-level Simulator for CORDIC Arithmetic based on carry-save adder (CORDIC 연산기 구현을 위한 Bit-level 하드웨어 시뮬레이션)

  • 이성수;이정아
    • Proceedings of the Korea Database Society Conference / 1995.12a / pp.173-176 / 1995
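  • This paper studies CORDIC (Coordinate Rotation Digital Computer) arithmetic [1][2], which is useful for the vector transformations and orthogonal transformations required by the signal-processing hardware used in multimedia information processing, and which allows a wide range of operations (elementary functions, including trigonometric functions) to be implemented with a single unified algorithm. To achieve high-speed operation, a CSA (Carry Save Adder) is chosen as the fast adder of the CORDIC unit. Before the CORDIC unit is realized in hardware, a bit-level simulator is used to explain the problems that can arise from the characteristics of the CSA and to check whether the results approach the desired values when a known solution [3] is applied. By varying the bit widths and observing the resulting error, the simulator is intended to help in realizing a valid CORDIC unit.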
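
The idea of checking CORDIC error against bit width can be sketched with a small fixed-point simulation in Python. This is a plain rotation-mode CORDIC with ordinary integer addition, assuming |angle| ≤ π/2; it does not model the carry-save representation or the correction method [3] discussed in the paper, and `cordic_fixed`, `frac_bits`, and `iterations` are hypothetical names.

```python
import math


def cordic_fixed(angle: float, frac_bits: int = 16, iterations: int = 16):
    """Approximate (cos(angle), sin(angle)) with shift-and-add fixed-point CORDIC."""
    scale = 1 << frac_bits
    # Precomputed rotation angles atan(2^-i), quantized to frac_bits.
    atans = [round(math.atan(2.0 ** -i) * scale) for i in range(iterations)]
    # CORDIC gain; starting from 1/gain removes the scaling at the end.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    x, y, z = round(scale / gain), 0, round(angle * scale)
    for i in range(iterations):
        if z >= 0:   # rotate counter-clockwise
            x, y, z = x - (y >> i), y + (x >> i), z - atans[i]
        else:        # rotate clockwise
            x, y, z = x + (y >> i), y - (x >> i), z + atans[i]
    return x / scale, y / scale


for bits in (8, 12, 16):
    c, s = cordic_fixed(0.5, frac_bits=bits, iterations=bits)
    print(bits, abs(c - math.cos(0.5)), abs(s - math.sin(0.5)))
```

Running the loop for several values of `frac_bits` shows how the error shrinks as the word length grows, which is the kind of question a bit-level simulator answers before hardware is committed.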

A Design of HAS-160 Processor for Smartcard Application (스마트카드용 HAS-160 프로세서 설계)

  • Kim, Hae-ju;Shin, Kyung-Wook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2009.10a / pp.913-916 / 2009
  • This paper describes the hardware design of a hash processor that implements the HAS-160 algorithm, which has been adopted as a Korean standard. To achieve high-speed operation with a small area, the arithmetic operation is implemented using a hybrid structure of 5:3 and 3:2 carry-save adders and a carry-select adder. The HAS-160 processor, synthesized with a 0.35-$\mu$m CMOS cell library, has 17,600 gates. It computes a 160-bit hash code from a 512-bit message block in 82 clock cycles and achieves a throughput of 312 Mbps at a clock frequency of 50 MHz with a 3.3-V supply.
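
The throughput figure follows directly from the block size, cycle count, and clock frequency given in the abstract. A minimal check in Python, with `throughput_mbps` as a hypothetical helper name:

```python
def throughput_mbps(block_bits: int, cycles_per_block: int, clock_mhz: float) -> float:
    """Bits processed per second, in Mbps, for a block-iterative design."""
    return block_bits / cycles_per_block * clock_mhz


# 512-bit block, 82 cycles per block, 50 MHz clock (values from the abstract).
print(round(throughput_mbps(512, 82, 50), 1))  # 312.2
```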

A Efficient Architecture of MBA-based Parallel MAC for High-Speed Digital Signal Processing (고속 디지털 신호처리를 위한 MBA기반 병렬 MAC의 효율적인 구조)

  • 서영호;김동욱
    • Journal of the Institute of Electronics Engineers of Korea SD / v.41 no.7 / pp.53-61 / 2004
  • In this paper, we propose a new MAC (Multiplier-Accumulator) architecture for high-speed multiply-accumulate operations. We use the MBA (Modified radix-4 Booth Algorithm), which is based on the 1's complement number system, and a CSA (Carry Save Adder) to add the partial products. During the addition of the partial products, the signed numbers produced in 1's complement form by Booth encoding are converted into 2's complement signed numbers in the CSA tree. Because a 2-bit CLA (Carry Look-ahead Adder) is used to add the lower bits of the partial products, the input bit width of the final adder and the overall critical-path delay are reduced. The proposed MAC was applied to the DWT (Discrete Wavelet Transform) filtering operation for JPEG2000, which showed that it is suitable for practical applications. Finally, we confirmed the improvement in hardware resources and delay through comparison with the previous architecture.
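
For readers unfamiliar with radix-4 Booth recoding, the following Python sketch shows the standard two's complement form, in which each pair of multiplier bits becomes one digit in {-2, -1, 0, 1, 2}. This is not the paper's 1's complement variant with conversion inside the CSA tree, and the names `booth_radix4_digits` and `booth_multiply` are hypothetical.

```python
def booth_radix4_digits(multiplier: int, bits: int) -> list[int]:
    """Recode a two's complement multiplier of even width into radix-4 digits, LSB first."""
    if bits % 2:
        raise ValueError("bits must be even")
    y = multiplier & ((1 << bits) - 1)     # two's complement bit pattern
    digits = []
    prev = 0                               # implicit bit y[-1] = 0
    for i in range(0, bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)  # digit value in {-2, -1, 0, 1, 2}
        prev = b1
    return digits


def booth_multiply(x: int, y: int, bits: int) -> int:
    """Sum the bits/2 partial products; hardware would reduce them in a CSA tree."""
    return sum(d * x * 4 ** k for k, d in enumerate(booth_radix4_digits(y, bits)))


print(booth_radix4_digits(-3, 8))  # [1, -1, 0, 0]
print(booth_multiply(7, -3, 8))    # -21
```

Halving the number of partial products is what makes the recoding attractive in front of a carry-save adder tree.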

Design of Two-dimensional Digital Filter by Research and Educational CAD Tools (연구교육용 CAD 툴에 의한 이차원 디지탈필터의 설계)

  • Song, Nak-Un;Kim, Jong-Jun
    • The Transactions of the Korea Information Processing Society / v.3 no.5 / pp.1187-1197 / 1996
  • In this work, a two-dimensional FIR digital filter is designed and simulated using research and educational CAD tools. The two-dimensional digital filter consists mainly of a one-dimensional digital filter and a line memory. To speed up the one-dimensional digital filter, multiplications are carried out by hardwired shifting, with the filter coefficients represented in CSD format, while a carry-save adder and a Manchester adder are used for addition. The designed digital filter operates at up to 30 MHz in VHDL simulation and operates normally in IRSIM simulation of the layout made with the Berkeley CAD tools.
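
The hardwired-shifting multiplication with CSD (canonical signed digit) coefficients can be sketched as follows. The code is a generic illustration for non-negative integer coefficients, not the filter described in the paper; `to_csd` and `csd_multiply` are hypothetical names.

```python
def to_csd(value: int) -> list[int]:
    """Return CSD digits (-1, 0, or 1), LSB first, for a non-negative integer."""
    digits = []
    while value != 0:
        if value & 1:
            d = 2 - (value & 3)   # +1 if value % 4 == 1, -1 if value % 4 == 3
            value -= d            # leaves a value divisible by 4 when d is nonzero
        else:
            d = 0
        digits.append(d)
        value >>= 1
    return digits


def csd_multiply(x: int, coeff: int) -> int:
    """Multiply by a constant using only shifts and adds/subtracts."""
    acc = 0
    for shift, d in enumerate(to_csd(coeff)):
        if d:
            acc += d * (x << shift)   # each nonzero digit costs one adder input
    return acc


print(to_csd(7))           # [-1, 0, 0, 1], i.e. 7 = 8 - 1
print(csd_multiply(5, 7))  # 35
```

Because no two adjacent CSD digits are nonzero, a coefficient needs fewer shifted terms than its plain binary form, so fewer operands reach the carry-save adder.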

Design of a ECC arithmetic engine for Digital Transmission Contents Protection (DTCP) (컨텐츠 보호를 위한 DTCP용 타원곡선 암호(ECC) 연산기의 구현)

  • Kim Eui seek;Jeong Yong jin
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.3C / pp.176-184 / 2005
  • In this paper, we implemented an Elliptic Curve Cryptography (ECC) processor for Digital Transmission Contents Protection (DTCP), which is a standard for protecting various digital contents in the network. Unlike other applications, DTCP uses an ECC algorithm defined over GF(p), where p is a 160-bit prime integer. The core arithmetic operation of ECC is scalar multiplication, which involves a large number of very long integer modular multiplications and additions. The modular multiplier was designed using the well-known Montgomery algorithm, implemented with a CSA (Carry-save Adder) and a 4-level CLA (Carry-lookahead Adder). Our new ECC processor has been synthesized using the Samsung 0.18 $\mu$m CMOS standard cell library; the maximum operating frequency was estimated at 98 MHz, with a size of about 65,000 gates. The resulting performance was 29.6 kbps, that is, it took 5.4 msec to process a 160-bit data frame. This performance is sufficient for digital signature, encryption and decryption, and key exchange in real-time environments.

Design of a Correlator and an Access-code Generator for Bluetooth Baseband (블루투스 기저대역을 위한 상관기와 액세스 코드 생성 모듈의 설계)

  • Hwang Sun-Won;Lee Sang-Hoon;Shin Wee-Jae
    • Journal of the Institute of Convergence Signal Processing / v.6 no.4 / pp.206-211 / 2005
  • We describe the design of a correlator and an access-code generator for a Bluetooth system. They are used for connection setup, packet decision, and clock synchronization between Bluetooth units. The correlator consists of two blocks: a carry-save adder based on a Wallace tree and a threshold-value decision block. Through sliding-window correlation, it detects valid packets and performs clock synchronization for a 1.0 Mbps input signal. The access-code generator also consists of two blocks: a BCH (Bose-Chaudhuri-Hocquenghem) cyclic encoder and a control block. It generates access codes through a four-step generation process based on the Bluetooth standard. To solve the synchronization problem, we use a memory to hold the pseudo-random sequence. The proposed correlator and access-code generator were coded in VHDL. The modules were implemented on a Xilinx FPGA, which confirmed the simulation results. Synthesis results show a critical delay of 4.689 ns and a correlation margin that allows correlation errors of up to 7 bits.
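
The sliding-window correlation with a threshold decision can be sketched in a few lines of Python. The match count below stands in for the Wallace-tree carry-save adder, and the comparison stands in for the threshold block; `sliding_correlator`, `sync_word`, and `max_errors` are hypothetical names, and the short 8-bit word is a toy value, not a Bluetooth access code.

```python
def sliding_correlator(bits: list[int], sync_word: list[int], max_errors: int) -> list[int]:
    """Return the start positions where the window matches sync_word within max_errors bits."""
    n = len(sync_word)
    hits = []
    for start in range(len(bits) - n + 1):
        window = bits[start:start + n]
        # Count agreeing bit positions; in hardware this sum is formed by an adder tree.
        matches = sum(1 for a, b in zip(window, sync_word) if a == b)
        if matches >= n - max_errors:   # threshold-value decision
            hits.append(start)
    return hits


sync = [1, 0, 1, 1, 0, 0, 1, 0]
received = [0, 0] + [1, 0, 1, 1, 0, 1, 1, 0] + [1, 1]   # one bit error in the sync word
print(sliding_correlator(received, sync, max_errors=1))
```

Raising `max_errors` makes detection more tolerant of channel errors at the cost of more false matches, which is the trade-off behind the correlation margin reported in the abstract.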
