• Title/Summary/Keyword: Montgomery modular multiplication algorithm

Search Result 35, Processing Time 0.025 seconds

A Design of Modular Multiplier Based on Improved Multi-Precision Carry Save Adder (개선된 다정도 CSA에 기반한 모듈라 곱셈기 설계)

  • Kim, Dae-Young;Lee, Jun-Yong
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.33 no.4
    • /
    • pp.223-230
    • /
    • 2006
  • The method of implementing a modular multiplier for Montgomery multiplication by using an adder depends on a selected adder. When using a CPA, there is a carry propagation problem. When using a CSA, it needs an additional calculation for a final result. The Multiplier using a Multi-precision CSA can solve both problems simultaneously by combining a CSA and a CPA. This paper presents an improved MP-CSA which reduces hardware resources and operation time by changing a MP-CSA's carry chain structure. Consequently, the proposed multiplier is more suitable for the module of long bit multiplication and exponentiation using a modular multiplier repeatedly.

Design of Modular Exponentiation Processor for RSA Cryptography (RSA 암호시스템을 위한 모듈러 지수 연산 프로세서 설계)

  • 허영준;박혜경;이건직;이원호;유기영
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.10 no.4
    • /
    • pp.3-11
    • /
    • 2000
  • In this paper, we design modular multiplication systolic array and exponentiation processor having n bits message black. This processor uses Montgomery algorithm and LR binary square and multiply algorithm. This processor consists of 3 divisions, which are control unit that controls computation sequence, 5 shift registers that save input and output values, and modular exponentiation unit. To verify the designed exponetion processor, we model and simulate it using VHDL and MAX+PLUS II. Consider a message block length of n=512, the time needed for encrypting or decrypting such a block is 59.5ms. This modular exponentiation unit is used to RSA cryptosystem.

Scalable RSA public-key cryptography processor based on CIOS Montgomery modular multiplication Algorithm (CIOS 몽고메리 모듈러 곱셈 알고리즘 기반 Scalable RSA 공개키 암호 프로세서)

  • Cho, Wook-Lae;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.22 no.1
    • /
    • pp.100-108
    • /
    • 2018
  • This paper describes a design of scalable RSA public-key cryptography processor supporting four key lengths of 512/1,024/2,048/3,072 bits. The modular multiplier that is a core arithmetic block for RSA crypto-system was designed with 32-bit datapath, which is based on the CIOS (Coarsely Integrated Operand Scanning) Montgomery modular multiplication algorithm. The modular exponentiation was implemented by using L-R binary exponentiation algorithm. The scalable RSA crypto-processor was verified by FPGA implementation using Virtex-5 device, and it takes 456,051/3,496347/26,011,947/88,112,770 clock cycles for RSA computation for the key lengths of 512/1,024/2,048/3,072 bits. The RSA crypto-processor synthesized with a $0.18{\mu}m$ CMOS cell library occupies 10,672 gate equivalent (GE) and a memory bank of $6{\times}3,072$ bits. The estimated maximum clock frequency is 147 MHz, and the RSA decryption takes 3.1/23.8/177/599.4 msec for key lengths of 512/1,024/2,048/3,072 bits.

An Addition-Chain Heuristics and Two Modular Multiplication Algorithms for Fast Modular Exponentiation (모듈라 멱승 연산의 빠른 수행을 위한 덧셈사슬 휴리스틱과 모듈라 곱셈 알고리즘들)

  • 홍성민;오상엽;윤현수
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.7 no.2
    • /
    • pp.73-92
    • /
    • 1997
  • A modular exponentiation( E$M^{$=varepsilon$}$mod N) is one of the most important operations in Public-key cryptography. However, it takes much time because the modular exponentiation deals with very large operands as 512-bit integers. Modular exponentiation is composed of repetition of modular multiplications, and the number of repetition is the same as the length of the addition-chain of the exponent(E). Therefore, we can reduce the execution time of modular exponentiation by finding shorter addition-chain(i.e. reducing the number of repetitions) or by reducing the execution time of each modular multiplication. In this paper, we propose an addition-chain heuristics and two fast modular multiplication algorithms. Of two modular multiplication algorithms, one is for modular multiplication between different integers, and the other is for modular squaring. The proposed addition-chain heuristics finds the shortest addition-chain among exisiting algorithms. Two proposed modular multiplication algorithms require single-precision multiplications fewer than 1/2 times of those required for previous algorithms. Implementing on PC, proposed algorithms reduce execution times by 30-50% compared with the Montgomery algorithm, which is the best among previous algorithms.

Design of Linear Systolic Arrays of Modular Multiplier for the Fast Modular Exponentiation (고속 모듈러 지수연산을 위한 모듈러 곱셈기의 선형 시스톨릭 어레이 설계)

  • Lee, Geon-Jik;Heo, Yeong-Jun;Yu, Gi-Yeong
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.26 no.9
    • /
    • pp.1055-1063
    • /
    • 1999
  • 공개키 암호화 시스템에서 주된 연산은 512비트 이상의 큰 수에 의한 모듈러 지수 연산으로 표현되며, 이 연산은 내부적으로 모듈러 곱셈을 반복적으로 수행함으로써 계산된다. 본 논문에서는 Montgomery 알고리즘을 분석하여 right-to-left 방식의 모듈러 지수 연산에서 공통으로 계산 가능한 부분을 이용하여 모듈러 제곱과 모듈러 곱셈을 동시에 수행하는 선형 시스톨릭 어레이를 설계한다. 설계된 시스톨릭 어레이는 VLSI 칩과 같은 하드웨어로 구현함으로써 IC 카드나 smart 카드에 이용될 수 있다.Abstract The main operation of the public-key cryptographic system is represented the modular exponentiation containing 512 or more bits and computed by performing the repetitive modular multiplications. In this paper, we analyze Montgomery algorithm and design the linear systolic array for performing modular multiplication and modular squaring simultaneously using the computable part in common in right-to-left modular exponentiation. The systolic array presented in this paper could be designed on VLSI hardware and used in IC and smart card.

Montgomery Multiplier with Very Regular Behavior

  • Yoo-Jin Baek
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.16 no.1
    • /
    • pp.17-28
    • /
    • 2024
  • As listed as one of the most important requirements for Post-Quantum Cryptography standardization process by National Institute of Standards and Technology, the resistance to various side-channel attacks is considered very critical in deploying cryptosystems in practice. In fact, cryptosystems can easily be broken by side-channel attacks, even though they are considered to be secure in the mathematical point of view. The timing attack(TA) and the simple power analysis attack(SPA) are such side-channel attack methods which can reveal sensitive information by analyzing the timing behavior or the power consumption pattern of cryptographic operations. Thus, appropriate measures against such attacks must carefully be considered in the early stage of cryptosystem's implementation process. The Montgomery multiplier is a commonly used and classical gadget in implementing big-number-based cryptosystems including RSA and ECC. And, as recently proposed as an alternative of building blocks for implementing post quantum cryptography such as lattice-based cryptography, the big-number multiplier including the Montgomery multiplier still plays a role in modern cryptography. However, in spite of its effectiveness and wide-adoption, the multiplier is known to be vulnerable to TA and SPA. And this paper proposes a new countermeasure for the Montgomery multiplier against TA and SPA. Briefly speaking, the new measure first represents a multiplication operand without 0 digits, so the resulting multiplication operation behaves in a very regular manner. Also, the new algorithm removes the extra final reduction (which is intrinsic to the modular multiplication) to make the resulting multiplier more timing-independent. Consequently, the resulting multiplier operates in constant time so that it totally removes any TA and SPA vulnerabilities. Since the proposed method can process multi bits at a time, implementers can also trade-off the performance with the resource usage to get desirable implementation characteristics.

Compact implementations of Curve Ed448 on low-end IoT platforms

  • Seo, Hwajeong
    • ETRI Journal
    • /
    • v.41 no.6
    • /
    • pp.863-872
    • /
    • 2019
  • Elliptic curve cryptography is a relatively lightweight public-key cryptography method for key generation and digital signature verification. Some lightweight curves (eg, Curve25519 and Curve Ed448) have been adopted by upcoming Transport Layer Security 1.3 (TLS 1.3) to replace the standardized NIST curves. However, the efficient implementation of Curve Ed448 on Internet of Things (IoT) devices remains underexplored. This study is focused on the optimization of the Curve Ed448 implementation on low-end IoT processors (ie, 8-bit AVR and 16-bit MSP processors). In particular, the three-level and two-level subtractive Karatsuba algorithms are adopted for multi-precision multiplication on AVR and MSP processors, respectively, and two-level Karatsuba routines are employed for multi-precision squaring. For modular reduction and finite field inversion, fast reduction and Fermat-based inversion operations are used to mitigate side-channel vulnerabilities. The scalar multiplication operation using the Montgomery ladder algorithm requires only 103 and 73 M clock cycles on AVR and MSP processors.

Design of a ECC arithmetic engine for Digital Transmission Contents Protection (DTCP) (컨텐츠 보호를 위한 DTCP용 타원곡선 암호(ECC) 연산기의 구현)

  • Kim Eui seek;Jeong Yong jin
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.30 no.3C
    • /
    • pp.176-184
    • /
    • 2005
  • In this paper, we implemented an Elliptic Curve Cryptography(ECC) processor for Digital Transmission Contents Protection (DTCP), which is a standard for protecting various digital contents in the network. Unlikely to other applications, DTCP uses ECC algorithm which is defined over GF(p), where p is a 160-bit prime integer. The core arithmetic operation of ECC is a scalar multiplication, and it involves large amount of very long integer modular multiplications and additions. In this paper, the modular multiplier was designed using the well-known Montgomery algorithm which was implemented with CSA(Carry-save Adder) and 4-level CLA(Carry-lookahead Adder). Our new ECC processor has been synthesized using Samsung 0.18 m CMOS standard cell library, and the maximum operation frequency was estimated 98 MHz, with the size about 65,000 gates. The resulting performance was 29.6 kbps, that is, it took 5.4 msec to process a 160-bit data frame. We assure that this performance is enough to be used for digital signature, encryption and decryption, and key exchanges in real time environments.

Efficient Radix-4 Systolic VLSI Architecture for RSA Public-key Cryptosystem (RSA 공개키 암호화시스템의 효율적인 Radix-4 시스톨릭 VLSI 구조)

  • Park Tae geun
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.12C
    • /
    • pp.1739-1747
    • /
    • 2004
  • In this paper, an efficient radix-4 systolic VLSI architecture for RSA public-key cryptosystem is proposed. Due to the simple operation of iterations and the efficient systolic mapping, the proposed architecture computes an n-bit modular exponentiation in n$^{2}$ clock cycles since two modular multiplications for M$_{i}$ and P$_{i}$ in each exponentiation process are interleaved, so that the hardware is fully utilized. We encode the exponent using Radix-4. SD (Signed Digit) number system to reduce the number of modular multiplications for RSA cryptography. Therefore about 20% of NZ (non-zero) digits in the exponent are reduced. Compared to conventional approaches, the proposed architecture shows shorter period to complete the RSA while requiring relatively less hardware resources. The proposed RSA architecture based on the modified Montgomery algorithm has locality, regularity, and scalability suitable for VLSI implementation.

VLSI Implementation of High Speed Variable-Length RSA Crytosystem (가변길이 고속 RSA 암호시스템의 VLSI 구현)

  • 박진영;서영호;김동욱
    • Proceedings of the IEEK Conference
    • /
    • 2002.06b
    • /
    • pp.285-288
    • /
    • 2002
  • In this paper, a new structure of 1024-bit high-speed RSA cryptosystem has been proposed and implemented in hardware to increase the operation speed and enhance the variable-length operation in the plain text. The proposed algorithm applied a radix-4 Booth algorithm and CSA(Carry Save Adder) to the Montgomery algorithm for modular multiplication As the results from implementation, the clock period was approached to one delay of a full adder and the operation speed was 150MHz. The total amount of hardware was about 195k gates. The cryptosystem operates as the effective length of the inputted modulus number, which makes variable length encryption rather than the fixed-length one. Therefore, a high-speed variable-length RSA cryptosystem could be implemented.

  • PDF