• Title/Summary/Keyword: algorithm for multiplication

Search Result 372, Processing Time 0.031 seconds

Implementation of RSA Exponentiator Based on Radix-$2^k$ Modular Multiplication Algorithm (Radix-$2^k$ 모듈라 곱셈 알고리즘 기반의 RSA 지수승 연산기 설계)

  • 권택원;최준림
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.12 no.2
    • /
    • pp.35-44
    • /
    • 2002
  • In this paper, an implementation method of RSA exponentiator based on Radix-$2^k$ modular multiplication algorithm is presented and verified. We use Booth receding algorithm to implement Radix-$2^k$ modular multiplication and implement radix-16 modular multiplier using 2K-byte memory and CSA(carry-save adder) array - with two full adder and three half adder delays. For high speed final addition we use a reduced carry generation and propagation scheme called pseudo carry look-ahead adder. Furthermore, the optimum value of the radix is presented through the trade-off between the operating frequency and the throughput for given Silicon technology. We have verified 1,024-bit RSA processor using Altera FPGA EP2K1500E device and Samsung 0.3$\mu\textrm{m}$ technology. In case of the radix-16 modular multiplication algorithm, (n+4+1)/4 clock cycles are needed and the 1,024-bit modular exponentiation is performed in 5.38ms at 50MHz.

Efficient Multiplication of Boolean Matrices and Algorithm for D-Class Computation (D-클래스 계산을 위한 불리언 행렬의 효율적 곱셈 및 알고리즘)

  • Han, Jae-Il;Shin, Bum-Joo
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.12 no.2
    • /
    • pp.68-78
    • /
    • 2007
  • D-class is defined as a set of equivalent $n{\times}n$ boolean matrices according to a given equivalence relation. The D-class computation requires the multiplication of three boolean matrices for each of all possible triples of $n{\times}n$ boolean matrices. However, almost all the researches on boolean matrices focused on the efficient multiplication of only two boolean matrices and a few researches have recently been shown to deal with the multiplication of all boolean matrices. The paper suggests a mathematical theory that enables the efficient multiplication for all possible boolean matrix triples and the efficient computation of all D-classes, and discusses algorithms designed with the theory and their execution results.

  • PDF

Low power filter structure using Short-length running convolution (Short-length running convolution을 사용한 저전력 필터 구조)

  • Oh, Se-Man;Lee, Won-Sang;Jang, Young-Beom
    • Proceedings of the IEEK Conference
    • /
    • 2006.06a
    • /
    • pp.263-264
    • /
    • 2006
  • In this paper, an efficient and fast algorithm to reduce calculation amount of FIR(Finite Impulse Responses) filtering is proposed. Proposed algorithm enables arbitrary size of parallel processing, and their structures are also easily derived. Furthermore, it is shown that the number of multiplication/sample is reduced, and number of instructions using MAC(Multiplication and Accumulation) processor are also reduced. For theoretical improvement, numbers of sub filters are compared with those of conventional algorithm. In addition to the theoretical improvement, it is shown that number of element for hardwired implementation are reduced comparison to those of the conventional algorithm.

  • PDF

Optimizing Multiprecision Squaring for Efficient Public Key Cryptography on 8-bit Sensor Nodes (8 비트 센서 노드 상에서 효율적인 공개키 암호를 위한 다정도 제곱 연산의 최적화)

  • Kim, Il-Hee;Park, Yong-Su;Lee, Youn-Ho
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.36 no.6
    • /
    • pp.502-510
    • /
    • 2009
  • Multiprecision squaring is one of the most significant algorithms in the core public key cryptography operation. The aim of this work is to present a new improved squaring algorithm compared with the MIRACL's multi precision squaring algorithm in which the previous work [1] on multiprecision multiplication is implemented. First, previous works on multiprecision multiplication and standard squaring are analyzed. Then, our new Lazy Doubling squaring algorithm is introduced. In MIRACLE library [3], Scott's Carry-Catcher Hybrid multiplication technique [1] is applied to implementation of multiprecision multiplication and squaring. Experimental results of the Carry-Catcher hybrid squaring algorithm and the proposed Lazy Doubling squaring algorithm both of which are tested on Atmega128 CPU show that proposed idea has achieved significant performance improvements. The proposed Lazy Doubling Squaring algorithm reduces addition instructions by the fact $a_0\;{\ast}\;2\;+\;a_1\;{\ast}\;2\;+\;...\;+\;a_{n-1}\;{\ast}\;2\;+\;a_n\;{\ast}\;2\;=\;(a_0\;+\;a_1\;+\;...\;+\;a_{n-1}\;+\;a_n)\;{\ast}\;2$ while the standard squaring algorithm reduces multiplication instructions by the fact $S_{ij}\;=\;x_i\;{\ast}\;x_j\;=\;S_{ij}$. Experimental results show that the proposed squaring method is 25% faster than that in MIRACL.

An Analysis on the Process of Conceptual Understanding of Fifth Grade Elementary School Students about the Multiplication of Decimal with Base-Ten Blocks (십진블록을 활용한 소수의 곱셈 지도에서 초등학교 5학년 학생들의 개념적 이해 과정 분석)

  • Kim, Soo-Jeong;Pang, Jeong-Suk
    • Journal of Elementary Mathematics Education in Korea
    • /
    • v.11 no.1
    • /
    • pp.1-21
    • /
    • 2007
  • The purpose of this study was to propose instructional methods using base-ten blocks in teaching the multiplication of decimal for 5th grade students by analyzing the process of their conceptual comprehension of multiplication of decimal. The students in this study were found to understand various meanings of operations (e.g., repeated addition, bundling, and area) by modeling them with base-ten blocks. They were able to identify the algorithm through the use of base-ten blocks and to understand the principle of calculations by connecting the manipulative activities to each stage of algorithm. The students were also able to determine whether the results of multiplication of decimal might be reasonable using base-ten blocks. This study suggests that appropriate use of base-ten blocks promotes the conceptual understanding of the multiplication of decimal.

  • PDF

A Parallel-Architecture Processor Design for the Fast Multiplication of Homogeneous Transformation Matrices (Homogeneous Transformation Matrix의 곱셈을 위한 병렬구조 프로세서의 설계)

  • Kwon Do-All;Chung Tae-Sang
    • The Transactions of the Korean Institute of Electrical Engineers D
    • /
    • v.54 no.12
    • /
    • pp.723-731
    • /
    • 2005
  • The $4{\times}4$ homogeneous transformation matrix is a compact representation of orientation and position of an object in robotics and computer graphics. A coordinate transformation is accomplished through the successive multiplications of homogeneous matrices, each of which represents the orientation and position of each corresponding link. Thus, for real time control applications in robotics or animation in computer graphics, the fast multiplication of homogeneous matrices is quite demanding. In this paper, a parallel-architecture vector processor is designed for this purpose. The processor has several key features. For the accuracy of computation for real application, the operands of the processors are floating point numbers based on the IEEE Standard 754. For the parallelism and reduction of hardware redundancy, the processor takes column vectors of homogeneous matrices as multiplication unit. To further improve the throughput, the processor structure and its control is based on a pipe-lined structure. Since the designed processor can be used as a special purpose coprocessor in robotics and computer graphics, additionally to special matrix/matrix or matrix/vector multiplication, several other useful instructions for various transformation algorithms are included for wide application of the new design. The suggested instruction set will serve as standard in future processor design for Robotics and Computer Graphics. The design is verified using FPGA implementation. Also a comparative performance improvement of the proposed design is studied compared to a uni-processor approach for possibilities of its real time application.

An Efficient Architecture for Modified Karatsuba-Ofman Algorithm (불필요한 연산이 없는 카라슈바 알고리즘과 하드웨어 구조)

  • Chang Nam-Su;Kim Chang-Han
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.43 no.3 s.345
    • /
    • pp.33-39
    • /
    • 2006
  • In this paper we propose the Modified Karatsuba-Ofman algorithm for polynomial multiplication to polynomials of arbitrary degree. Leone proposed optimal stop condition for iteration of Karatsuba-Ofman algorithm(KO). In this paper, we propose a Non-Redundant Karatsuba-Ofman algorithm (NRKOA) with removing redundancy operations, and design a parallel hardware architecture based on the proposed algorithm. Comparing with existing related Karatsuba architectures with the same time complexity, the proposed architecture reduces the area complexity. Furthermore, the space complexity of the proposed multiplier is reduced by 43% in the best case.

A Polynomial-Time Algorithm for Breaking the McEliece's Public-Key Cryptosystem (McEliece 공개키 암호체계의 암호해독을 위한 Polynomial-Time 알고리즘)

  • Park, Chang-Seop-
    • Proceedings of the Korea Institutes of Information Security and Cryptology Conference
    • /
    • 1991.11a
    • /
    • pp.40-48
    • /
    • 1991
  • McEliece 공개키 암호체계에 대한 새로운 암호해독적 공격이 제시되어진다. 기존의 암호해독 algorithm이 exponential-time의 complexity를 가지는 반면, 본고에서 제시되어지는 algorithm은 polynomial-time의 complexity를 가진다. 모든 linear codes에는 systematic generator matrix가 존재한다는 사실이 본 연구의 동기가 된다. Public generator matrix로부터, 암호해독에 사용되어질 수 있는 새로운 trapdoor generator matrix가 Gauss-Jordan Elimination의 역할을 하는 일련의 transformation matrix multiplication을 통해 도출되어진다. 제시되어지는 algorithm의 계산상의 complexity는 주로 systematic trapdoor generator matrix를 도출하기 위해 사용되는 binary matrix multiplication에 기인한다. Systematic generator matrix로부터 쉽게 도출되어지는 parity-check matrix를 통해서 인위적 오류의 수정을 위한 Decoding이 이루어진다.

  • PDF

A Study of the Modulus Multiplier Design for Speed up Throughput in the Public-key Cryptosystem (공개키 암호시스템의 처리속도향상을 위한 모듈러 승산기 설계에 관한 연구)

  • 이선근;김환용
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.40 no.4
    • /
    • pp.51-57
    • /
    • 2003
  • The development of the communication network and the other network method can generate serious social problems. So, it is highly required to control security of network. These problems related security will be developed and keep up to confront with anti-security field such as hacking, cracking. The way to preserve security from hacker or cracker without developing new cryptographic algorithm is keeping the state of anti-cryptanalysis in a prescribed time by means of extending key-length. In this paper, we proposed M3 algorithm for the reduced processing time in the montgomery multiplication part. Proposed M3 algorithm using the matrix function M(.) and lookup table perform optionally montgomery multiplication with repeated operation. In this result, modified repeated operation part produce 30% processing rate than existed montgomery multiplicator. The proposed montgomery multiplication structured unit array method in carry generated part and variable length multiplication for eliminating bottle neck effect with the RSA cryptosystem. Therefore, this proposed montgomery multiplier enforce the real time processing and prevent outer cracking.

Real-Tim Sound Field Effect Implementation Using Block Filtering and QFT (Block Filtering과 QFT를 이용한 실시간 음장 효과구현)

  • Sohn Sung-Yong;Seo Jeongil;Hahn Minsoo
    • MALSORI
    • /
    • no.51
    • /
    • pp.85-98
    • /
    • 2004
  • It is almost impossible to generate the sound field effect in real time with the time-domain linear convolution because of its large multiplication operation requirement. To solve this, three methods are introduced to reduce the number of multiplication operations in this paper. Firstly, the time-domain linear convolution is replaced with the frequency-domain circular convolution. In other words, the linear convolution result can be derived from that of the circular convolution. This technique reduces the number of multiplication operations remarkably, Secondly, a subframe concept is introduced, i.e., one original frame is divided into several subframes. Then the FFT is executed for each subframe and, as a result, the number of multiplication operations can be reduced. Finally, the QFT is used in stead of the FFT. By combining all the above three methods into our final the SFE generation algorithm, the number of computations are reduced sufficiently and the real-time SFE generation becomes possible with a general PC.

  • PDF