Search | Korea Science

A High Performance Modular Multiplier for ECC (타원곡선 암호를 위한 고성능 모듈러 곱셈기)

Choe, Jun-Yeong;Shin, Kyung-Wook
Journal of IKEEE / v.24 no.4 / pp.961-968 / 2020
This paper describes the design of a high-performance modular multiplier, an essential building block of elliptic curve cryptography. The multiplier supports modular multiplication over GF(p) for five field sizes (192, 224, 256, 384, and 521 bits) as defined in NIST FIPS 186-2, and computes each product in two steps: integer multiplication followed by reduction. The Karatsuba-Ofman algorithm is used for fast integer multiplication, and the Lazy reduction algorithm is adopted for the reduction step. The division required by Lazy reduction is carried out with the Nikhilam division algorithm. This division is performed only once for a given modulus and is skipped when consecutive modular multiplications use the same modulus. The multiplier is estimated to perform 6.4 million modular multiplications per second when operating at a clock frequency of 32 MHz. Synthesized with a 180-nm CMOS cell library, it occupies 456,400 gate equivalents (GEs), with an estimated clock frequency of 67 MHz.
https://doi.org/10.7471/ikeee.2020.24.4.961
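
The two-step structure in this abstract (integer multiplication, then reduction with a division that is done once per modulus) can be illustrated in software. The sketch below is only an illustration under stated assumptions: it uses Karatsuba recursion on Python integers, and it swaps in a Barrett-style reduction with a cached reciprocal as a stand-in for the paper's Lazy reduction and Nikhilam division, whose details the abstract does not give. The names `karatsuba` and `ModMultiplier` are hypothetical.

```python
def karatsuba(x: int, y: int, cutoff_bits: int = 64) -> int:
    """Multiply two non-negative integers with Karatsuba-Ofman recursion."""
    if x.bit_length() <= cutoff_bits or y.bit_length() <= cutoff_bits:
        return x * y  # base case: small operands use the native multiply
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask
    hi = karatsuba(x_hi, y_hi, cutoff_bits)
    lo = karatsuba(x_lo, y_lo, cutoff_bits)
    # Three recursive products instead of four.
    mid = karatsuba(x_hi + x_lo, y_hi + y_lo, cutoff_bits) - hi - lo
    return (hi << (2 * half)) + (mid << half) + lo


class ModMultiplier:
    """Modular multiplier that divides only when the modulus changes.

    Assumption: Barrett-style reduction stands in for the paper's
    Lazy reduction; operands must satisfy 0 <= a, b < p.
    """

    def __init__(self) -> None:
        self._modulus = None
        self._k = 0
        self._mu = 0
        self.divisions = 0  # counts how often the division actually runs

    def _prepare(self, p: int) -> None:
        if p != self._modulus:
            self._modulus = p
            self._k = p.bit_length()
            self._mu = (1 << (2 * self._k)) // p  # the only division
            self.divisions += 1

    def mulmod(self, a: int, b: int, p: int) -> int:
        self._prepare(p)
        t = karatsuba(a, b)
        q = (t * self._mu) >> (2 * self._k)  # estimate of t // p
        r = t - q * p
        while r >= p:  # the estimate is at most slightly too small
            r -= p
        return r
```

Calling `mulmod` repeatedly with the same `p` reuses the cached reciprocal, which mirrors the "skip division for the same modulus" behavior described above.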

The fast implementation of block cipher SIMON using pre-computation with counter mode of operation (블록암호 SIMON의 카운터 모드 사전 연산 고속 구현)

Kwon, Hyeok-Dong;Jang, Kyung-Bae;Kim, Hyun-Ji;Seo, Hwa-Jeong
Journal of the Korea Institute of Information and Communication Engineering / v.25 no.4 / pp.588-594 / 2021
SIMON, a lightweight block cipher developed by the US National Security Agency, is a family of block ciphers optimized for hardware implementation. It supports a variety of specifications so that it can operate in different environments. The counter mode of operation is one of the modes in which a block cipher can be run; it allows encryption of plaintext longer than a single block. Counter mode takes a constant (nonce) and a counter value as its input. Because the nonce is identical for all blocks, any operation that combines it only with other constant values always produces the same result. This makes it possible to skip some instructions of the round function through pre-computation. In general, the input value of SIMON is affected by the counter. In an 8-bit environment, however, the computation proceeds in 8-bit units, so part of it can still be pre-computed. In this paper, we focus on the part that can be pre-computed and compare our results with previous works.
https://doi.org/10.6109/jkiice.2021.25.4.588
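
The pre-computation idea can be sketched at word level. The example below assumes the nonce occupies the left word and the counter the right word of a SIMON32-style block (16-bit words); under that assumption the nonlinear part of the first round depends only on the nonce and the first round key, so it can be cached once per nonce. The round keys are arbitrary placeholders, not the output of the SIMON key schedule, and the paper's finer 8-bit technique is not reproduced here.

```python
WORD_BITS = 16
MASK = (1 << WORD_BITS) - 1

# Placeholder round keys for illustration only (NOT the SIMON key schedule).
ROUND_KEYS = [(0x1234 * (i + 1)) & MASK for i in range(32)]


def rotl(x: int, r: int) -> int:
    """Rotate a 16-bit word left by r bits."""
    return ((x << r) | (x >> (WORD_BITS - r))) & MASK


def f(x: int) -> int:
    """SIMON round function: (x <<< 1 & x <<< 8) ^ (x <<< 2)."""
    return (rotl(x, 1) & rotl(x, 8)) ^ rotl(x, 2)


def encrypt_block(x: int, y: int, round_keys) -> tuple:
    """Reference Feistel encryption of the block (x, y)."""
    for k in round_keys:
        x, y = y ^ f(x) ^ k, x
    return x, y


def precompute_first_round(nonce: int, round_keys) -> int:
    """Part of round 1 that is the same for every counter value."""
    return f(nonce) ^ round_keys[0]


def encrypt_counter_block(nonce, counter, cached, round_keys) -> tuple:
    """Encrypt (nonce, counter) with round 1 reduced to a single XOR."""
    x, y = counter ^ cached, nonce
    for k in round_keys[1:]:
        x, y = y ^ f(x) ^ k, x
    return x, y


def ctr_keystream(nonce: int, n_blocks: int, round_keys) -> list:
    cached = precompute_first_round(nonce, round_keys)
    return [encrypt_counter_block(nonce, c, cached, round_keys)
            for c in range(n_blocks)]
```

The cached value is computed once per nonce, so each additional block saves the rotations and the AND of the first round.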

Efficient Finite Field Arithmetic Architectures for Pairing Based Cryptosystems (페어링 기반 암호시스템의 효율적인 유한체 연산기)

Chang, Nam-Su;Kim, Tae-Hyun;Kim, Chang-Han;Han, Dong-Guk;Kim, Ho-Won
Journal of the Korea Institute of Information Security & Cryptology / v.18 no.3 / pp.33-44 / 2008
The efficiency of pairing-based cryptosystems depends on the computation of pairings. For efficiency, pairings are defined over finite fields GF$(3^m)$ constructed with trinomials. Hardware architectures for pairings have been widely studied. This paper proposes a new adder and multiplier for GF(3) that are more efficient than previous results. Furthermore, it proposes a new unified adder-subtractor for GF$(3^m)$ based on the proposed adder and multiplier. Finally, it proposes a new multiplier for GF$(3^m)$. Compared with previous LSB-first multipliers, the proposed MSB-first bit-serial multiplier for GF$(p^m)$ reduces the time delay by approximately 30% and halves the register size. The proposed multiplier can be applied to all finite fields defined by trinomials.
https://doi.org/10.13089/JKIISC.2008.18.3.33
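
As a software model of what an MSB-first serial multiplier over GF$(3^m)$ computes, the sketch below processes one coefficient of the second operand per step, highest degree first. It is an assumption-laden illustration, not the paper's hardware architecture: the field GF$(3^3)$ and the trinomial $x^3 + 2x + 1$ are chosen here for the example, and elements are plain coefficient lists (lowest degree first).

```python
M = 3
# Low-order coefficients of the example trinomial f(x) = x^3 + 2x + 1,
# lowest degree first; the leading coefficient of x^M is 1.
F_LOW = [1, 2, 0]


def poly_add(a, b):
    """Coefficient-wise addition in GF(3)."""
    return [(x + y) % 3 for x, y in zip(a, b)]


def poly_sub(a, b):
    """Coefficient-wise subtraction in GF(3)."""
    return [(x - y) % 3 for x, y in zip(a, b)]


def times_x(a):
    """Multiply by x and reduce modulo f(x), using x^M = -F_LOW."""
    carry = a[-1]  # coefficient that would move into x^M
    shifted = [0] + a[:-1]
    return [(s - carry * c) % 3 for s, c in zip(shifted, F_LOW)]


def msb_first_mul(a, b):
    """Serial multiplication: consume b from its highest coefficient down."""
    acc = [0] * M
    for b_i in reversed(b):
        acc = times_x(acc)  # shift the running sum up by one degree
        acc = [(c + b_i * a_j) % 3 for c, a_j in zip(acc, a)]
    return acc
```

For instance, `msb_first_mul([0, 1, 0], [0, 0, 1])` multiplies $x$ by $x^2$ and reduces $x^3$ to $x + 2$.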

Design of an Efficient AES-ARIA Processor using Resource Sharing Technique (자원 공유기법을 이용한 AES-ARIA 연산기의 효율적인 설계)

Koo, Bon-Seok;Ryu, Gwon-Ho;Chang, Tae-Joo;Lee, Sang-Jin
Journal of the Korea Institute of Information Security & Cryptology / v.18 no.6A / pp.39-49 / 2008
AES and ARIA are the next-generation standard block ciphers of the US and Korea, respectively, and are used in various fields, including smart cards and electronic passports. This paper presents the first efficient unified hardware architecture for AES and ARIA and reports implementation results with a 0.25-$\mu$m CMOS library. We designed shared S-boxes based on composite field arithmetic for both algorithms and extracted the common terms of their permutation matrices. In 0.25-$\mu$m CMOS technology, our processor occupies 19,056 gates, 32% smaller than discrete implementations, and takes 11 clock cycles for AES encryption and 16 clock cycles for ARIA encryption, giving throughputs of 720 and 1,047 Mbps, respectively.
https://doi.org/10.13089/JKIISC.2008.18.6A.39
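
S-box sharing is plausible because S-boxes of this kind are built from inversion in GF$(2^8)$ followed by an affine transform, so the inversion is the natural candidate for reuse. The sketch below computes the AES S-box that way. It works directly in GF$(2^8)$ with the AES polynomial rather than in the composite field used by the paper, and it does not model ARIA's affine transforms; the function names are hypothetical.

```python
def gf256_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo the AES polynomial x^8+x^4+x^3+x+1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B  # reduce when the degree reaches 8
    return result


def gf256_inv(a: int) -> int:
    """Multiplicative inverse as a^254; 0 maps to 0 by convention."""
    result, base, exponent = 1, a, 254
    while exponent:
        if exponent & 1:
            result = gf256_mul(result, base)
        base = gf256_mul(base, base)
        exponent >>= 1
    return result


def rotl8(x: int, r: int) -> int:
    """Rotate a byte left by r bits."""
    return ((x << r) | (x >> (8 - r))) & 0xFF


def aes_sbox(x: int) -> int:
    """AES S-box: field inversion followed by the AES affine transform."""
    b = gf256_inv(x)
    return b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63
```

In a shared datapath, only the affine stage after `gf256_inv` would need to be selectable per cipher, which is the kind of saving the abstract reports.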

Elliptic Curves for Efficient Repeated Additions (효율적인 반복 연산을 위한 타원 곡선)

Lee, Eun-Jeong;Choie, Young-Ju
Journal of the Korea Institute of Information Security & Cryptology / v.5 no.1 / pp.17-24 / 1995
Although a cryptosystem on an elliptic curve defined over a finite field offers good security, it is slower than a cryptosystem on a finite field. To be practical, elliptic curve cryptosystems over finite fields need a faster method. In 1991, Koblitz suggested using an anomalous curve over $F_2$, an elliptic curve whose Frobenius map has trace 1, and thereby sped up the computation of mP. In this paper, we consider an elliptic curve defined over $F_4$ whose Frobenius map has trace 3 and suggest an efficient algorithm to compute mP. On the proposed elliptic curve, multiples mP can be computed with $\frac{3}{2}\log_2 m + 1$ additions in the worst case.
https://doi.org/10.13089/JKIISC.1995.5.1.17

Efficient Thread Allocation Method of Convolutional Neural Network based on GPGPU (GPGPU 기반 Convolutional Neural Network의 효율적인 스레드 할당 기법)

Kim, Mincheol;Lee, Kwangyeob
Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.10 / pp.935-943 / 2017
CNNs (convolutional neural networks), which are used for image classification and speech recognition among neural networks that learn from positive data, have been continuously developed toward higher-performance structures. However, they are difficult to use in embedded systems with limited resources. To address this, we use pre-trained weights together with GPGPU (General-Purpose Computing on Graphics Processing Units), which applies the GPU to general-purpose computation, but limitations remain. Since a CNN performs simple, iterative operations, the computation speed varies greatly depending on how threads are allocated and utilized in a Single Instruction Multiple Thread (SIMT) based GPGPU. When convolution and pooling operations are performed with threads, some threads are left idle. The proposed method increases operation speed by using these remaining threads for the subsequent feature map and kernel calculations.
https://doi.org/10.14257/ajmahs.2017.10.70

Fast Geometric Transformations of 3D Images Represented by an Octree (8진트리로 표현된 3차원 영상의 빠른 기학학적 변환)

Heo, Yeong-Nam;Park, Seung-Jin;Kim, Eung-Gon
The Transactions of the Korea Information Processing Society / v.2 no.6 / pp.831-838 / 1995
Geometric transformations require many operations when displaying moving 3D objects on the screen, so fast computation is an important problem in CAD and animation applications. The general method for computing the transformed coordinates of an object represented by an octree must perform the operations on every node. To compute geometric transformations of 3D images represented by an octree quickly, this paper proposes an efficient method that uses basis vectors to convert the rectangular coordinates of the vertices of the octree nodes into coordinates in the universe space. The coordinates of the vertices of each octant are computed with the formula presented here, which requires additions and multiplications by powers of 2. The method has a very fast execution time and is compared with the general computation method.
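
One way to read the "additions and multiplications by powers of 2" claim is the following sketch, which is an interpretation rather than the paper's formula. It assumes an affine transform with an integer matrix `A` and translation `t`, and a root cube whose edge length is a power of 2. The root's origin and edge vectors are transformed once; every child octant is then derived from its parent with halving (a shift) and additions only. All names are hypothetical.

```python
def transform_point(A, t, p):
    """Direct affine transform A*p + t, used here only for the root."""
    return tuple(sum(A[r][c] * p[c] for c in range(3)) + t[r]
                 for r in range(3))


def root_frame(A, t, origin, size):
    """Transformed origin and the three transformed edge vectors of the root."""
    origin_t = transform_point(A, t, origin)
    edges_t = [tuple(A[r][axis] * size for r in range(3))
               for axis in range(3)]
    return origin_t, edges_t


def child_frames(origin_t, edges_t):
    """Frames of the 8 children using only shifts and additions.

    Bit `axis` of the octant index selects the upper half along that axis.
    """
    half = [tuple(c >> 1 for c in e) for e in edges_t]  # divide by 2
    frames = []
    for octant in range(8):
        o = origin_t
        for axis in range(3):
            if (octant >> axis) & 1:
                o = tuple(a + b for a, b in zip(o, half[axis]))
        frames.append((o, half))
    return frames
```

Applying `child_frames` recursively visits the whole tree without any further matrix multiplication, which is where the saving over the per-node general method would come from.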

Time Series Pattern Recognition based on Branch and Bound Dynamic Time Warping (분기 한정적인 동적 타임 워핑 기반의 시계열 패턴인식)

Jang, Seok-Woo;Park, Young-Jae;Kim, Gye-Young
Journal of KIISE:Software and Applications / v.37 no.7 / pp.584-589 / 2010
The dynamic time warping algorithm generally used in time series pattern recognition spends most of its time generating the correlation table, and it imposes a global path constraint to reduce the corresponding time complexity. However, this constraint applies only along the time axis and does not consider the content of the input patterns. In this paper, we therefore propose an efficient branch and bound dynamic time warping algorithm that sets the global constraints by adaptively reflecting the patterns. The experimental results show that the proposed method outperforms conventional methods in terms of speed and accuracy.
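
A generic form of bounding in dynamic time warping can be sketched as follows. This is plain early abandoning against the best cost found so far, offered as an illustration of the branch and bound idea; it is not the adaptive, pattern-dependent constraint proposed in the paper. The names `dtw_distance` and `nearest_template` are hypothetical.

```python
import math


def dtw_distance(a, b, best_so_far=math.inf):
    """DTW cost between sequences a and b with absolute-difference cost.

    Returns math.inf as soon as the result cannot beat best_so_far.
    """
    prev = [math.inf] * (len(b) + 1)
    prev[0] = 0.0
    for x in a:
        curr = [math.inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            curr[j] = abs(x - y) + min(prev[j], prev[j - 1], curr[j - 1])
        # Every warping path crosses this row and costs never decrease,
        # so the row minimum is a lower bound on the final cost.
        if min(curr) >= best_so_far:
            return math.inf
        prev = curr
    return prev[len(b)]


def nearest_template(query, templates):
    """Return (label, cost) of the template closest to the query."""
    best_label, best_cost = None, math.inf
    for label, seq in templates.items():
        cost = dtw_distance(query, seq, best_cost)
        if cost < best_cost:
            best_label, best_cost = label, cost
    return best_label, best_cost
```

Once one good match is found, poorly matching templates are abandoned after a few rows instead of filling the whole table.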

Secure Multiplication Method against Side Channel Attack on ARM Cortex-M3 (ARM Cortex-M3 상에서 부채널 공격에 강인한 곱셈 연산 구현)

Seo, Hwajeong
Journal of the Korea Institute of Information Security & Cryptology / v.27 no.4 / pp.943-949 / 2017
Cryptographic implementations on lightweight Internet of Things (IoT) devices need to provide accurate and fast execution for high service availability. However, adversaries can extract secret information from a lightweight device by analyzing the unique features of its computation. In particular, modern ARM Cortex-M3 processors perform multiplication with execution timings that vary with the input values. In this paper, we analyze previous multiplication methods on the ARM Cortex-M3 and provide optimized techniques to accelerate performance. The proposed method improves performance by up to 28.4% over previous works.
https://doi.org/10.13089/JKIISC.2017.27.4.943

Operation Rearrangement for Low-Power VLIW Instruction Fetches (저전력 VLIW 명령어 추출을 위한 연산재배치 기법)

Sin, Dong-Gun;Kim, Ji-Hong
Journal of KIISE:Computer Systems and Theory / v.28 no.10 / pp.530-540 / 2001
As mobile applications are required to handle more computing-intensive tasks, many mobile devices are designed using VLIW processors for high performance. In VLIW machines, where a single instruction contains multiple operations, the power consumption during instruction fetches varies significantly depending on how the operations are arranged within the instruction. In this paper, we describe a post-pass optimal operation rearrangement method for low-power VLIW instruction fetch. The proposed method modifies the placement order of operations within VLIW instructions so that the switching activity between successive instruction fetches is minimized. Our experiments show that the switching activity can be reduced by 34% on average for benchmark programs.
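
The quantity being minimized can be made concrete with a small model. The sketch below measures switching activity as the Hamming distance between the bit patterns of successive instructions and reorders operations greedily, one instruction at a time. It assumes that any operation may occupy any slot, which real VLIW machines generally do not allow, and the greedy pass is a simplification rather than the optimal method described in the paper. All names are hypothetical.

```python
from itertools import permutations


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two operation encodings differ."""
    return bin(a ^ b).count("1")


def instruction_distance(prev, curr) -> int:
    """Bits that toggle when `curr` is fetched right after `prev`."""
    return sum(hamming(x, y) for x, y in zip(prev, curr))


def switching(program) -> int:
    """Total switching activity over a sequence of instructions."""
    return sum(instruction_distance(p, c)
               for p, c in zip(program, program[1:]))


def rearrange(program):
    """Greedy pass: order each instruction's operations to match the
    previously placed instruction as closely as possible."""
    placed = [tuple(program[0])]
    for instr in program[1:]:
        prev = placed[-1]
        best = min(permutations(instr),
                   key=lambda cand: instruction_distance(prev, cand))
        placed.append(best)
    return placed
```

Each instruction keeps the same set of operations; only their slot order changes, so the program's meaning is preserved under the stated slot-freedom assumption.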
