• Title/Summary/Keyword: 연산 지도

Search Result 3,998, Processing Time 0.03 seconds

An Implementation of a Convolutional Accelerator based on a GPGPU for a Deep Learning (Deep Learning을 위한 GPGPU 기반 Convolution 가속기 구현)

  • Jeon, Hee-Kyeong;Lee, Kwang-yeob;Kim, Chi-yong
    • Journal of IKEEE
    • /
    • v.20 no.3
    • /
    • pp.303-306
    • /
    • 2016
  • In this paper, we propose a method to accelerate convolutional neural network by utilizing a GPGPU. Convolutional neural network is a sort of the neural network learning features of images. Convolutional neural network is suitable for the image processing required to learn a lot of data such as images. The convolutional layer of the conventional CNN required a large number of multiplications and it is difficult to operate in the real-time on the embedded environment. In this paper, we reduce the number of multiplications through Winograd convolution operation and perform parallel processing of the convolution by utilizing SIMT-based GPGPU. The experiment was conducted using ModelSim and TestDrive, and the experimental results showed that the processing time was improved by about 17%, compared to the conventional convolution.

Low-complexity implementation of OFDMA timing delay detector with multiple receive antennas for broadband wireless access (광대역 무선 액세스를 위한 다중 수신안테나를 갖는 OFDMA 시스템의 낮은 복잡도의 타이밍 딜레이 추정기 구현)

  • Won, Hui-Chul
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.12 no.3
    • /
    • pp.19-30
    • /
    • 2007
  • In this paper, we propose low-complexity implementation of orthogonal frequency division multiple access (OFDMA) timing delay detector with multiple receive antennas for broadband wireless access (BWA). First, in order to reduce the computational complexity, the detection structure which rotates the phase of the received ranging symbols is introduced. Second, we propose the detection structure with the N-point/M-interval fast Fourier transform structure and a frequency-domain average-power estimator for complexity reduction without sacrificing the system performance. Finally, simulation results for the proposed structures and complexity comparison of the existing structure with the proposed detectors are presented.

  • PDF

Design and Implementation of 3DES crypto-algorithm with Pipeline Architecture (파이프라인 구조의 3DES 암호알고리즘의 설계 및 구현)

  • Lee Wan-Bok;Kim Jung-Tae
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.10 no.2
    • /
    • pp.333-337
    • /
    • 2006
  • Symmetric block ciper algorithm consists of a chains of operations such as permutation and substitution. There exists four kinds of operation mode, CBC, ECB, CFB, and OFB depending on the operation paradigm. Since the final ciper text is obtained through the many rounds of operations, it consumes much time. This paper proposes a pipelined design methodology which can improve the speed of crypto operations in ECB mode. Because the operations of the many rounds are concatenated in serial and executed concurrently, the overall computation time can be reduced significantly. The experimental result shows that the method can speed up the performance more than ten times.

Accuracy Improvement Method for 1-Bit Convolutional Neural Network (1-Bit 합성곱 신경망을 위한 정확도 향상 기법)

  • Im, Sung-Hoon;Lee, Jae-Heung
    • Journal of IKEEE
    • /
    • v.22 no.4
    • /
    • pp.1115-1122
    • /
    • 2018
  • In this paper, we analyze the performance degradation of previous 1-Bit convolutional neural network method and introduce ways to mitigate it. Previous work applies 32-Bit operation to first and last layers. But our method applies 32-Bit operation to second layer too. We also show that nonlinear activation function can be removed after binarizing inputs and weights. In order to verify the method proposed in this paper, we experiment the object detection neural network for korean license plate detection. Our method results in 96.1% accuracy, but the existing method results in 74% accuracy.

Color Correction with Optimized Hardware Implementation of CIE1931 Color Coordinate System Transformation (CIE1931 색좌표계 변환의 최적화된 하드웨어 구현을 통한 색상 보정)

  • Kim, Dae-Woon;Kang, Bong-Soon
    • Journal of IKEEE
    • /
    • v.25 no.1
    • /
    • pp.10-14
    • /
    • 2021
  • This paper presents a hardware that improves the complexity of the CIE1931 color coordinate algorithm operation. The conventional algorithm has disadvantage of growing hardware due to 4-Split Multiply operations used to calculate large bits in the computation process. But the proposed algorithm pre-calculates the defined R2X, X2R Matrix operations of the conventional algorithm and makes them a matrix. By applying the matrix to the images and improving the color, it is possible to reduce the amount of computation and hardware size. By comparing the results of Xilinx synthesis of hardware designed with Verilog, we can check the performance for real-time processing in 4K environments with reduced hardware resources. Furthermore, this paper validates the hardware mount behavior by presenting the execution results of the FPGA board.

A Scalar Multiplication Method and its Hardware with resistance to SPA(Simple Power Analysis) (SPA에 견디는 스칼라 곱셈 방법과 하드웨어)

  • 윤중철;정석원;임종인
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.13 no.3
    • /
    • pp.65-70
    • /
    • 2003
  • In this paper, we propose a scalar multiplication method and its hardware architecture which is resistant to SPA while its computation speed is faster than Colon's. There were SPA-resistant scalar multiplication method which has performance problem. Due to this reason, the research about an efficient SPA-resistant scalar multiplication is one of important topics. The proposed architecture resists to SPA and is faster than Colon's method under the assumption that Colon's and the proposed method use same fmite field arithmetic units(multiplier and inverter). With n-bit scalar multiple, the computation cycle of the proposed is 2n·(Inversion cycle)+3(Aultiplication cycle).

Efficient RSA Multisignature Scheme (효율적인 RSA 다중 서명 방식)

  • 박상준;박상우;원동호
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.7 no.2
    • /
    • pp.19-26
    • /
    • 1997
  • In this paper, we propose an RSA multisignature scheme with no bit expansion in which the signing order is not restricted. In this scheme we use RSA moduli with the same bit length. the most 1 bits of which are same. The proposed scheme is based on these RSA moduli and a repeated exponentiation of Levine and Brawley. Kiesler and Harn first utilize the repeated exponentiation technique in their multisignature scheme, which requires 1.5m exponentiations for signing, where m is the number of signers. However, the proposed scheme requires (equation omitted) m exponentiation. So if l is sufficiently large (l $\geq$ 32), then we can neglect the vaue (equation omitted

A fast scalar multiplication on elliptic curves (타원곡선에서 스칼라 곱의 고속연산)

  • 박영호;한동국;오상호;이상진;임종인;주학수
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.12 no.2
    • /
    • pp.3-10
    • /
    • 2002
  • For efficient implementation of scalar multiplication in Kobliz elliptic curves, Frobenius endomorphism is useful. Instead of binary expansion of scalar, using Frobenius expansion of scalar we can speed up scalar multiplication and so fast scalar multiplication is closely related to the expansion length of integral multipliers. In this paper we propose a new idea to reduce the length of Frobenius expansion of integral multipliers of scalar multiplication, which makes speed up scalar multiplication. By using the element whose norm is equal to a prime instead of that whose norm is equal to the order of a given elliptic curve we optimize the length of the Frobenius expansion. It can reduce more the length of the Frobenius expansion than that of Solinas, Smart.

Chosen Message Attack on the RSA-CRT Countermeasure Based on Fault Propagation Method (오류 확산 기법에 기반한 RSA-CRT 대응책에 대한선택 메시지 공격)

  • Baek, Yi-Roo;Ha, Jae-Cheol
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.20 no.3
    • /
    • pp.135-140
    • /
    • 2010
  • The computation using Chinese Remainder Theorem in RSA cryptosystem is well suited in the digital signature or decryption processing due to its low computational load compared to the case of general RSA without CRT. Since the RSA-CRT algorithm is vulnerable to many fault insertion attacks, some countermeasures against them were proposed. Among several countermeasures, Yen et al. proposed two schemes based on fault propagation method. Unfortunately, a new vulnerability was founded in FDTC 2006 conference. To improve the original schemes, Kim et al. recently proposed a new countermeasure in which they adopt the AND operation for fault propagation. In this paper, we show that the proposed scheme using AND operation without checking procedure is also vulnerable to fault insertion attack with chosen messages.

Efficient Implementation of Finite Field Operations in NIST PQC Rainbow (NIST PQC Rainbow의 효율적 유한체 연산 구현)

  • Kim, Gwang-Sik;Kim, Young-Sik
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.31 no.3
    • /
    • pp.527-532
    • /
    • 2021
  • In this paper, we propose an efficient finite field computation method for Rainbow algorithm, which is the only multivariate quadratic-equation based digital signature among the current US NIST PQC standardization Final List algorithms. Recently, Chou et al. proposed a new efficient implementation method for Rainbow on the Cortex-M4 environment. This paper proposes a new multiplication method over the finite field that can reduce the number of XOR operations by more than 13.7% compared to the Chou et al. method. In addition, a multiplicative inversion over that can be performed by a 4x4 matrix inverse instead of the table lookup method is presented. In addition, the performance is measured by porting the software to which the new method was applied onto RaspberryPI 3B+.