• Title/Summary/Keyword: Hardware Resources

Search Result 442, Processing Time 0.027 seconds

An Efficient Hardware Implementation of Lightweight Block Cipher LEA-128/192/256 for IoT Security Applications (IoT 보안 응용을 위한 경량 블록암호 LEA-128/192/256의 효율적인 하드웨어 구현)

  • Sung, Mi-Ji;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.19 no.7
    • /
    • pp.1608-1616
    • /
    • 2015
  • This paper describes an efficient hardware implementation of lightweight encryption algorithm LEA-128/192/256 which supports for three master key lengths of 128/192/256-bit. To achieve area-efficient and low-power implementation of LEA crypto- processor, the key scheduler block is optimized to share hardware resources for encryption/decryption key scheduling of three master key lengths. In addition, a parallel register structure and novel operating scheme for key scheduler is devised to reduce clock cycles required for key scheduling, which results in an increase of encryption/decryption speed by 20~30%. The designed LEA crypto-processor has been verified by FPGA implementation. The estimated performances according to master key lengths of 128/192/256-bit are 181/162/109 Mbps, respectively, at 113 MHz clock frequency.

Efficient Radix-4 Systolic VLSI Architecture for RSA Public-key Cryptosystem (RSA 공개키 암호화시스템의 효율적인 Radix-4 시스톨릭 VLSI 구조)

  • Park Tae geun
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.12C
    • /
    • pp.1739-1747
    • /
    • 2004
  • In this paper, an efficient radix-4 systolic VLSI architecture for RSA public-key cryptosystem is proposed. Due to the simple operation of iterations and the efficient systolic mapping, the proposed architecture computes an n-bit modular exponentiation in n$^{2}$ clock cycles since two modular multiplications for M$_{i}$ and P$_{i}$ in each exponentiation process are interleaved, so that the hardware is fully utilized. We encode the exponent using Radix-4. SD (Signed Digit) number system to reduce the number of modular multiplications for RSA cryptography. Therefore about 20% of NZ (non-zero) digits in the exponent are reduced. Compared to conventional approaches, the proposed architecture shows shorter period to complete the RSA while requiring relatively less hardware resources. The proposed RSA architecture based on the modified Montgomery algorithm has locality, regularity, and scalability suitable for VLSI implementation.

The Design of a High-Performance RC4 Cipher Hardware using Clusters (클러스터를 이용한 고성능 RC4 암호화 하드웨어 설계)

  • Lee, Kyu-Hee
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.23 no.7
    • /
    • pp.875-880
    • /
    • 2019
  • A RC4 stream cipher is widely used for security applications such as IEEE 802.11 WEP, IEEE 802.11i TKIP and so on, because it can be simply implemented to dedicated circuits and achieve a high-speed encryption. RC4 is also used for systems with limited resources like IoT, but there are performance limitations. RC4 consists of two stages, KSA and PRGA. KSA performs initialization and randomization of S-box and K-box and PRGA produces cipher texts using the randomized S-box. In this paper, we initialize the S-box and K-box in the randomization of the KSA stage to reduce the initialization delay. In the randomization, we use clusters to process swap operation between elements of S-box in parallel and can generate two cipher texts per clock. The proposed RC4 cipher hardware can initialize S-box and K-box without any delay and achieves about 2 times to 6 times improvement in KSA randomization and key stream generation.

A 521-bit high-performance modular multiplier using 3-way Toom-Cook multiplication and fast reduction algorithm (3-way Toom-Cook 곱셈과 고속 축약 알고리듬을 이용한 521-비트 고성능 모듈러 곱셈기)

  • Yang, Hyeon-Jun;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.12
    • /
    • pp.1882-1889
    • /
    • 2021
  • This paper describes a high-performance hardware implementation of modular multiplication used as a core operation in elliptic curve cryptography. A 521-bit high-performance modular multiplier for NIST P-521 curve was designed by adopting 3-way Toom-Cook integer multiplication and fast reduction algorithm. Considering the property of the 3-way Toom-Cook algorithm in which the result of integer multiplication is multiplied by 1/3, modular multiplication was implemented on the Toom-Cook domain where the operands were multiplied by 3. The modular multiplier was implemented in the xczu7ev FPGA device to verify its hardware operation, and hardware resources of 69,958 LUTs, 4,991 flip-flops, and 101 DSP blocks were used. The maximum operating frequency on the Zynq7 FPGA device was 50 MHz, and it was estimated that about 4.16 million modular multiplications per second could be achieved.

A High-Performance ECC Processor Supporting NIST P-521 Elliptic Curve (NIST P-521 타원곡선을 지원하는 고성능 ECC 프로세서)

  • Yang, Hyeon-Jun;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.4
    • /
    • pp.548-555
    • /
    • 2022
  • This paper describes the hardware implementation of elliptic curve cryptography (ECC) used as a core operation in elliptic curve digital signature algorithm (ECDSA). The ECC processor supports eight operation modes (four point operations, four modular operations) on the NIST P-521 curve. In order to minimize computation complexity required for point scalar multiplication (PSM), the radix-4 Booth encoding scheme and modified Jacobian coordinate system were adopted, which was based on the complexity analysis for five PSM algorithms and four different coordinate systems. Modular multiplication was implemented using a modified 3-Way Toom-Cook multiplication and a modified fast reduction algorithm. The ECC processor was implemented on xczu7ev FPGA device to verify hardware operation. Hardware resources of 101,921 LUTs, 18,357 flip-flops and 101 DSP blocks were used, and it was evaluated that about 370 PSM operations per second were achieved at a maximum operation clock frequency of 45 MHz.

Optimized and Portable FPGA-Based Systolic Cell Architecture for Smith-Waterman-Based DNA Sequence Alignment

  • Shah, Hurmat Ali;Hasan, Laiq;Koo, Insoo
    • Journal of information and communication convergence engineering
    • /
    • v.14 no.1
    • /
    • pp.26-34
    • /
    • 2016
  • The alignment of DNA sequences is one of the important processes in the field of bioinformatics. The Smith-Waterman algorithm (SWA) performs optimally for aligning sequences but is computationally expensive. Field programmable gate array (FPGA) performs the best on parameters such as cost, speed-up, and ease of re-configurability to implement SWA. The performance of FPGA-based SWA is dependent on efficient cell-basic implementation-unit design. In this paper, we present an optimized systolic cell design while avoiding oversimplification, very large-scale integration (VLSI)-level design, and direct mapping of iterative equations such as previous cell designs. The proposed design makes efficient use of hardware resources and provides portability as the proposed design is not based on gate-level details. Our cell design implementing a linear gap penalty resulted in a performance improvement of 32× over a GPP platform and surpassed the hardware utilization of another implementation by a factor of 4.23.

Hardware and Software Implementation of a GPS Receiver Test Bed Running from PC (PC 기반 GPS 수신기 하드웨어 모듈 및 펌웨어 개발)

  • Long, Nguyen Phi;Hieu, Nguyen Hoang;Lee, Sang-Hoon;Park, Ok-Deuk;Kim, Hyun-Su;Kim, Han-Sil
    • Proceedings of the KIEE Conference
    • /
    • 2006.10c
    • /
    • pp.394-396
    • /
    • 2006
  • When developing a new GPS receiver module, the essential problems are evaluation of reliable algorithms, software debugging, and performance comparison between algorithms to find optimal solution. Most GPS receiver modules nowadays use a correlator to track signals from satellites and an MCU (Micro Controller Unit) to control operations of the entire module. The problem of software evaluation from MCU is very difficult, due to limitation of MCU resources and low ability of interfacing with user. Normally, user has to expense special tool kit for a limiting access to MCU but it is also hard to use. This article introduces an implementation of a GPS receiver test bed using correlator GP2021 interfacing with ISA (Industry Standard Architecture) PC bus. This way can give user complete control and visibility into the operation of the receiver, then user can easily debug program and test algorithms. For this article, the least square method is implemented to test the hardware and software performance.

  • PDF

A Design of Crypto-processor for Lightweight Block Cipher LEA (경량 블록암호 LEA용 암호/복호 프로세서 설계)

  • Sung, Mi-ji;Shin, Kyung-wook
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2015.05a
    • /
    • pp.401-403
    • /
    • 2015
  • This paper describes an efficient hardware design of 128-bit block cipher algorithm LEA(lightweight encryption algorithm). In order to achieve area-efficient and low-power implementation, round block and key scheduler block are optimized to share hardware resources for encryption and decryption. The key scheduler register is modified to reduce clock cycles required for key scheduling, which results in improved encryption/decryption performance. FPGA synthesis results of the LEA processor show that it has 2,364 slices, and the estimated performance for the master key of 128/192/256-bit at 113 MHz clock frequency is about 181/162/109 Mbps, respectively.

  • PDF

Comparison of Artificial Neural Networks for Low-Power ECG-Classification System

  • Rana, Amrita;Kim, Kyung Ki
    • Journal of Sensor Science and Technology
    • /
    • v.29 no.1
    • /
    • pp.19-26
    • /
    • 2020
  • Electrocardiogram (ECG) classification has become an essential task of modern day wearable devices, and can be used to detect cardiovascular diseases. State-of-the-art Artificial Intelligence (AI)-based ECG classifiers have been designed using various artificial neural networks (ANNs). Despite their high accuracy, ANNs require significant computational resources and power. Herein, three different ANNs have been compared: multilayer perceptron (MLP), convolutional neural network (CNN), and spiking neural network (SNN) only for the ECG classification. The ANN model has been developed in Python and Theano, trained on a central processing unit (CPU) platform, and deployed on a PYNQ-Z2 FPGA board to validate the model using a Jupyter notebook. Meanwhile, the hardware accelerator is designed with Overlay, which is a hardware library on PYNQ. For classification, the MIT-BIH dataset obtained from the Physionet library is used. The resulting ANN system can accurately classify four ECG types: normal, atrial premature contraction, left bundle branch block, and premature ventricular contraction. The performance of the ECG classifier models is evaluated based on accuracy and power. Among the three AI algorithms, the SNN requires the lowest power consumption of 0.226 W on-chip, followed by MLP (1.677 W), and CNN (2.266 W). However, the highest accuracy is achieved by the CNN (95%), followed by MLP (76%) and SNN (90%).

Middleware services for structural health monitoring using smart sensors

  • Nagayama, T.;Spencer, B.F. Jr.;Mechitov, K.A.;Agha, G.A.
    • Smart Structures and Systems
    • /
    • v.5 no.2
    • /
    • pp.119-137
    • /
    • 2009
  • Smart sensors densely distributed over structures can use their computational and wireless communication capabilities to provide rich information for structural health monitoring (SHM). Though smart sensor technology has seen substantial advances during recent years, implementation of smart sensors on full-scale structures has been limited. Hardware resources available on smart sensors restrict data acquisition capabilities; intrinsic to these wireless systems are packet loss, data synchronization errors, and relatively slow communication speeds. This paper addresses these issues under the hardware limitation by developing corresponding middleware services. The reliable communication service requires only a few acknowledgement packets to compensate for packet loss. The synchronized sensing service employs a resampling approach leaving the need for strict control of sensing timing. The data aggregation service makes use of application specific knowledge and distributed computing to suppress data transfer requirements. These middleware services are implemented on the Imote2 smart sensor platform, and their efficacy demonstrated experimentally.