• Title/Summary/Keyword: Processor-sharing


An Efficient Hardware Implementation of Block Cipher Algorithm LEA (블록암호 알고리듬 LEA의 효율적인 하드웨어 구현)

  • Sung, Mi-ji;Park, Jang-nyeong;Shin, Kyung-wook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.10a / pp.777-779 / 2014
  • LEA (Lightweight Encryption Algorithm) is a 128-bit high-speed, lightweight block cipher developed by the National Security Research Institute (NSRI) in 2012. LEA encrypts a 128-bit plaintext block with a 128-, 192-, or 256-bit cipher key to produce a 128-bit ciphertext block, and decrypts in the reverse direction. To reduce hardware complexity, we propose an efficient architecture in which the round transformation block shares hardware resources between encryption and decryption. A hardware-sharing technique was also devised for the key scheduler to achieve an area-efficient, low-power implementation. The designed LEA cryptographic processor was verified by FPGA implementation.
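A minimal sketch of why one round block can serve both directions. It assumes the commonly described LEA round structure (XOR with round keys, 32-bit modular addition, rotations by 9, 5, and 3); that structure is not stated in the abstract and is not checked here against official test vectors. The round keys are arbitrary placeholders rather than output of the real key schedule, and the function names are hypothetical. The point is only that decryption reuses the same rotate, XOR, and add/subtract operations in mirrored order.

```python
MASK = 0xFFFFFFFF  # all words are 32 bits wide

def rol(x, r):
    """Rotate a 32-bit word left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK

def ror(x, r):
    """Rotate a 32-bit word right by r bits."""
    return ((x >> r) | (x << (32 - r))) & MASK

def enc_round(x, rk):
    """One encryption round on a 4-word state (assumed LEA-style structure)."""
    x0, x1, x2, x3 = x
    y0 = rol(((x0 ^ rk[0]) + (x1 ^ rk[1])) & MASK, 9)
    y1 = ror(((x1 ^ rk[2]) + (x2 ^ rk[3])) & MASK, 5)
    y2 = ror(((x2 ^ rk[4]) + (x3 ^ rk[5])) & MASK, 3)
    return (y0, y1, y2, x0)

def dec_round(y, rk):
    """Inverse round: opposite rotations, subtraction instead of addition."""
    y0, y1, y2, y3 = y
    x0 = y3
    x1 = ((ror(y0, 9) - (x0 ^ rk[0])) & MASK) ^ rk[1]
    x2 = ((rol(y1, 5) - (x1 ^ rk[2])) & MASK) ^ rk[3]
    x3 = ((rol(y2, 3) - (x2 ^ rk[4])) & MASK) ^ rk[5]
    return (x0, x1, x2, x3)

# Placeholder state and round keys, chosen only for illustration.
state = (0x01234567, 0x89ABCDEF, 0xDEADBEEF, 0x0BADF00D)
round_keys = (0x11111111, 0x22222222, 0x33333333,
              0x44444444, 0x55555555, 0x66666666)
recovered = dec_round(enc_round(state, round_keys), round_keys)
```

Because both directions use the same primitives, a hardware round block can plausibly switch between them with multiplexers and an adder/subtractor instead of duplicating the datapath.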


Design and Reliability of Fault-Tolerant Computer Systems (故障許容電算體系의 設計와 信賴度)

  • 조정완
    • Communications of the Korean Institute of Information Scientists and Engineers / v.1 no.1 / pp.42-49 / 1983
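  • The reliability of a computer can be described as a measure of how trustworthy its results are for the input a user submits. It can be viewed as a measure of how well each component of the computer, and each instruction of a program, maintains the performance intended at design time. Reliability can be expressed as the lifetime of the computer, the probability that the computer is operational when needed, or the performance of the computer. In computers up to the second generation, the electronics industry and computer technology were not sufficiently developed, and constraints on cost and machine size left almost no room for measures to improve reliability. Developing real-time system applications of the kind seen today, such as American Airlines' SABRE (Semi Automatic Business Research Environment), Bell Telephone Laboratories' ESS-I, II, III (Electronic Switching System), and IBM's FMS (Future Manufacturing System), was therefore a very difficult problem. Rapid advances in the electronics industry then made the design of the current generation of general-purpose computers possible, and advances in operating systems spurred active development of applications such as multiprogramming, time-sharing, and real-time systems. This application development, together with the development of large-scale integration (LSI), the falling cost of ROM (Read Only Memory), and the spread of microprogramming, led to small computers for special-purpose time-sharing operation. In place of conventional general-purpose computers, very large machines also appeared, such as CDC's STAR 100, which uses a string unit and a pipeline, and the ILLIAC-IV, built from the University of Illinois's 256 processors and a Burroughs B6500.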

Optimization of Pipelined Discrete Wavelet Packet Transform Based on an Efficient Transpose Form and an Advanced Functional Sharing Technique

  • Nguyen, Hung-Ngoc;Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of Information Processing Systems / v.15 no.2 / pp.374-385 / 2019
  • This paper presents an optimized implementation of a Daubechies-based pipelined discrete wavelet packet transform (DWPT) processor using finite impulse response (FIR) filter banks. A feed-forward pipelined (FFP) architecture is used to implement the DWPT on a field-programmable gate array (FPGA). The proposed DWPT is based on an efficient transpose form structure, which halves its computational complexity. The efficiency of the design is further improved by using canonical-signed digit-based binary expression (CSDBE) and advanced functional sharing (AFS) methods. The AFS technique is proposed to optimize the convolution of the FIR filter banks for DWPT decomposition, and it reduces hardware resource utilization by not requiring any embedded digital signal processing (DSP) blocks. The proposed AFS and CSDBE-based DWPT system is embedded on a Virtex-7 FPGA board for testing. The design is implemented as an intellectual property (IP) logic core that can easily be integrated into DSP systems for sub-band analysis. The results show that the proposed method improves hardware resource utilization while maintaining the accuracy of the DWPT result.
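As a point of reference for the transpose form mentioned above, here is a minimal sketch of a generic transposed-form FIR filter checked against direct convolution. It is the textbook structure, not the paper's optimized DWPT datapath; the function names and the example coefficients are hypothetical.

```python
def fir_direct(h, x):
    """Direct-form FIR: y[n] = sum_k h[k] * x[n - k]."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]

def fir_transposed(h, x):
    """Transposed-form FIR.

    Every tap multiplies the same incoming sample, and the products are
    accumulated through a chain of registers, one register between taps.
    """
    regs = [0] * (len(h) - 1)  # delay registers between adders
    out = []
    for sample in x:
        products = [c * sample for c in h]
        # The output uses the register value stored on the previous cycle.
        out.append(products[0] + (regs[0] if regs else 0))
        # Ascending order reads regs[k + 1] before it is overwritten.
        for k in range(len(regs)):
            carry = regs[k + 1] if k + 1 < len(regs) else 0
            regs[k] = products[k + 1] + carry
    return out

coeffs = [1, 2, 3]          # illustrative integer taps, not Daubechies values
signal = [1, 0, 0, 2, -1]
y_direct = fir_direct(coeffs, signal)
y_transposed = fir_transposed(coeffs, signal)
```

In the transposed form each input sample is broadcast to all multipliers at once, which is what makes constant-multiplier sharing across taps attractive in hardware.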

A New Arithmetic Unit Over GF($2^m$) for Low-Area Elliptic Curve Cryptographic Processor (저 면적 타원곡선 암호프로세서를 위한 GF($2^m$)상의 새로운 산술 연산기)

  • 김창훈;권순학;홍춘표
    • The Journal of Korean Institute of Communications and Information Sciences / v.28 no.7A / pp.547-556 / 2003
  • This paper proposes a novel arithmetic unit over GF($2^m$) for a low-area elliptic curve cryptographic processor. The proposed arithmetic unit, which has a linear feedback shift register (LFSR) architecture, is designed by sharing hardware between the binary GCD algorithm and the most significant bit (MSB)-first multiplication scheme, and it can perform both division and multiplication in GF($2^m$). In other words, the proposed architecture produces division results at a rate of one per $2m-1$ clock cycles in division mode and multiplication results at a rate of one per $m$ clock cycles in multiplication mode. Analysis shows that, for division, the proposed architecture has a shorter computational delay than previously proposed dividers, with a reduced transistor count. In addition, since the proposed arithmetic unit does not restrict the choice of irreducible polynomial and is regular and modular, it provides high flexibility and scalability with respect to the field size $m$. The proposed architecture can therefore be used as both the division and the multiplication circuit of an elliptic curve cryptographic processor. It is especially well suited to low-area applications such as smart cards and hand-held devices.
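The MSB-first multiplication scheme named above can be sketched in software as follows. This is the generic bit-serial algorithm, not the paper's shared LFSR circuit; the function name is hypothetical, and the GF($2^8$) polynomial $x^8+x^4+x^3+x+1$ (0x11B) is used only as a convenient example.

```python
def gf2m_mul_msb_first(a, b, poly, m):
    """Multiply a and b in GF(2^m), scanning b from its most significant bit.

    a and b are field elements encoded as integers below 2**m.
    poly is the irreducible polynomial including its x^m term.
    The loop body runs exactly m times, one iteration per bit of b.
    """
    result = 0
    for i in range(m - 1, -1, -1):
        result <<= 1                 # multiply the partial result by x
        if result & (1 << m):        # degree reached m: reduce modulo poly
            result ^= poly
        if (b >> i) & 1:             # add a (XOR) when the current bit is set
            result ^= a
    return result

# Example in GF(2^8) with the polynomial x^8 + x^4 + x^3 + x + 1.
product = gf2m_mul_msb_first(0x57, 0x83, 0x11B, 8)
```

The $m$ loop iterations correspond loosely to the one-result-per-$m$-clock-cycles rate quoted for multiplication mode, assuming one bit of `b` is consumed per cycle.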

A Public-key Cryptography Processor supporting P-224 ECC and 2048-bit RSA (P-224 ECC와 2048-비트 RSA를 지원하는 공개키 암호 프로세서)

  • Sung, Byung-Yoon;Lee, Sang-Hyun;Shin, Kyung-Wook
    • Journal of IKEEE / v.22 no.3 / pp.522-531 / 2018
  • A public-key cryptography processor, EC-RSA, was designed that integrates 224-bit prime field elliptic curve cryptography (ECC) as defined in FIPS 186-2 and RSA with a 2048-bit key length into a single hardware structure. A finite field arithmetic core, used for both scalar multiplication in ECC and exponentiation in RSA, was designed with a 32-bit data-path. A lightweight implementation was achieved by efficiently sharing the finite field arithmetic core and internal memory between ECC and RSA operations. The EC-RSA processor was verified by FPGA implementation. Synthesized with a 180-nm CMOS cell library, it occupied 11,779 gate equivalents (GEs) and 14 kbit of RAM, and the estimated maximum clock frequency was 133 MHz. ECC scalar multiplication takes 867,746 clock cycles, for an estimated throughput of 34.3 kbps, and RSA decryption takes 26,149,013 clock cycles, for an estimated throughput of 10.4 kbps.
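The quoted throughput figures can be reproduced from the cycle counts with a short calculation. The sketch assumes that throughput means operand bits per operation (224 for ECC, 2048 for RSA) delivered at the 133 MHz maximum clock; the abstract does not state this definition explicitly, and the function name is hypothetical.

```python
def throughput_bps(bits_per_op, cycles_per_op, clock_hz):
    """Bits processed per second when one operation takes cycles_per_op cycles."""
    return bits_per_op * clock_hz / cycles_per_op

CLOCK_HZ = 133e6  # estimated maximum clock frequency from the abstract

ecc_bps = throughput_bps(224, 867_746, CLOCK_HZ)       # P-224 scalar multiplication
rsa_bps = throughput_bps(2048, 26_149_013, CLOCK_HZ)   # 2048-bit RSA decryption

print(f"ECC: {ecc_bps / 1e3:.1f} kbps")
print(f"RSA: {rsa_bps / 1e3:.1f} kbps")
```

Under that assumption the results round to 34.3 kbps and 10.4 kbps, matching the abstract.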

A GA-Based Adaptive Task Redistribution Method for Intelligent Distributed Computing (지능형 분산컴퓨팅을 위한 유전알고리즘 기반의 적응적 부하재분배 방법)

  • 이동우;이성훈;황종선
    • Journal of KIISE:Software and Applications / v.31 no.10 / pp.1345-1355 / 2004
  • In a sender-initiated load redistribution algorithm, a sender (an overloaded processor) keeps sending unnecessary request messages for load transfer until a receiver (an underloaded processor) is found, which happens when the system load is heavy. In a receiver-initiated load redistribution algorithm, a receiver keeps sending unnecessary request messages for load acquisition until a sender is found, which happens when the system load is light. In both cases, inefficient inter-processor communication causes problems such as low CPU utilization and low system throughput. This paper presents an approach based on a genetic algorithm (GA) for adaptive load sharing in distributed systems. In this scheme, the processors to which requests are sent are determined by the proposed GA in order to reduce unnecessary request messages.

Design of Shared Memory Controller Device Driver in Embedded System (임베디드 시스템에서의 공유 메모리 컨트롤러 디바이스 드라이버 설계)

  • Moon, Ji-Hoon;Oh, Jae-Chul
    • The Journal of the Korea institute of electronic communication sciences / v.9 no.6 / pp.703-709 / 2014
  • In an AMP (Asymmetric Multiprocessing) dual-core system, where each core of a single processor runs its own operating system, shared memory is used to pass data between the cores. Using shared memory across different operating systems requires solving message communication and synchronization between the two operating systems. In this paper, a separate memory controller is used for data sharing between the processor cores in a dual-core environment. The controller provides two slave ports to allow simultaneous access from the two processors, and when both processors access data at the same time, the priority of the slave ports is determined by a memory mediator. For sending data from processor A to processor B, the SRAM area is logically divided into 8 pages. With 4 KByte per page, the memory area can be used by multiple processes, and a 4-byte control register indicates whether the current page is usable.
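A minimal software model of the paging scheme described above: 8 pages of 4 KByte each, with a 4-byte control register tracking page usability. The bit layout (bit i set means page i is in use) and all class and method names are assumptions made for illustration; the abstract does not specify the register format or the driver interface.

```python
PAGE_SIZE = 4 * 1024   # 4 KByte per page, from the abstract
NUM_PAGES = 8          # SRAM logically divided into 8 pages

class SharedSramModel:
    """Toy model of the shared SRAM and its control register."""

    def __init__(self):
        # Models the 4-byte control register.
        # Assumed layout: bit i == 1 means page i is currently in use.
        self.control = 0
        self.pages = [bytearray(PAGE_SIZE) for _ in range(NUM_PAGES)]

    def claim_page(self):
        """Return the index of a free page and mark it used, or None if full."""
        for i in range(NUM_PAGES):
            if not self.control & (1 << i):
                self.control |= 1 << i
                return i
        return None

    def release_page(self, index):
        """Mark a page as free again, keeping the register within 32 bits."""
        self.control &= ~(1 << index) & 0xFFFFFFFF

    def write(self, index, data):
        """Copy data into a claimed page; data must fit in one page."""
        if len(data) > PAGE_SIZE:
            raise ValueError("data exceeds page size")
        self.pages[index][:len(data)] = data

sram = SharedSramModel()
page = sram.claim_page()             # processor A claims a page
sram.write(page, b"hello from A")    # and fills it for processor B
```

In a real driver the claim step would need an atomic test-and-set or the controller's arbitration, since both cores can reach the register at once; the model above ignores concurrency.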

Design of an Efficient AES-ARIA Processor using Resource Sharing Technique (자원 공유기법을 이용한 AES-ARIA 연산기의 효율적인 설계)

  • Koo, Bon-Seok;Ryu, Gwon-Ho;Chang, Tae-Joo;Lee, Sang-Jin
    • Journal of the Korea Institute of Information Security & Cryptology / v.18 no.6A / pp.39-49 / 2008
  • AES and ARIA are the next-generation standard block ciphers of the US and Korea, respectively, and they are used in various fields including smart cards and electronic passports. This paper presents the first efficient unified hardware architecture for AES and ARIA and reports implementation results with a 0.25-$\mu$m CMOS library. We designed shared S-boxes based on composite field arithmetic for both algorithms and extracted the common terms of their permutation matrices. With 0.25-$\mu$m CMOS technology, our processor occupies 19,056 gates, 32% smaller than discrete implementations, and it takes 11 and 16 clock cycles for AES and ARIA encryption, achieving 720 and 1,047 Mbps, respectively.

A Crypto-processor Supporting Multiple Block Cipher Algorithms (다중 블록 암호 알고리듬을 지원하는 암호 프로세서)

  • Cho, Wook-Lae;Kim, Ki-Bbeum;Bae, Gi-Chur;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.11 / pp.2093-2099 / 2016
  • This paper describes the design of a crypto-processor that supports the block cipher algorithms PRESENT, ARIA, and AES. The crypto-processor integrates three cores: PRmo (PRESENT with mode of operation), AR_AS (ARIA_AES), and AES-16b. The PRmo core implements the 64-bit block cipher PRESENT, supports 80-bit and 128-bit keys, and provides four modes of operation: ECB, CBC, OFB, and CTR. The AR_AS core supports 128-bit and 256-bit keys and integrates the two 128-bit block ciphers ARIA and AES into a single data-path using a resource sharing technique. The AES-16b core supports 128-bit keys and implements AES with a reduced 16-bit data-path to minimize hardware. Each crypto-core contains its own on-the-fly key scheduler, so consecutive blocks of plaintext or ciphertext can be processed without reloading the key. The crypto-processor was verified by FPGA implementation. Implemented with a 0.18-$\mu$m CMOS cell library, it occupies 54,500 gate equivalents (GEs) and can operate at a 55 MHz clock frequency.

Low-area FFT Processor Structure using Radix-$4^2$ Algorithm (Radix-$4^2$ 알고리즘을 사용한 저면적 FFT 프로세서 구조)

  • Kim, Han-Jin;Jang, Young-Beom
    • Journal of the Institute of Electronics Engineers of Korea SD / v.49 no.3 / pp.8-14 / 2012
  • In this paper, a low-area FFT structure using the Radix-$4^2$ algorithm is proposed. A large-point FFT structure consists of many cascaded stages. When a large-point FFT is implemented with the Radix-$4^2$ algorithm, a stage with only 3 distinct coefficients appears every 2 stages. For example, in the 4096-point FFT, stages 1, 3, and 5 of the 6 stages have only 3 distinct coefficients. The multiplication block area of these 3 stages can be reduced using CSD (Canonic Signed Digit) and common sub-expression sharing techniques. Using the proposed structure, a 256-point FFT was coded in Verilog-HDL and synthesized to a cell area of 1.971 $mm^2$ in a TSMC 0.18-$\mu$m CMOS library. This is a 23% cell area reduction compared with the conventional structure.
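To illustrate the CSD technique mentioned above, the following sketch converts a positive integer constant into canonic signed digits (each digit is -1, 0, or +1, with no two adjacent nonzero digits) and uses them for a shift-and-add multiplication. It is a generic illustration with hypothetical function names; the actual FFT coefficients and the paper's sub-expression sharing are not modelled.

```python
def to_csd(n):
    """Return the CSD digits of a non-negative integer, least significant first."""
    digits = []
    while n != 0:
        if n & 1:
            # n % 4 == 1 gives +1, n % 4 == 3 gives -1, so the next bit becomes 0.
            d = 2 - (n & 3)
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits

def csd_multiply(x, digits):
    """Multiply x by the constant encoded in digits using only shifts and adds."""
    total = 0
    for shift, d in enumerate(digits):
        if d:
            total += d * (x << shift)   # one adder or subtractor per nonzero digit
    return total

constant = 23                      # binary 10111 has four nonzero bits
csd = to_csd(constant)             # CSD needs only three nonzero digits
result = csd_multiply(5, csd)
```

Fewer nonzero digits means fewer adders in a constant multiplier, which is why CSD coding helps in stages that use only a handful of fixed coefficients.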