• Title/Summary/Keyword: Standard cell library

Search Result 196, Processing Time 0.066 seconds

VLSI Design of H.264/AVC CAVLC encoder for HDTV Application (실시간 HD급 영상 처리를 위한 H.264/AVC CAVLC 부호화기의 하드웨어 구조 설계)

  • Woo, Jang-Uk;Lee, Won-Jae;Kim, Jae-Seok
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.44 no.7 s.361
    • /
    • pp.45-53
    • /
    • 2007
  • In this paper, we propose an efficient hardware architecture for H.264/AVC CAVLC (Context-based Adaptive Variable Length Coding) encoding. Previous CAVLC architectures search all of the coefficients to find statistic characteristics in a block. However, it is unnecessary information that zero coefficients following the last position of a non-zero coefficient when CAVLC encodes residual coefficients. In order to reduce this unnecessary operation, we propose two techniques, which detect the first and last position of non-zero coefficients and arrange non-zero coefficients sequentially. By adopting these two techniques, the required processing time was reduced about 23% compared with previous architecture. It was designed in a hardware description language and total logic gate count is 16.3k using 0.18um standard cell library Simulation results show that our design is capable of real-time processing for $1920{\times}1088\;30fps$ videos at 81MHz.

Full Data-rate Viterbi Decoder for DAB Receiver (최대 데이터율을 지원하는 DAB 수신기용 Viterbi 디코더의 설계)

  • 김효원;구오석;류주현;윤대희
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.27 no.6C
    • /
    • pp.601-609
    • /
    • 2002
  • The efficient Viterbi decoder that supports full data-rate output of DAB system was proposed. Viterbi decoder consumes lots of computational load and should be designed to be fast specific hardware. In this paper, SST scheme was adopted for Viterbi decoder with puncturing to reduced the power consumption. Puncturing vector tables are modified and re-arranged to be designed by a hardwired logic to save the system area. New re-scaling scheme which uses the fact that the difference of the maximum and minimum of the path metric values is bounded is proposed. The proposed re-scaling scheme optimizes the wordlength of path metric memory and greatly reduces the computational load for re-scaling by controlling MSB of path metric memory. Another saving of computation is done by proposed algorithm for branch metric calculation, which makes use of pre-calculated metric values. The designed Viterbi decoder was synthesized using SAMSUNG 0.35$\mu$ standard cell library and occupied small area and showed lower power consumption.

VLIS Design of OCB-AES Cryptographic Processor (OCB-AES 암호 프로세서의 VLSI 설계)

  • Choi Byeong-Yoon;Lee Jong-Hyoung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.9 no.8
    • /
    • pp.1741-1748
    • /
    • 2005
  • In this paper, we describe VLSI design and performance evaluation of OCB-AES crytographic algorithm that simulataneously provides privacy and authenticity. The OCB-AES crytographic algorithm sovles the problems such as long operation time and large hardware of conventional crytographic system, because the conventional system must implement the privancy and authenticity sequentially with seqarated algorithms and hardware. The OCB-AES processor with area-efficient modular offset generator and tag generator is designed using IDEC Samsung 0.35um standard cell library and consists of about 55,700 gates. Its cipher rate is about 930Mbps and the number of clock cycles needed to generate the 128-bit tags for authenticity and integrity is (m+2)${\times}$(Nr+1), where m and Nr represent the number of block for message and number of rounds for AES encryption, respectively. The OCB-AES processor can be applicable to soft cryptographic IP of IEEE 802.11i wireless LAN and Mobile SoC.

Design of High-performance Pedestrian and Vehicle Detection Circuit using Haar-like Features (Haar-like 특징을 이용한 고성능 보행자 및 차량 인식 회로 설계)

  • Kim, Soo-Jin;Park, Sang-Kyun;Lee, Seon-Young;Cho, Kyeong-Soon
    • The KIPS Transactions:PartA
    • /
    • v.19A no.4
    • /
    • pp.175-180
    • /
    • 2012
  • This paper describes the design of high-performance pedestrian and vehicle detection circuit using the Haar-like features. The proposed circuit uses a sliding window for every image frame in order to extract Haar-like features and to detect pedestrians and vehicles. A total of 200 Haar-like features per sliding window is extracted from Haar-like feature extraction circuit and the extracted features are provided to AdaBoost classifier circuit. In order to increase the processing speed, the proposed circuit adopts the parallel architecture and it can process two sliding windows at the same time. We described the proposed high-performance pedestrian and vehicle detection circuit using Verilog HDL and synthesized the gate-level circuit using the 130nm standard cell library. The synthesized circuit consists of 1,388,260 gates and its maximum operating frequency is 203MHz. Since the proposed circuit processes about 47.8 $640{\times}480$ image frames per second, it can be used to provide the real-time detection of pedestrians and vehicles.

Low Area Hardware Design of Efficient SAO for HEVC Encoder (HEVC 부호기를 위한 효율적인 SAO의 저면적 하드웨어 설계)

  • Cho, Hyunpyo;Ryoo, Kwangki
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.19 no.1
    • /
    • pp.169-177
    • /
    • 2015
  • This paper proposes a hardware architecture for an efficient SAO(Sample Adaptive Offset) with low area for HEVC(High Efficiency Video Coding) encoder. SAO is a newly adopted technique in HEVC as part of the in-loop filter. SAO reduces mean sample distortion by adding offsets to reconstructed samples. The existing SAO requires a great deal of computational and processing time for UHD(Ultra High Definition) video due to sample by sample processing. To reduce SAO processing time, the proposed SAO hardware architecture processes four samples simultaneously, and is implemented with a 2-step pipelined architecture. In addition, to reduce hardware area, it has a single architecture for both luma and chroma components and also uses optimized and common operators. The proposed SAO hardware architecture is designed using Verilog HDL(Hardware Description Language), and has a total of 190k gates in TSMC $0.13{\mu}m$ CMOS standard cell library. At 200MHz, it can support 4K UHD video encoding at 60fps in real time, but operates at a maximum of 250MHz.

A Lightweight Hardware Implementation of ECC Processor Supporting NIST Elliptic Curves over GF(2m) (GF(2m) 상의 NIST 타원곡선을 지원하는 ECC 프로세서의 경량 하드웨어 구현)

  • Lee, Sang-Hyun;Shin, Kyung-Wook
    • Journal of IKEEE
    • /
    • v.23 no.1
    • /
    • pp.58-67
    • /
    • 2019
  • A design of an elliptic curve cryptography (ECC) processor that supports both pseudo-random curves and Koblitz curves over $GF(2^m)$ defined by the NIST standard is described in this paper. A finite field arithmetic circuit based on a word-based Montgomery multiplier was designed to support five key lengths using a datapath of fixed size, as well as to achieve a lightweight hardware implementation. In addition, Lopez-Dahab's coordinate system was adopted to remove the finite field division operation. The ECC processor was implemented in the FPGA verification platform and the hardware operation was verified by Elliptic Curve Diffie-Hellman (ECDH) key exchange protocol operation. The ECC processor that was synthesized with a 180-nm CMOS cell library occupied 10,674 gate equivalents (GEs) and a dual-port RAM of 9 kbits, and the maximum clock frequency was estimated at 154 MHz. The scalar multiplication operation over the 223-bit pseudo-random elliptic curve takes 1,112,221 clock cycles and has a throughput of 32.3 kbps.

Montgomery Multiplier Supporting Dual-Field Modular Multiplication (듀얼 필드 모듈러 곱셈을 지원하는 몽고메리 곱셈기)

  • Kim, Dong-Seong;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.24 no.6
    • /
    • pp.736-743
    • /
    • 2020
  • Modular multiplication is one of the most important arithmetic operations in public-key cryptography such as elliptic curve cryptography (ECC) and RSA, and the performance of modular multiplier is a key factor influencing the performance of public-key cryptographic hardware. An efficient hardware implementation of word-based Montgomery modular multiplication algorithm is described in this paper. Our modular multiplier was designed to support eleven field sizes for prime field GF(p) and binary field GF(2k) as defined by SEC2 standard for ECC, making it suitable for lightweight hardware implementations of ECC processors. The proposed architecture employs pipeline scheme between the partial product generation and addition operation and the modular reduction operation to reduce the clock cycles required to compute modular multiplication by 50%. The hardware operation of our modular multiplier was demonstrated by FPGA verification. When synthesized with a 65-nm CMOS cell library, it was realized with 33,635 gate equivalents, and the maximum operating clock frequency was estimated at 147 MHz.

A Self-Timed Ring based Lightweight TRNG with Feedback Structure (피드백 구조를 갖는 Self-Timed Ring 기반의 경량 TRNG)

  • Choe, Jun-Yeong;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.24 no.2
    • /
    • pp.268-275
    • /
    • 2020
  • A lightweight hardware design of self-timed ring based true random number generator (TRNG) suitable for information security applications is described. To reduce hardware complexity of TRNG, an entropy extractor with feedback structure was proposed, which minimizes the number of ring stages. The number of ring stages of the FSTR-TRNG was determined to be a multiple of eleven, taking into account operating clock frequency and entropy extraction circuit, and the ratio of tokens to bubbles was determined to operate in evenly-spaced mode. The hardware operation of FSTR-TRNG was verified by FPGA implementation. A set of statistical randomness tests defined by NIST 800-22 were performed by extracting 20 million bits of binary sequences generated by FSTR-TRNG, and all of the fifteen test items were found to meet the criteria. The FSTR-TRNG occupied 46 slices of Spartan-6 FPGA device, and it was implemented with about 2,500 gate equivalents (GEs) when synthesized in 180 nm CMOS standard cell library.

Design of high-speed RSA processor based on radix-4 Montgomery multiplier (래딕스-4 몽고메리 곱셈기 기반의 고속 RSA 연산기 설계)

  • Koo, Bon-Seok;Ryu, Gwon-Ho;Chang, Tae-Joo;Lee, Sang-Jin
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.17 no.6
    • /
    • pp.29-39
    • /
    • 2007
  • RSA is one of the most popular public-key crypto-system in various applications. This paper addresses a high-speed RSA crypto-processor with modified radix-4 modular multiplication algorithm and Chinese Remainder Theorem(CRT) using Carry Save Adder(CSA). Our design takes 0.84M clock cycles for a 1024-bit modular exponentiation and 0.25M cycles for a 512-bit exponentiations. With 0.18um standard cell library, the processor achieves 365Kbps for a 1024-bit exponentiation and 1,233Kbps for two 512-bit exponentiations at a 300MHz clock rate.

Design of AES Cryptographic Processor with Modular Round Key Generator (모듈화된 라운드 키 생성회로를 갖는 AES 암호 프로세서의 설계)

  • 최병윤;박영수;전성익
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.12 no.5
    • /
    • pp.15-25
    • /
    • 2002
  • In this paper a design of high performance cryptographic processor which implements AES Rijndael algorithm is described. To eliminate performance degradation due to round-key computation delay of conventional processor, the on-the-fly precomputation of round key based on modified round structure is adopted. And on-the-fly round key generator which supports 128, 192, and 256-bit key has modular structure. The designed processor has iterative structure which uses 1 clock cycle per round and supports three operation modes, such as ECB, CBC, and CTR mode which is a candidate for new AES modes of operation. The cryptographic processor designed in Verilog-HDL and synthesized using 0.251$\mu\textrm{m}$ CMOS cell library consists of about 51,000 gates. Simulation results show that the critical path delay is about 7.5ns and it can operate up to 125Mhz clock frequency at 2.5V supply. Its peak performance is about 1.45Gbps encryption or decryption rate under 128-bit key ECB mode.