• Title/Summary/Keyword: Gate-Cycle

Search Result 155, Processing Time 0.025 seconds

Design of an Asynchronous Data Cache with FIFO Buffer for Write Back Mode (Write Back 모드용 FIFO 버퍼 기능을 갖는 비동기식 데이터 캐시)

  • Park, Jong-Min;Kim, Seok-Man;Oh, Myeong-Hoon;Cho, Kyoung-Rok
    • The Journal of the Korea Contents Association
    • /
    • v.10 no.6
    • /
    • pp.72-79
    • /
    • 2010
  • In this paper, we propose the data cache architecture with a write buffer for a 32bit asynchronous embedded processor. The data cache consists of CAM and data memory. It accelerates data up lood cycle between the processor and the main memory that improves processor performance. The proposed data cache has 8 KB cache memory. The cache uses the 4-way set associative mapping with line size of 4 words (16 bytes) and pseudo LRU replacement algorithm for data replacement in the memory. Dirty register and write buffer is used for write policy of the cache. The designed data cache is synthesized to a gate level design using $0.13-{\mu}m$ process. Its average hit rate is 94%. And the system performance has been improved by 46.53%. The proposed data cache with write buffer is very suitable for a 32-bit asynchronous processor.

Formation of Nickel Silicide from Atomic Layer Deposited Ni film with Ti Capping layer

  • Yun, Sang-Won;Lee, U-Yeong;Yang, Chung-Mo;Na, Gyeong-Il;Jo, Hyeon-Ik;Ha, Jong-Bong;Seo, Hwa-Il;Lee, Jeong-Hui
    • Proceedings of the Korean Society Of Semiconductor Equipment Technology
    • /
    • 2007.06a
    • /
    • pp.193-198
    • /
    • 2007
  • The NiSi is very promising candidate for the metallization in 60nm CMOS process such as FUSI(fully silicided) gate and source/drain contact because it exhibits non-size dependent resistance, low silicon consumption and mid-gap workfunction. Ni film was first deposited by using ALD (atomic layer deposition) technique with Bis-Ni precursor and $H_2$ reactant gas at $220^{\circ}C$ with deposition rate of $1.25{\AA}/cycle$. The as-deposited Ni film exhibited a sheet resistance of $5{\Omega}/{\square}$. RTP (repaid thermal process) was then performed by varying temperature from $400^{\circ}C$ to $900^{\circ}C$ in $N_2$ ambient for the formation of NiSi. The process window temperature for the formation of low-resistance NiSi was estimated from $600^{\circ}C$ to $800^{\circ}C$ and from $700^{\circ}C$ to $800^{\circ}C$ with and without Ti capping layer. The respective sheet resistance of the films was changed to $2.5{\Omega}/{\square}$ and $3{\Omega}/{\square}$ after silicidation. This is because Ti capping layer increases reaction between Ni and Si and suppresses the oxidation and impurity incorporation into Ni film during silicidation process. The NiSi films were treated by additional thermal stress in a resistively heated furnace for test of thermal stability, showing that the film heat-treated at $800^{\circ}C$ was more stable than that at $700^{\circ}C$ due to better crystallinity.

  • PDF

Physical Properties of the Al2O3 Thin Films Deposited by Atomic Layer Deposition (ALD법으로 제조된 Al2O3 박막의 물리적 특성)

  • Kim, Jae-Bum;Kwon, Duk-Ryel;Oh, Ki-Young;Lee, Chong-Mu
    • Korean Journal of Materials Research
    • /
    • v.12 no.6
    • /
    • pp.493-498
    • /
    • 2002
  • $Al_2O_3$ is a promising gate dielectric because of its high dielectric constant, high resistivity and low leakage current. Since $OH^-$ radical in $Al_2O_3$ films deposited by ALD using TMA and $H_2O$ degrades the good properties of $Al_2O_3$, TMA and $O_3$ were used to deposite $Al_2O_3$ films and the effects of $O_3$ on the properties of the $Al_2O_3$ films were investigated. The growth rate of the $Al_2O_3$ film under the optimum condition was 0.85 $\AA$/cycle. According to the XPS analysis results the $OH^-$ concentration in the $Al_2O_3$ film deposited using $O_3$ is lower than that using $H_2O$. RBS analysis results indicate the chemical formula of the film is $Al_{2.2}O_{2.8}$. The carbon concentration in the film detected by AES is under 1 at%. SEM observation confirms that the step coverage of the $Al_2O_3$ film deposited by ALD using $O_3$ is nearly 100%.

Design of Cryptic Circuit for Passive RFID Tag (수동형 RFID 태그에 적합한 암호 회로의 설계)

  • Lim, Young-Il;Cho, Kyoung-Rok;You, Young-Gap
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.44 no.1
    • /
    • pp.8-15
    • /
    • 2007
  • This paper proposed hardware architecture of the block cryptographic algorithm HIGHT aiming small size and low power application, and analyzed its performance. The HIGHT is a modified algorithm of the Feistel. The encryption and decryption circuit were designed as one iterative block. It reduces the redundant circuit that yields small area. For the performance improvement, the circuit generates 32-bit subkey during 1 clock cycle. we synthesized the HIGHT with Hynix $0.25-{\mu}m$ CMOS technology. The proposed circuit size was 2.658 EG(equivalent gate), and its power consumption was $10.88{\mu}W$ at 2.5V for 100kHz. It is useful for a passive RFID tag or a smart IC card of a small size and low power.

Design of High Speed Binary Arithmetic Encoder for CABAC Encoder (CABAC 부호화기를 위한 고속 이진 산술 부호화기의 설계)

  • Park, Seungyong;Jo, Hyungu;Ryoo, Kwangki
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.21 no.4
    • /
    • pp.774-780
    • /
    • 2017
  • This paper proposes an efficient binary arithmetic encoder hardware architecture for CABAC encoding, which is an entropy coding method of HEVC. CABAC is an entropy coding method that is used in HEVC standard. Entropy coding removes statistical redundancy and supports a high compression ratio of images. However, the binary arithmetic encoder causes a delay in real time processing and parallel processing is difficult because of the high dependency between data. The operation of the proposed CABAC BAE hardware structure is to separate the renormalization and process the conventional iterative algorithm in parallel. The new scheme was designed as a four-stage pipeline structure that can reduce critical path optimally. The proposed CABAC BAE hardware architecture was designed with Verilog HDL and implemented in 65nm technology. Its gate count is 8.07K and maximum operating speed of 769MHz. It processes the four bin per clock cycle. Maximum processing speed increased by 26% from existing hardware architectures.

Design of an Efficient Binary Arithmetic Encoder for H.264/AVC (H.264/AVC를 위한 효율적인 이진 산술 부호화기 설계)

  • Moon, Jeon-Hak;Kim, Yoon-Sup;Lee, Seong-Soo
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.46 no.12
    • /
    • pp.66-72
    • /
    • 2009
  • This paper proposes an efficient binary arithmetic encoder for CABAC which is used one of the entropy coding methods for H.264/AVC. The present binary arithmetic encoding algorithm requires huge complexity of operation and data dependency of each step, which is difficult to be operated in fast. Therefore, renormalization exploits 2-stage pipeline architecture for efficient process of operation, which reduces huge complexity of operation and data dependency. Context model updater is implemented by using a simple expression instead of transIdxMPS table and merging transIdxLPS and rangeTabLPS tables, which decreases hardware size. Arithmetic calculator consists of regular mode, bypass mode and termination mode for appearance probability of binary value. It can operate in maximum speed. The proposed binary arithmetic encoder has 7282 gate counts in 0.18um standard cell library. And input symbol per cycle is about 1.

VLSI Design of an Improved Structure of a $GF(2^m)$ Divider (확장성에 유리한 병렬 알고리즘 방식에 기반한 $GF(2^m)$나눗셈기의 VLSI 설계)

  • Moon San-Gook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.9 no.3
    • /
    • pp.633-637
    • /
    • 2005
  • In this contribution, we developed and improved an existing GF (Galois field) dividing algorithm by suggesting a novel architecture for a finite field divider, which is frequently required for the error correction applications and the security-related applications such as the Reed-Solomon code, elliptic curve encryption/ decryption, is proposed. We utilized the VHDL language to verify the design methodology, and implemented the architecture on an FPGA chip. We suggested the n-bit lookup table method to obtain the throughput of 2m/n cycles, where m is the order of the division polynomial and n is the number of the most significant lookup-bits. By doing this, we extracted the advantages in achieving both high-throughput and less cost of the gate areaon the chip. A pilot FPGA chip was implemented with the case of m=4, n=2. We successfully utilized the Altera's EP20K30ETC144-1 to exhibit the maximum operating clock frequency of 77 MHz.

Hardware Design and Implementation of Joint Viterbi Detection and Decoding Algorithm for Bluetooth Low Energy Systems (블루투스 저전력 시스템을 위한 저복잡도 결합 비터비 검출 및 복호 알고리즘의 하드웨어 설계 및 구현)

  • Park, Chul-hyun;Jung, Yongchul;Jung, Yunho
    • Journal of IKEEE
    • /
    • v.24 no.3
    • /
    • pp.838-844
    • /
    • 2020
  • In this paper, we propose an efficient Viterbi processor using Joint Viterbi detection and decoding (JVDD) algorithm for a for bluetooth low energy (BLE) system. Since the convolutional coded Gaussian minimum-shift keying (GMSK) signal is specified in the BLE 5.0 standard, two Viterbi processors are needed for detection and decoding. However, the proposed JVDD scheme uses only one Viterbi processor by modifying the branch metric with inter-symbol interference information from GMSK modulation; therefore, the hardware complexity can be significantly reduced without performance degradation. Low-latency and low-complexity hardware architecture for the proposed JVDD algorithm was proposed, which makes Viterbi decoding completed within one clock cycle. Viterbi Processor RTL synthesis results on a GF55nm process show that the gate count is 12K and the memory unit and the initial latency is reduced by 33% compared to the modified state exchange (MSE).

A Scalable Word-based RSA Cryptoprocessor with PCI Interface Using Pseudo Carry Look-ahead Adder (가상 캐리 예측 덧셈기와 PCI 인터페이스를 갖는 분할형 워드 기반 RSA 암호 칩의 설계)

  • Gwon, Taek-Won;Choe, Jun-Rim
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.39 no.8
    • /
    • pp.34-41
    • /
    • 2002
  • This paper describes a scalable implementation method of a word-based RSA cryptoprocessor using pseudo carry look-ahead adder The basic organization of the modular multiplier consists of two layers of carry-save adders (CSA) and a reduced carry generation and Propagation scheme called the pseudo carry look-ahead adder for the high-speed final addition. The proposed modular multiplier does not need complicated shift and alignment blocks to generate the next word at each clock cycle. Therefore, the proposed architecture reduces the hardware resources and speeds up the modular computation. We implemented a single-chip 1024-bit RSA cryptoprocessor based on the word-based modular multiplier with 256 datapaths in 0.5${\mu}{\textrm}{m}$ SOG technology after verifying the proposed architectures using FPGA with PCI bus.

Design of Viterbi Decoders Using a Modified Register Exchange Method (변형된 레지스터 교환 방식의 비터비 디코더 설계)

  • 이찬호;노승효
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.40 no.1
    • /
    • pp.36-44
    • /
    • 2003
  • This paper proposes a Viterbi decoding scheme without trace-back operations to reduce the amount of memory storing the survivor path information, and to increase the decoding speed. The proposed decoding scheme is a modified register exchange scheme, and is verified by a simulation to give the same results as those of the conventional decoders. It is compared with the conventional decoding schemes such as the trace-back and the register exchange scheme. The memory size of the proposed scheme is reduced to 1/(5 x constraint length) of that of the register exchange scheme, and the throughput is doubled compared with that of the trace-back scheme. A decoder with a code rate of 2/3, a constraint length, K=3 and a trace-back depth of 15 is designed using VHDL and implemented in an FPGA. It is also shown that the modified register exchange scheme can be applied to a block decoding scheme.