Search | Korea Science

Performance Analysis of Slave-Side Arbitration Schemes for the Multi-Layer AHB BusMatrix (ML-AHB 버스 매트릭스를 위한 슬레이브 중심 중재 방식의 성능 분석)

Hwang, Soo-Yun;Park, Hyeong-Jun;Jhang, Kyoung-Son
- Journal of KIISE:Computer Systems and Theory
- /
- v.34 no.5_6
- /
- pp.257-266
- /
- 2007
In On-Chip bus, the arbitration scheme is one of the critical factors that decide the overall system performance. The arbitration scheme used in traditional shared bus is the master-side arbitration based on the request and grant signals between multiple masters and single arbiter. In the case of the master-side arbitration, only one master and one slave can transfer the data at a time. Therefore the throughput of total bus system and the utilization of resources are decreased in the master-side arbitration. However in the slave-side arbitration, there is an arbiter at each slave port and the master just starts a transaction and waits for the slave response to proceed to the next transfer. Thus, the unit of arbitration can be a transaction or a transfer. Besides the throughput of total bus system and the utilization of resources are increased since the multiple masters can simultaneously perform transfers with independent slaves. In this paper, we implement and analyze the arbitration schemes for the Multi-Layer AHB BusMatrix based on the slave-side arbitration. We implement the slave-side arbitration schemes based on fixed priority, round robin and dynamic priority and accomplish the performance simulation to compare and analyze the performance of each arbitration scheme according to the characteristics of the master and slave. With the performance simulation, we observed that when there are few masters on critical path in a bus system, the arbitration scheme based on dynamic priority shows the maximum performance and in other cases, the arbitration scheme based on round robin shows the highest performance. In addition, the arbitration scheme with transaction based multiplexing shows higher performance than the same arbitration scheme with single transfer based switching in an application with frequent accesses to the long latency devices or memories such as SDRAM. The improvements of the arbitration scheme with transaction based multiplexing are 26%, 42% and 51%, respectively when the latency times of SDRAM are 1, 2 and 3 clock cycles.
PDF KSCI

A Low-power EEPROM design for UHF RFID tag chip (UHF RFID 태그 칩용 저전력 EEPROM설계)

Yi, Won-Jae;Lee, Jae-Hyung;Park, Kyung-Hwan;Lee, Jung-Hwan;Lim, Gyu-Ho;Kang, Hyung-Geun;Ko, Bong-Jin;Park, Mu-Hun;Ha, Pan-Bong;Kim, Young-Hee
- Journal of the Korea Institute of Information and Communication Engineering
- /
- v.10 no.3
- /
- pp.486-495
- /
- 2006
In this paper, a low-power 1Kb synchronous EEPROM is designed with flash cells for passive UHF RFID tag chips. To make a low-power EEPROM, four techniques are newly proposed. Firstly, dual power supply voltages VDD(1.5V) and VDDP(2.5V), are used. Secondly, CKE signal is used to remove switching current due to clocking of synchronous circuits. Thirdly, a low-speed but low-power sensing scheme using clocked inverters is used instead of the conventional current sensing method. Lastly, the low-voltage, VDD for the reference voltage generator is supplied by using the Voltage-up converter in write cycle. An EEPROM is fabricated with the $0.25{\mu}m$ EEPROM process. Simulation results show that power dissipations are $4.25{\mu}W$ in the read cycle and $25{\mu}W$ in the write cycle, respectively. The layout area is $646.3\times657.68{\mu}m^2$.
PDF KSCI

Advanced Victim Cache with Processor Reuse Information (프로세서의 재사용 정보를 이용하는 개선된 고성능 희생 캐쉬)

Kwak Jong Wook;Lee Hyunbae;Jhang Seong Tae;Jhon Chu Shik
- Journal of KIISE:Computer Systems and Theory
- /
- v.31 no.12
- /
- pp.704-715
- /
- 2004
Recently, a single or multi processor system uses the hierarchical memory structure to reduce the time gap between processor clock rate and memory access time. A cache memory system includes especially two or three levels of caches to reduce this time gap. Moreover, one of the most important things In the hierarchical memory system is the hit rate in level 1 cache, because level 1 cache interfaces directly with the processor. Therefore, the high hit rate in level 1 cache is critical for system performance. A victim cache, another high level cache, is also important to assist level 1 cache by reducing the conflict miss in high level cache. In this paper, we propose the advanced high level cache management scheme based on the processor reuse information. This technique is a kind of cache replacement policy which uses the frequency of processor's memory accesses and makes the higher frequency address of the cache location reside longer in cache than the lower one. With this scheme, we simulate our policy using Augmint, the event-driven simulator, and analyze the simulation results. The simulation results show that the modified processor reuse information scheme(LIVMR) outperforms the level 1 with the simple victim cache(LIV), 6.7% in maximum and 0.5% in average, and performance benefits become larger as the number of processors increases.
PDF KSCI

A 2-Gb/s SLVS Transmitter for MIPI D-PHY (MIPI D-PHY를 위한 2-Gb/s SLVS 송신단)

Baek, Seung Wuk;Jeong, Dong Gil;Park, Sang Min;Hwang, Yu Jeong;Jang, Young Chan
- Journal of Korea Society of Industrial Information Systems
- /
- v.18 no.5
- /
- pp.25-32
- /
- 2013
A 1.8V 2-Gb/s scalable low voltage signaling (SLVS) transmitter (TX) is designed for mobile applications requiring high speed and low power consumption. It consists of 4-lane TX for data transmission, 1-lane TX for a source synchronous clocking, and a 8-phase clock generator. The proposed SLVS TX has the scaling voltage swing from 50 mV to 650 mV and supports a high speed (HS) mode and a low power (LP) mode. An output impedance calibration scheme for the SVLS TX is proposed to improve the signal integrity. The proposed SLVS TX is implemented by using a 0.18-${\mu}m$ 1-poly 6-metal CMOS with a 1.8 V supply. The simulated data jitter of the implemented SLVS TX is about 8.04 ps at the data rate of 2-Gb/s. The area and power consumption of the 1-lane of the proposed SLVS TX are $422{\times}474{\mu}m^2$ and 5.35 mW/Gb/s, respectively.
https://doi.org/10.9723/jksiis.2013.18.5.025 인용 PDF KSCI

900MHz RFID Passive Tag Frontend Design and Implementation (900MHz 대역 RFID 수동형 태그 전치부 설계 및 구현)

Hwang, Ji-Hun;Oh, Jong-Hwa;Kim, Hyun-Woong;Lee, Dong-Gun;Roh, Hyoung-Hwan;Seong, Yeong-Rak;Oh, Ha-Ryoung;Park, Jun-Seok
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.35 no.7B
- /
- pp.1081-1090
- /
- 2010
$0.18{\mu}m$ CMOS UHF RFID tag frontend is presented in this paper. Several key components are highlighted: the voltage multiplier based on the threshold voltage terminated circuit, the demodulator using current mode, and the clock generator. For standard compliance, all designed components are under the EPC Global Class-1 Generation-2 UHF RFID protocol. Backscatter modulation uses the pulse width modulation scheme. Overall performance of the proposed tag chip was verified with the evaluation board. Prototype Tag Chip dimension is neary 0.77mm2 ; According to the simulation results, the reader can successfully interrogate the tag within 1.5m. where the tag consumes the power about $71{\mu}W$.
PDF KSCI

Efficient Hardware Design of Hash Processor Supporting SHA-3 and SHAKE256 Algorithms (SHA-3과 SHAKE256 알고리듬을 지원하는 해쉬 프로세서의 하드웨어 설계)

Choi, Byeong-Yoon
- Journal of the Korea Institute of Information and Communication Engineering
- /
- v.21 no.6
- /
- pp.1075-1082
- /
- 2017
This paper describes a design of hash processor which can execute new hash algorithm, SHA-3 and extendable-output function (XOF), SHAKE-256. The processor that consists of padder block, round-core block and output block maximizes its performance by using the block-level pipelining scheme. The padder block formats the variable-length input data into multiple blocks and then round block generates SHA-3 message digest or SHAKE256 result for multiple blocks using on-the-fly round constant generator. The output block finally transfers the result to host processor. The hash processor that is implemented with Xilinx Virtex-5 FPGA can operate up to 220-MHz clock frequency. The estimated maximum throughput is 5.28 Gbps(giga bits per second) for SHA3-512. Because the processor supports both SHA-3 hash algorithm and SHAKE256 algorithm, it can be applicable to cryptographic areas such as data integrity, key generation and random number generation.
https://doi.org/10.6109/jkiice.2017.21.6.1075 인용 PDF KSCI

A Cryptoprocessor for AES-128/192/256 Rijndael Block Cipher Algorithm (AES-128/192/256 Rijndael 블록암호 알고리듬용 암호 프로세서)

안하기;박광호;신경욱
- Journal of the Korea Institute of Information and Communication Engineering
- /
- v.6 no.3
- /
- pp.427-433
- /
- 2002
This paper describes a design of cryptographic processor that implements the AES(Advanced Encryption Standard) block cipher algorithm "Rijndael". To achieve high throughput rate, a sub-pipeline stage is inserted into the round transformation block, resulting that the second half of current round function and the first half of next round function are being simultaneously operated. For area-efficient and low-power implementation, the round block is designed to share the hardware resources in encryption and decryption. An efficient scheme for on-the-fly key scheduling, which supports the three master-key lengths of 128-b/192-b/256-b, is devised to generate round keys in the first sub-pipeline stage of each round processing. The cryptoprocessor designed in Verilog-HDL was verified using Xilinx FPGA board and test system. The core synthesized using 0.35-${\mu}{\textrm}{m}$ CMOS cell library consists of about 25,000 gates. Simulation results show that it has a throughput of about 520-Mbits/sec with 220-MHz clock frequency at 2.5-V supply.-V supply.
PDF KSCI

Design and Implementation of a Realtime Video Player on Tiled-Display System (타일드-디스플레이 시스템에서 실시간 동영상 상영기의 설계 및 구현)

Choe, Gi-Seok;Yu, Jeong-Soo;Choi, Jeong-Hooni;Nang, Jong-Ho
- Journal of KIISE:Computer Systems and Theory
- /
- v.35 no.4
- /
- pp.150-157
- /
- 2008
This paper presents a design and implementation of realtime video player that operates on a tiled-display system consisting of multiple PCs to provide a very large and high resolution display. In the proposed system, the master process transmits a compressed video stream to multiple PCs using UDP multicast. All slaves(PC) receive the same video stream, decompress, clip their designated areas from the decompressed video frame, and display it to their displays while being synchronized with each other. A simple synchronization mechanism based on the H/W clock of each slave is proposed to avoid the skew between the tiles of the display, and a flow-control mechanism based on the bit-rate of the video stream and a pre-buffering scheme are proposed to prevent the jitter The proposed system is implemented with Microsoft DirectX filter technology in order to decouple the video/audio codec from the player.
PDF KSCI

Design of Multiplierless Lifting-based Wavelet Transform using Pattern Search Methods (패턴 탐색 기법을 사용한 Multiplierless 리프팅 기반의 웨이블릿 변환의 설계)

Son, Chang-Hoon;Park, Seong-Mo;Kim, Young-Min
- Journal of Korea Multimedia Society
- /
- v.13 no.7
- /
- pp.943-949
- /
- 2010
This paper presents some improvements on VLSI implementation of lifting-based 9/7 wavelet transform by optimization hardware multiplication. The proposed solution requires less logic area and power consumption without performance loss compared to previous wavelet filter structure based on lifting scheme. This paper proposes a better approach to the hardware implementation using Lefevre algorithm based on extensions of Pattern search methods. To compare the proposed structure to the previous solutions on full multiplier blocks, we implemented them using Verilog HDL. For a hardware implementation of the two solutions, the logical synthesis on 0.18 um standard cells technology show that area, maximum delay and power consumption of the proposed architecture can be reduced up to 51%, 43% and 30%, respectively, compared to previous solutions for a 200 MHz target clock frequency. Our evaluation show that when design VLSI chip of lifting-based 9/7 wavelet filter, our solution is better suited for standard-cell application-specific integrated circuits than prior works on complete multiplier blocks.
PDF KSCI

Stable Power Plan Technique for Implementing SoC (SoC 구현을 위한 안정적인 Power Plan 기법)

Seo, Young-Ho;Kim, Dong-Wook
- Journal of the Korea Institute of Information and Communication Engineering
- /
- v.16 no.12
- /
- pp.2731-2740
- /
- 2012
ASIC(application specific integrated circuit) process is a set of various technologies for fabricating a chip. Generally there have been many researches for RTL design, synthesis, floor plan & routing, low power scheme, clock tree synthesis, and testability which are widely researched in recent. In this paper we propose a new methodology of power strap routing in basis of design experience and experiment. First the power strap for vertical VDD and VSS and horizontal VDD and VSS is routed, and then after the problems which are generated in this process are analyzed, we propose a new process for resolving them. For this, the strap guide is inserted to protect the unnecessary strap routing and dumped for next steps. Next the unnecessary power straps which are generated the first inserting process are removed, and the pre-routing is performed for the macro cells. Finally the resultant power straps are routed using the dumped routing guide. Through the proposed process we identified the efficient and stable route of the power straps.
https://doi.org/10.6109/jkiice.2012.16.12.2731 인용 PDF KSCI

Search Result 213, Processing Time 0.028 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)