Search | Korea Science

Instruction Flow based Early Way Determination Technique for Low-power L1 Instruction Cache

Kim, Gwang Bok;Kim, Jong Myon;Kim, Cheol Hong
- Journal of the Korea Society of Computer and Information
- /
- v.21 no.9
- /
- pp.1-9
- /
- 2016
Recent embedded processors employ set-associative L1 instruction cache to improve the performance. The energy consumption in the set-associative L1 instruction cache accounts for considerable portion in the embedded processor. When an instruction is required from the processor, all ways in the set-associative instruction cache are accessed in parallel. In this paper, we propose the technique to reduce the energy consumption in the set-associative L1 instruction cache effectively by accessing only one way. Gshare branch predictor is employed to predict the instruction flow and determine the way to fetch the instruction. When the branch prediction is untaken, next instruction in a sequential order can be fetched from the instruction cache by accessing only one way. According to our simulations with SPEC2006 benchmarks, the proposed technique requires negligible hardware overhead and shows 20% energy reduction on average in 4-way L1 instruction cache.
https://doi.org/10.9708/jksci.2016.21.9.001 인용 PDF KSCI

Energy-aware Instruction Cache Design using Backward Branch Information for Embedded Processors (임베디드 시스템에서 후방 분기 명령어 정보를 이용한 저전력 명령어 캐쉬 설계 기법)

Yang, Na-Ra;Kim, Jong-Myon;Kim, Cheol-Hong
- Journal of the Korea Society of Computer and Information
- /
- v.13 no.6
- /
- pp.33-39
- /
- 2008
Energy efficiency should be considered together with performance when designing embedded processors. This paper proposes a new energy-aware instruction cache design using backward branch information to reduce the energy consumption in an embedded processor, since instruction caches consume a significant fraction of the on-chip energy. Proposed instruction cache is composed of two caches: a large main instruction cache and a small loop instruction cache. Proposed technique enables the selective access between the main instruction cache and the loop instruction cache to reduce the number of accesses to the main instruction cache, leading to good energy efficiency. Analysis results show that the proposed instruction cache reduces the energy consumption by 20% on the average, compared to the traditional instruction cache.
PDF

Impacts of multiple cache block sizes on system performance (다양한 cache block크기에 의한 시스템의 성능 변화)

이성환;김준성
- Proceedings of the IEEK Conference
- /
- 2003.07d
- /
- pp.1347-1350
- /
- 2003
본 논문에서는 instruction과 data cache로 나누어지는 L1 cache를 가진 시스템에서 instruction과 data cache 각각의 block 크기 변화가 전체 시스템의 성능에 미치는 영향을 고찰하였다. 이를 위하여 SPEC CPU 벤치마크 프로그램을 입력으로 하는 SimpleScalar를 이용한 시뮬레이션을 수행하였다. 본 연구를 통해서, instruction과 data 각각의 특성에 맞는 cache block 크기를 사용하는 것이 일률적인 cache block 크기를 사용하는 것에 비하여 전체 시스템의 성능을 더욱 향상시켜 준다는 것을 보여준다.
PDF

Low-power Filter Cache Design Technique for Multicore Processors (멀티 코어 프로세서를 위한 저전력 필터 캐쉬 설계 기법)

Park, Young-Jin;Kim, Jong-Myon;Kim, Cheol-Hong
- Journal of the Korea Society of Computer and Information
- /
- v.14 no.12
- /
- pp.9-16
- /
- 2009
Energy consumption as well as performance should be considered when designing up-to-date multicore processors. In this paper, we propose new design technique to reduce the energy consumption in the instruction cache for multicore processors by using modified filter cache. The filter cache has been recognized as one of the most energy-efficient design techniques for singlecore processors. The energy consumed in the instruction cache accounts for a significant portion of total processor energy consumption. Therefore, energy-aware instruction cache design techniques are essential to reduce the energy consumption in a multicore processor. The proposed technique reduces the energy consumption in the instruction cache for multicore processors by reducing the number of accesses to the level-1 instruction cache. We evaluate the proposed design using a simulation infrastructure based on SimpleScalar and CACTI. Simulation results show that the proposed architecture reduces the energy consumption in the instruction cache for multicore processors by up to 3.4% compared to the conventional filter cache architecture. Moreover, the proposed architecture shows better performance over the conventional filter cache architecture.
https://doi.org/10.9708/jksci.2009.14.12.009 인용 PDF

Design of A On-Chip Caches for RISC Processors (RISC 프로세서 On-Chip Cache의 설계)

홍인식;임인칠
- Journal of the Korean Institute of Telematics and Electronics
- /
- v.27 no.8
- /
- pp.1201-1210
- /
- 1990
This paper proposes on-chip instruction and data cache memories on RISC reduced instruction set computer) architecture which supports fast instruction fetch and data read/write, and enables RISC processor under research to obtain high performance. In the execution of HLL(high level language) programs, heavily used local scalar variables are stored in large register file, but arrays, structures, and global scalar variables are difficult for compiler to allocate registers. These problems can be solved by on-chip Instruction/Data cache. And each cycle of instruction fetch, pad delay causes the lowering of the processors's performance. Cache memories are designed in CMOS technology and SRAM(static-RAM), that saves layout area and power dissipation, is used for instruction and data storage. To speed up and support RISC processor's piplined architecture efficiently, hardwired logic technology is used overall circuits i cache blocks. The schematic capture and timing simulation of proposed cache memorises are performed on Apollo DN4000 workstation using Mentor Graphics CAD tools.
PDF

Energy-aware Instruction Cache Design using Partitioning (분할 기법을 이용한 저전력 명령어 캐쉬 설계)

Kim, Jong-Myon;Jung, Jae-Wook;Kim, Cheol-Hong
- Journal of KIISE:Computing Practices and Letters
- /
- v.13 no.5
- /
- pp.241-251
- /
- 2007
Energy consumption in the instruction cacheaccounts for a significant portion of the total processor energy consumption. Therefore, reducing energy consumption in the instruction cache is important in designing embedded processors. This paper proposes a method for reducing dynamic energy consumption in the instruction cache by partitioning it to smaller (less energy-consuming) sub-caches. When a request comes into the proposed cache, only one sub-cache is accessed by utilizing the locality of applications. By contrast, the other sub-caches are not accessed, leading todynamic energy reduction. In addition, the proposed cache reduces dynamic energy consumption by eliminating the energy consumed in tag matching. We evaluated the energy efficiency by running cycle accurate simulator, SimpleScalar. with power parameters obtained from CACTI. Simulation results show that the proposed cache reduces dynamic energy consumption by $37%{\sim}60%$ compared to the traditional direct-mapped instruction cache.
PDF KSCI

Improving Instruction Cache Performance by Dynamic Management of Cache-Image (캐시 이미지의 동적 관리 방법을 이용한 명령어 캐시 성능 개선)

Suh, Hyo-Joong
- KIISE Transactions on Computing Practices
- /
- v.23 no.9
- /
- pp.564-571
- /
- 2017
The burst loading of a pre-created cache-image is an effective method to reduce the instruction cache misses in the early stage of the program execution. It is useful to alleviate the performance degradation as well as the energy inefficiency, which is induced by the concentrated cold misses at the instruction cache. However, there are some defects, including software overhead on the compiler and installer. Furthermore, there are several mismatches as a result of the dynamic properties for specific applications. This paper addresses these issues and proposes a cache-image maintenance/recreation policy that can conduct dynamic management using a hardware-assisted method. The results of the simulation show that the proposed method can maintain the cache-image with a proper size and validity.
https://doi.org/10.5626/KTCP.2017.23.9.564 인용 KSCI

Low Power Trace Cache for Embedded Processor

Moon Je-Gil;Jeong Ha-Young;Lee Yong-Surk
- Proceedings of the IEEK Conference
- /
- summer
- /
- pp.204-208
- /
- 2004
Embedded business will be expanded market more and more since customers seek more wearable and ubiquitous systems. Cellular telephones, PDAs, notebooks and portable multimedia devices could bring higher microprocessor revenues and more rewarding improvements in performance and functions. Increasing battery capacity is still creeping along the roadmap. Until a small practical fuel cell becomes available, microprocessor developers must come up with power-reduction methods. According to MPR 2003, the instruction and data caches of ARM920T processor consume $44\%$ of total processor power. The rest of it is split into the power consumptions of the integer core, memory management units, bus interface unit and other essential CPU circuitry. And the relationships among CPU, peripherals and caches may change in the future. The processor working on higher operating frequency will exact larger cache RAM and consume more energy. In this paper, we propose advanced low power trace cache which caches traces of the dynamic instruction stream, and reduces cache access times. And we evaluate the performance of the trace cache and estimate the power of the trace cache, which is compared with conventional cache.
PDF

Performance Analyses of Instruction Fetch Models Considering Cache Miss and Branch Misprediction (캐쉬 미스와 분기예측 실패를 고려한 명령어 페치 모델의 성능분석)

Kim, Seon-Mo;Jeong, Jin-Ha;Choe, Sang-Bang
- Journal of KIISE:Computer Systems and Theory
- /
- v.28 no.12
- /
- pp.685-697
- /
- 2001
Cache memories are small fast memories used to temporarily hold the contents of main memory that are likely to be referenced by processors so as to reduce instruction and data access time. In this paper, we represent analytical models of instruction fetch process for four types of instruction cache structures that can be used for superscalar processors. In the models, we define various kinds of architectural parameters and take cache miss and branch misprediction into consideration. To prove the correctness of the proposed models, we performed extensive simulations and compared the results with the analytical models. Simulation results showed that the proposed model can estimate the instruction fetch rate accurately within 10% error in most cases. Both analytical model and simulation show that the increase of cache misses reduces the instruction fetch rate more severely than that of branch misprediction does. However, the analytical model can explain the causes of performance degradation which cannot be uncovered by the simulation method only. The model is also able to provide exact relationship between cache miss and branch misprediction for instruction fetch analysis.
PDF

Design of an Asynchronous Instruction Cache based on a Mixed Delay Model (혼합 지연 모델에 기반한 비동기 명령어 캐시 설계)

Jeon, Kwang-Bae;Kim, Seok-Man;Lee, Je-Hoon;Oh, Myeong-Hoon;Cho, Kyoung-Rok
- The Journal of the Korea Contents Association
- /
- v.10 no.3
- /
- pp.64-71
- /
- 2010
Recently, to achieve high performance of the processor, the cache is splits physically into two parts, one for instruction and one for data. This paper proposes an architecture of asynchronous instruction cache based on mixed-delay model that are DI(delay-insensitive) model for cache hit and Bundled delay model for cache miss. We synthesized the instruction cache at gate-level and constructed a test platform with 32-bit embedded processor EISC to evaluate performance. The cache communicates with the main memory and CPU using 4-phase hand-shake protocol. It has a 8-KB, 4-way set associative memory that employs Pseudo-LRU replacement algorithm. As the results, the designed cache shows 99% cache hit ratio and reduced latency to 68% tested on the platform with MI bench mark programs.
https://doi.org/10.5392/JKCA.2010.10.3.064 인용 PDF KSCI

Search Result 67, Processing Time 0.027 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)