• Title/Summary/Keyword: cache performance


Impacts of multiple cache block sizes on system performance

  • 이성환;김준성
    • Proceedings of the IEEK Conference / 2003.07d / pp.1347-1350 / 2003
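  • In this paper, we examine how changing the block size of the instruction cache and of the data cache individually affects overall system performance in a system whose L1 cache is split into instruction and data caches. To this end, we ran simulations with SimpleScalar, using SPEC CPU benchmark programs as input. The study shows that using a cache block size suited to the respective characteristics of instructions and data improves overall system performance more than using a uniform cache block size.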

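The effect of block size on miss rate described in the abstract above can be illustrated with a minimal sketch. This is not the authors' SimpleScalar setup: the direct-mapped cache model, the 1 KB capacity, and the two synthetic address traces (a sequential, instruction-like stream and a strided, data-like stream) are all assumptions chosen for illustration.

```python
def miss_rate(addresses, cache_bytes, block_bytes):
    """Miss rate of a direct-mapped cache for a sequence of byte addresses."""
    num_lines = cache_bytes // block_bytes
    tags = [None] * num_lines            # one stored tag per cache line
    misses = 0
    for addr in addresses:
        block = addr // block_bytes      # block number containing this byte
        index = block % num_lines        # line this block maps to
        tag = block // num_lines         # remaining high-order bits
        if tags[index] != tag:
            misses += 1
            tags[index] = tag            # fill the line on a miss
    return misses / len(addresses)


# Hypothetical traces, not taken from SPEC CPU.
sequential = list(range(0, 4096, 4))     # 1024 accesses, 4-byte step
strided = [i * 64 for i in range(1024)]  # 1024 accesses, 64-byte step

for block in (16, 32, 64):
    print(block,
          miss_rate(sequential, 1024, block),
          miss_rate(strided, 1024, block))
```

In this toy setting, larger blocks cut the miss rate of the sequential stream while leaving the strided stream unchanged, which is one way the two streams can favor different block sizes.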

Variable latency L1 data cache architecture design in multi-core processor under process variation

  • Kong, Joonho
    • Journal of the Korea Society of Computer and Information / v.20 no.9 / pp.1-10 / 2015
  • In this paper, we propose a new variable latency L1 data cache architecture for multi-core processors. The proposed architecture extends the traditional variable latency cache to suit multi-core processors. We add a specialized data structure that records the latency of the L1 data cache; the value stored in it is determined by the latency added to the L1 data cache. The structure also tracks the remaining cycles of an L1 data cache access so that data arrival can be signaled to the reservation station in the core. As in the variable latency cache of the single-core architecture, our proposed architecture flexibly extends the cache access cycles to account for process variation. The proposed cache architecture can reduce yield losses incurred by L1 cache access time failures to nearly 0%. Moreover, we quantitatively evaluate performance, power, energy consumption, power-delay product, and energy-delay product as the number of cache access cycles increases.
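The two composite metrics named in the abstract above follow standard definitions: the power-delay product is power multiplied by execution time (which equals energy), and the energy-delay product is energy multiplied by execution time. A minimal sketch, with hypothetical input values that are not taken from the paper:

```python
def power_delay_product(power_watts, delay_seconds):
    """Power-delay product (PDP): average power times execution time."""
    return power_watts * delay_seconds


def energy_delay_product(power_watts, delay_seconds):
    """Energy-delay product (EDP): energy times execution time."""
    energy_joules = power_watts * delay_seconds
    return energy_joules * delay_seconds


# Hypothetical numbers for illustration only.
pdp = power_delay_product(2.0, 0.5)
edp = energy_delay_product(2.0, 0.5)
print(pdp, edp)
```

Because delay enters the energy-delay product twice, that metric penalizes added cache access cycles more heavily than the power-delay product does.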

Filter Cache Predictor Using Mode Selection Bit

  • Kwak, Jong-Wook
    • Journal of the Institute of Electronics Engineers of Korea CI / v.46 no.5 / pp.1-13 / 2009
  • The filter cache has been introduced as one way to reduce cache power consumption. It reduces power by more than 50%, but at the cost of more than 20% of performance. To minimize this performance degradation, the predictive filter cache has been proposed. In this paper, we review previous filter cache predictors and analyze their problems. We identify the main problems that cause prediction misses in previous filter cache schemes and, to resolve them, propose a new prediction policy. In our scheme, reference bit entries called MSBs are inserted into the filter cache and the BTB to adaptively control filter cache access. In our simulations, we use a modified SimpleScalar simulator with MiBench benchmark programs to verify the proposed filter cache. The simulation results show an average 5% performance improvement over previous schemes.

An Area Efficient Low Power Data Cache for Multimedia Embedded Systems

  • Kim Cheong-Ghil;Kim Shin-Dug
    • The KIPS Transactions:PartA / v.13A no.2 s.99 / pp.101-110 / 2006
  • One of the most effective ways to improve cache performance is to exploit both the temporal and the spatial locality inherent in a program's execution characteristics. This paper proposes a small data cache that achieves low power and high performance on multimedia applications. The basic architecture is a split cache consisting of a direct-mapped cache with a small block size and a fully associative buffer with a large block size. To overcome the disadvantage of the small cache space, two mechanisms are added that take the operational behavior of multimedia applications into account: adaptive multi-block prefetching, which initiates fetches of various sizes, and efficient block filtering, which removes rarely reused data. Simulations on MediaBench show that the proposed 5KB cache can provide equivalent performance while reducing energy consumption by up to 40% compared with a 16KB 4-way set-associative cache.

Designing a low-power L1 cache system using aggressive data of frequent reference patterns

  • Jung, Bo-Sung;Lee, Jung-Hoon
    • Journal of the Korea Society of Computer and Information / v.27 no.7 / pp.9-16 / 2022
  • Today, with the advent of the 4th industrial revolution, IoT (Internet of Things) systems are advancing rapidly. As a result, various high-performance, large-capacity applications are emerging, and the computing systems that run them need low-power, high-performance memory. In this paper, we propose an effective structure for the L1 cache memory, which consumes the most energy in the computing system. The proposed cache system is composed of two main parts, the L1 main cache and the buffer cache. The main cache has 2 banks, and each bank is 2-way set-associative. When the L1 cache hits, the data is copied into the buffer cache according to the proposed algorithm. According to simulation, the proposed L1 cache system improved the energy-delay product by about 65% compared to the existing 4-way set-associative cache memory.

Energy-Performance Efficient 2-Level Data Cache Architecture for Embedded System

  • Lee, Jong-Min;Kim, Soon-Tae
    • Journal of KIISE:Computer Systems and Theory / v.37 no.5 / pp.292-303 / 2010
  • On-chip cache memories play an important role in both the performance and the energy consumption of resource-constrained embedded systems by filtering many off-chip memory accesses. We propose a 2-level data cache architecture with a low energy-delay product tailored for embedded systems. The L1 data cache is small and direct-mapped, and employs a write-through policy. In contrast, the L2 data cache is set-associative and adopts a write-back policy. Consequently, the L1 data cache is accessed in one cycle and is able to provide high cache bandwidth, while the L2 data cache is effective in reducing the global miss rate. To reduce the penalty of the high miss rate caused by the small L1 cache and the power consumption of address generation, we propose an ECP (Early Cache hit Predictor) scheme. The ECP predicts whether the L1 cache has the requested data using both fast address generation and L1 cache hit prediction. To reduce the high energy cost of accessing the L2 data cache due to heavy write-through traffic from the write buffer placed between the two cache levels, we propose a one-way write scheme. In simulation-based experiments using a cycle-accurate simulator and embedded benchmarks, the proposed 2-level data cache architecture shows average improvements of 3.6% in overall system performance and 50% in data cache energy consumption.
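The global miss rate mentioned in the abstract above is conventionally the fraction of all processor accesses that miss in both cache levels, that is, the L1 local miss rate multiplied by the L2 local miss rate. A minimal sketch, assuming for simplicity that the L2 cache is consulted only on L1 misses (write-through traffic is ignored) and using hypothetical counts that are not from the paper:

```python
def global_miss_rate(l1_accesses, l1_misses, l2_misses):
    """Fraction of all processor accesses that miss in both L1 and L2.

    Assumes L2 is accessed only on L1 misses, so the L2 local miss
    rate is l2_misses / l1_misses.
    """
    l1_local = l1_misses / l1_accesses
    l2_local = l2_misses / l1_misses
    return l1_local * l2_local


# Hypothetical counts for illustration only.
rate = global_miss_rate(l1_accesses=1000, l1_misses=250, l2_misses=50)
print(rate)
```

Under this assumption, a small L1 with a high local miss rate can still yield a low global miss rate if the L2 catches most of its misses.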

A New Cache Replacement Policy for Improving Last Level Cache Performance

  • Do, Cong Thuan;Son, Dong Oh;Kim, Jong Myon;Kim, Cheol Hong
    • Journal of KIISE / v.41 no.11 / pp.871-877 / 2014
  • Cache replacement algorithms have been developed in order to reduce miss counts. In modern processors, the performance gap between the processor and main memory has been increasing, creating a more important role for cache replacement policies. The Least Recently Used (LRU) policy is one of the most common policies used in modern processors. However, recent research has shown that the performance gap between LRU and the theoretical optimal replacement algorithm (OPT) is large. Although LRU replacement has been proven to be adequate over and over again, the OPT/LRU performance gap continues to widen as cache associativity increases. In this study, we observed that there is a potential chance to improve cache performance based on existing LRU mechanisms. We propose a method that enhances the performance of the LRU replacement algorithm based on the access proportion among the lines in a cache set during a period of two successive replacement actions that make the final replacement action. Our experimental results reveal that the proposed method reduced the average miss rate of the baseline 512KB L2 cache by 15 percent when compared to conventional LRU. In addition, the performance of the processor that applied our proposed cache replacement policy improved by 4.7 percent over LRU, on average.
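For reference, conventional LRU replacement within a single cache set can be sketched as follows. This models only the baseline LRU policy, not the method proposed in the paper; the class name `LRUSet`, the helper `count_misses`, and the access pattern are hypothetical.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set with `ways` lines, replaced in least-recently-used order."""

    def __init__(self, ways):
        self.ways = ways
        self.lines = OrderedDict()           # tag -> None, oldest entry first

    def access(self, tag):
        """Return True on a hit, False on a miss (the tag is then inserted)."""
        if tag in self.lines:
            self.lines.move_to_end(tag)      # mark as most recently used
            return True
        if len(self.lines) >= self.ways:
            self.lines.popitem(last=False)   # evict the least recently used line
        self.lines[tag] = None
        return False


def count_misses(tags, ways):
    """Count misses for a sequence of tags mapped to a single set."""
    cache_set = LRUSet(ways)
    return sum(not cache_set.access(tag) for tag in tags)


# Hypothetical pattern: cycling over 5 tags, one more than a 4-way set can hold.
pattern = [0, 1, 2, 3, 4] * 3
print(count_misses(pattern, ways=4), count_misses(pattern, ways=5))
```

With this cyclic pattern, the 4-way set misses on every access because LRU always evicts the line that is needed next, while the 5-way set misses only on the first pass. Access patterns of this kind are one source of the gap between LRU and OPT.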

A Policy of Page Management Using Double Cache for NAND Flash Memory File System

  • Park, Myung-Kyu;Kim, Sung-Jo
    • Journal of KIISE:Computer Systems and Theory / v.36 no.5 / pp.412-421 / 2009
  • Due to the physical characteristics of NAND flash memory, overwrite operations are not permitted at the same location, and therefore erase operations are required prior to rewriting. These extra operations degrade the performance of NAND flash memory file systems. Since the number of erase operations for a specific location also has an upper limit, frequent erases reduce the lifetime of NAND flash memory. These problems can be mitigated by delaying write operations in order to improve I/O performance; however, doing so lowers the cache hit ratio. This paper proposes a policy of page management using double cache for NAND flash memory file systems. Double cache consists of Real cache and Ghost cache to analyze page reference patterns. This policy attempts to delay write operations in Ghost cache to maintain the hit ratio in Real cache. It can also improve write performance by reducing the search time for dirty pages, since Ghost cache consists of a Dirty list and a Clean list. We find that the hit ratio and I/O performance of our policy are improved by 20.57% and 20.59% on average, respectively, compared with the existing policies. The number of write operations is also reduced by 30.75% on average, compared with the existing policies.

Machine Learning-Based Detection of Cache Side Channel Attack Using Performance Counter Monitor of CPU

  • Hwang, Jongbae;Bae, Daehyeon;Ha, Jaecheol
    • Journal of the Korea Institute of Information Security & Cryptology / v.30 no.6 / pp.1237-1246 / 2020
  • Recently, several cache side channel attacks have been proposed to extract secret information by exploiting design flaws of the microarchitecture. The Flush+Reload attack, one of the cache side channel attacks, can be applied to malicious application attacks due to its properties of high resolution and low noise. In this paper, we propose a detection system that detects cache-based attacks using the PCM (Performance Counter Monitor) to monitor CPU cache activity. In particular, we observed the variation of each PCM counter value under two kinds of attacks, a Spectre attack and a secret recovering attack during AES encryption. As a result, we found that four hardware counters were sensitive to cache side channel attacks. Our detector, based on machine learning including SVM (Support Vector Machine), RF (Random Forest), and MLP (Multi-Layer Perceptron), can detect cache side channel attacks with high detection accuracy.

Processor Design Technique for Low-Temperature Filter Cache

  • Choi, Hong-Jun;Yang, Na-Ra;Lee, Jeong-A;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information / v.15 no.1 / pp.1-12 / 2010
  • Recently, processor performance has improved dramatically. Unfortunately, as process technology scales down, energy consumption in a processor increases significantly even as processor performance continues to improve. Moreover, peak temperature in the processor increases dramatically due to the increased power density, resulting in a serious thermal problem. For this reason, performance, energy consumption, and thermal problems should be considered together when designing up-to-date processors. This paper proposes three modified filter cache schemes to alleviate the thermal problem in the filter cache, which is one of the most energy-efficient design techniques in hierarchical memory systems: Bypass Filter Cache (BFC), Duplicated Filter Cache (DFC), and Partitioned Filter Cache (PFC). The BFC scheme enables direct access to the L1 cache when the temperature of the filter cache exceeds a threshold, reducing the temperature of the filter cache. The DFC scheme lowers the temperature of the filter cache by adding a second filter cache to the existing one. The filter cache in the PFC scheme is composed of two half-size filter caches, which lowers the temperature by reducing the access frequency of each. According to our simulations using Wattch and Hotspot, the proposed partitioned filter cache shows the lowest peak temperature on the filter cache, leading to higher reliability in the processor.