• Title/Summary/Keyword: Cache-miss

Search Result 99, Processing Time 0.029 seconds

Design of Cache Memory System for Next Generation CPU (차세대 CPU를 위한 캐시 메모리 시스템 설계)

  • Jo, Ok-Rae;Lee, Jung-Hoon
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.11 no.6
    • /
    • pp.353-359
    • /
    • 2016
  • In this paper, we propose a high performance L1 cache structure for the high clock CPU. The proposed cache memory consists of three parts, i.e., a direct-mapped cache to support fast access time, a two-way set associative buffer to reduce miss ratio, and a way-select table. The most recently accessed data is stored in the direct-mapped cache. If a data has a high probability of a repeated reference, when the data is replaced from the direct-mapped cache, the data is stored into the two-way set associative buffer. For the high performance and fast access time, we propose an one way among two ways set associative buffer is selectively accessed based on the way-select table (WST). According to simulation results, access time can be reduced by about 7% and 40% comparing with a direct cache and Intel i7-6700 with two times more space respectively.

Advanced Victim Cache with Processor Reuse Information (프로세서의 재사용 정보를 이용하는 개선된 고성능 희생 캐쉬)

  • Kwak Jong Wook;Lee Hyunbae;Jhang Seong Tae;Jhon Chu Shik
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.31 no.12
    • /
    • pp.704-715
    • /
    • 2004
  • Recently, a single or multi processor system uses the hierarchical memory structure to reduce the time gap between processor clock rate and memory access time. A cache memory system includes especially two or three levels of caches to reduce this time gap. Moreover, one of the most important things In the hierarchical memory system is the hit rate in level 1 cache, because level 1 cache interfaces directly with the processor. Therefore, the high hit rate in level 1 cache is critical for system performance. A victim cache, another high level cache, is also important to assist level 1 cache by reducing the conflict miss in high level cache. In this paper, we propose the advanced high level cache management scheme based on the processor reuse information. This technique is a kind of cache replacement policy which uses the frequency of processor's memory accesses and makes the higher frequency address of the cache location reside longer in cache than the lower one. With this scheme, we simulate our policy using Augmint, the event-driven simulator, and analyze the simulation results. The simulation results show that the modified processor reuse information scheme(LIVMR) outperforms the level 1 with the simple victim cache(LIV), 6.7% in maximum and 0.5% in average, and performance benefits become larger as the number of processors increases.

Performance Analysis of n-way Associative Cache and Fully Associative Cache (n-way Set Associative Cache와 Fully Associative Cache성능 분석)

  • Jo, Yong-Hun;Kim, Jeong-Seon
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.3
    • /
    • pp.802-810
    • /
    • 1997
  • In this paper, the performance of direce mapping caches, 2_, 4_, 8_, .., 4096_way way set associative caches, and fully assiciative caches are analyized by trace simulation for verivying their effectiveness.In general, it is well known that as n, the number of main memory lines to be stored into one cache line number in direct mapping cache, increases, the performance of the cache memory should get higher linearly.According to our analysis, however, it is not true on all the cache organizations.It is shown that as n increases, miss ratios get lower only when the small cache(less than 256K) using large line size is used.It is also shown that fully associative mapping achieves high performance only when small size cache using large line size ia used.

  • PDF

Reuse Information based Thrashing Resistant Cache Management Scheme

  • Sim, Gyu Yeon;Kim, Cheol Hong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.22 no.3
    • /
    • pp.9-16
    • /
    • 2017
  • In recent computing systems, LRU replacement policy has been widely used because it can be simply implemented and applicable to most programs. However, if the working set size of the program is bigger than the actual cache size, LRU replacement policy may occur thrashing problem. Thrashing problem means that cache blocks are consistently replaced without re-referencing in the cache. This paper proposes a new cache management scheme to solve the thrashing problem in the second-level cache. The proposed scheme measures per set reuse frequency using EAF structure to find thrashing sets. When the cache miss occurs, it tests whether the address of the missed block is stored or not. If the address of the missed block is stored, it means that the recently evicted block is re-requested, so the reuse frequency is predicted high. In this case, the corresponding counter of the set is increased. When the counter value is bigger than the threshold value, we assume that the corresponding set shows high reuse frequency. The proposed scheme assigns the set with high reuse frequency to the additional small size cache to keep the blocks in the cache for a long time. Our experimental results show that the proposed scheme improves the IPC by 3.81% on average.

Data Cache System based on the Selective Bank Algorithm for Embedded System (내장형 시스템을 위한 선택적 뱅크 알고리즘을 이용한 데이터 캐쉬 시스템)

  • Jung, Bo-Sung;Lee, Jung-Hoon
    • The KIPS Transactions:PartA
    • /
    • v.16A no.2
    • /
    • pp.69-78
    • /
    • 2009
  • One of the most effective way to improve cache performance is to exploit both temporal and spatial locality given by any program executive characteristics. In this paper we present a high performance and low power cache structure with a bank selection mechanism that enhances exploitation of spatial and temporal locality. The proposed cache system consists of two parts, i.e., a main direct-mapped cache with a small block size and a fully associative buffer with a large block size as a multiple of the small block size. Especially, the main direct-mapped cache is constructed as two banks for low power consumption and stores a small block which is selected from fully associative buffer by the proposed bank selection algorithm. By using the bank selection algorithm and three state bits, We selectively extend the lifetime of those small blocks with high temporal locality by storing them in the main direct-mapped caches. This approach effectively reduces conflict misses and cache pollution at the same time. According to the simulation results, the average miss ratio, compared with the Victim and STAS caches with the same size, is improved by about 23% and 32% for Mibench applications respectively. The average memory access time is reduced by about 14% and 18% compared with the he victim and STAS caches respectively. It is also shown that energy consumption of the proposed cache is around 10% lower than other cache systems that we examine.

Static Analysis of Cache Interference Miss and Prediction of Program Execution Time (캐쉬 간섭실패의 정적분석 및 프로그램의 수행시간 예측)

  • Lee, Geon-Yeong;Jeong, Yu-Seok;Hong, Man-Pyo
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.27 no.11
    • /
    • pp.881-889
    • /
    • 2000
  • 프로그램의 실행시간은 캐쉬메모리의 효율적 사용과 밀접한 관계가 있다. 특히 간섭 실패는 프로그램의 성능에 큰 영향을 미치지만 나타나는 형태가 불규칙적이므로 예측하기가 매우 어렵다. 본 논문에서는 직접 사상 캐쉬전략을 사용한 완전 중첩 루프 내 배열의 캐쉬 실패율(cache miss ratio)을 구하는 분석적 모델을 제시한다. 논문에서 제시한 모델을 임의의 캐쉬 위치에 각 배열이 접근한 시간을 기반으로 다음주기에서 캐쉬 실패의 발생 여부를 예측하는데, 간섭으로 발생한 캐쉬 실패 개수에 대해 기존에 제시된 모델보다 더 빠르고 정확한 예측이 가능하다. 특히, 한문장의 수행시간 예측시간은 배열의 크기와 독립적이기 때문에, 전체 프로그램의 수행시간 예측은 배열의 크기 및 문장의 반복 회수 배만큼 빠른 결과를 보여준다. 본 모델은 프로그램의 성능예측 뿐만 아니라 데이터 지역성의 최적화, 캐쉬 구성, 스케쥴링 등에서도 이용 가능하다.

  • PDF

Cache Algorithm in Reverse Connection Setup Protocol(CRCP) for effective Location Management in PCS Network (PCS 네트워크 상에서 효율적인 위치관리를 위한 역방향 호설정 캐쉬 알고리즘(CRCP)에 관한 연구)

  • Ahn, Yun-Shok;An, Seok;Bae, Yun-Jeong;Jo, Jea-Jun;Kim, Jae-Ha;Kim, Byung-Gi
    • Proceedings of the KIEE Conference
    • /
    • 1998.11b
    • /
    • pp.630-632
    • /
    • 1998
  • The basic user location strategies proposed in current PCS(Personal Communication Services) Network are two-level Database strategies. These Databases which exist in the Signalling network always maintain user's current location information, and it is used in call setup process to a mobile user. As the number of PCS users are increasing, this strategies yield some problem such as concentrating signalling traffic on the Database, increasing Call setup Delay, and so on. In this paper, we proposed RCP(Reverse Connection setup Protocol) model, which apply RVC(Reverse Virtual Call setup) algorithm to PCS reference model, and CRCP(Cache algorithm in RCP) model, which adopt Caching strategies in the RCP model. When Cache-miss occur, we found that CRCP model require less miss-penalty than PCS model. Also we show that proposed models are always likely to yield better performance in terms of reduced Location Tracking Delay time.

  • PDF

An Index Structure for Main-memory Storage Systems using The Level Pre-fetching

  • Lee, Seok-Jae;Yoon, Jong-Hyun;Song, Seok-Il;Yoo, Jae-Soo
    • International Journal of Contents
    • /
    • v.3 no.1
    • /
    • pp.19-23
    • /
    • 2007
  • Recently, several main-memory index structures have been proposed to reduce the impact of secondary cache misses. In mainmemory storage systems, secondary cache misses have a substantial effect on the performance of index structures. However, recent studies still stiffer from secondary cache misses when visiting each level of index tree. In this paper, we propose a new index structure that minimizes the total amount of cache miss latency. The proposed index structure prefetched grandchildren of a current node. The basic structure of the proposed index structure is based on that of the CSB+-Tree, which uses the concept of a node group to increase fan-out. However, the insert algorithm of the proposed index structure significantly reduces the cost of a split. The superiority of our algorithm is shown through performance evaluation.

A New trace-driven Simulation Algorithm for Sector Cache Memories with Various Block Sizes (다양한 블럭 크기를 갖는 섹터 캐시 메모리의 Trace-driven 시뮬레이션 알고리즘)

  • Dong Gue Park
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.32B no.6
    • /
    • pp.849-861
    • /
    • 1995
  • In this paper, a new trace driven simulation algorithm is proposed to evaluate the bus traffic and the miss ration of the various sector cache memories, which have various sub-block sizes and block sizes and associativities and number of sets, with a single pass through an address trace. Trace-driven simulaton is usually used as a method for performance evaluation of sector cache memories, but it spends a lot of simulation time for simulating the diverse cache configurations with a long address trace. The proposed algorithm shortens the simulation time by evaluating the performance of the various sector cache configurations. which have various sub-block sizes and block sizes and associativities and number of sets , with a single pass through an address trace. Our simulation results show that the run times of the proposed simulation algorithm can be considerably reduced than those of existing simulation algorithms, when the proposed algorithm is miplemented in C language and the address traces obtained from the various sample programs are used as a input of trace-driven simulation.

  • PDF

Performance and Energy Optimization for Low-Write Performance Non-volatile Main Memory Systems (낮은 쓰기 성능을 갖는 비휘발성 메인 메모리 시스템을 위한 성능 및 에너지 최적화 기법)

  • Jung, Woo-Soon;Lee, Hyung-Gyu
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.13 no.5
    • /
    • pp.245-252
    • /
    • 2018
  • Non-volatile RAM devices have been increasingly viewed as an alternative of DRAM main memory system. However some technologies including phase-change memory (PCM) are still suffering from relatively poor write performance as well as limited endurance. In this paper, we introduce a proactive last-level cache management to efficiently hide a low write performance of non-volatile main memory systems. The proposed method significantly reduces the cache miss penalty by proactively evicting the part of cachelines when the non-volatile main memory system is in idle state. Our trace-driven simulation demonstrates 24% performance enhancement, compared with a conventional LRU cache management, on the average.