• Title/Summary/Keyword: Trace cache


A New trace-driven Simulation Algorithm for Sector Cache Memories with Various Block Sizes (다양한 블럭 크기를 갖는 섹터 캐시 메모리의 Trace-driven 시뮬레이션 알고리즘)

  • Dong Gue Park
    • Journal of the Korean Institute of Telematics and Electronics B / v.32B no.6 / pp.849-861 / 1995
  • In this paper, a new trace-driven simulation algorithm is proposed to evaluate the bus traffic and the miss ratio of various sector cache memories, which differ in sub-block size, block size, associativity, and number of sets, with a single pass through an address trace. Trace-driven simulation is commonly used to evaluate the performance of sector cache memories, but simulating diverse cache configurations with a long address trace takes a great deal of time. The proposed algorithm shortens the simulation time by evaluating all of these sector cache configurations in a single pass through the address trace. Our simulation results show that the run time of the proposed algorithm is considerably shorter than that of existing simulation algorithms when the proposed algorithm is implemented in C and address traces obtained from various sample programs are used as input to the trace-driven simulation.

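The idea of evaluating several sector cache configurations during one pass over an address trace can be sketched in Python as follows. This is not the algorithm of the paper, whose details the abstract does not give: the sketch simply feeds each address to every configuration in turn. The names `SectorCache` and `simulate_single_pass`, the use of LRU replacement, and the choice to count one sub-block transfer per miss as bus traffic are all assumptions made for illustration.

```python
from collections import OrderedDict


class SectorCache:
    """Set-associative sector cache with LRU replacement (illustrative only)."""

    def __init__(self, num_sets, ways, block_size, sub_block_size):
        assert block_size % sub_block_size == 0
        self.num_sets = num_sets
        self.ways = ways
        self.block_size = block_size
        self.sub_block_size = sub_block_size
        # One OrderedDict per set: tag -> set of valid sub-block indices.
        # The ordering of each dict tracks recency (oldest entry first).
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.accesses = 0
        self.misses = 0
        self.bytes_fetched = 0  # assumed proxy for bus traffic

    def access(self, addr):
        """Simulate one reference; return True on a hit."""
        self.accesses += 1
        block = addr // self.block_size
        sub = (addr % self.block_size) // self.sub_block_size
        lines = self.sets[block % self.num_sets]
        tag = block // self.num_sets
        valid = lines.get(tag)
        if valid is not None and sub in valid:
            lines.move_to_end(tag)  # mark as most recently used
            return True
        # Miss: only the referenced sub-block is fetched.
        self.misses += 1
        self.bytes_fetched += self.sub_block_size
        if valid is None:
            if len(lines) >= self.ways:
                lines.popitem(last=False)  # evict the least recently used block
            valid = lines[tag] = set()
        else:
            lines.move_to_end(tag)
        valid.add(sub)
        return False


def simulate_single_pass(trace, configs):
    """Read the trace once and update every configuration per address."""
    caches = [SectorCache(*cfg) for cfg in configs]
    for addr in trace:
        for cache in caches:
            cache.access(addr)
    return caches
```

Because `trace` is consumed only once, it can be a generator that streams addresses from a large trace file, for example `simulate_single_pass(addresses, [(64, 2, 32, 8), (64, 4, 64, 16)])`, where each tuple is (sets, ways, block size, sub-block size) in bytes.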

Low Power Trace Cache for Embedded Processor

  • Moon Je-Gil;Jeong Ha-Young;Lee Yong-Surk
    • Proceedings of the IEEK Conference / summer / pp.204-208 / 2004
  • The embedded market will keep expanding as customers seek more wearable and ubiquitous systems. Cellular telephones, PDAs, notebooks, and portable multimedia devices could bring higher microprocessor revenues and more rewarding improvements in performance and functions. Battery capacity, however, is still improving only slowly. Until a small, practical fuel cell becomes available, microprocessor developers must come up with power-reduction methods. According to MPR 2003, the instruction and data caches of the ARM920T processor consume 44% of total processor power. The rest is split among the integer core, memory management units, bus interface unit, and other essential CPU circuitry. The relationships among the CPU, peripherals, and caches may change in the future: a processor running at a higher operating frequency will require a larger cache RAM and consume more energy. In this paper, we propose an advanced low-power trace cache, which caches traces of the dynamic instruction stream and reduces cache access times. We evaluate the performance of the trace cache and estimate its power consumption in comparison with a conventional cache.


Dynamic Cache Partitioning Strategy for Efficient Buffer Cache Management (효율적인 버퍼 캐시 관리를 위한 동적 캐시 분할 블록교체 기법)

  • 진재선;허의남;추현승
    • Journal of the Korea Society for Simulation / v.12 no.2 / pp.35-44 / 2003
  • The effectiveness of buffer cache replacement algorithms is critical to the performance of I/O systems. In this paper, we propose a block replacement scheme based on the degree of inter-reference gap (DIG). It retains the merits of least recently used (LRU) replacement, such as simple implementation and a good cache hit ratio (CHR) for general reference patterns, and improves the CHR further. In the proposed scheme, cache blocks with low DIGs are distinguished from blocks with high DIGs, and the replacement block is selected from among the high-DIG blocks, as in the low inter-reference recency set (LIRS) scheme. The cache memory is thus in effect partitioned dynamically based on DIGs, which improves the CHR. Trace-driven simulation is employed to verify the superiority of the DIG-based scheme; it shows that performance improves by up to about 175% compared to the LRU scheme and by 3% compared to the LIRS scheme for the same traces.

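For orientation, the LRU baseline against which such schemes are compared can be sketched as a small trace-driven simulation in Python. The function name `lru_hit_ratio` and the representation of the trace as a sequence of block identifiers are assumptions for illustration; the DIG scheme itself is not reproduced here, since the abstract does not specify it in enough detail.

```python
from collections import OrderedDict


def lru_hit_ratio(trace, capacity):
    """Replay a block reference trace through an LRU cache; return the hit ratio."""
    cache = OrderedDict()  # block id -> True, ordered from oldest to newest use
    hits = 0
    total = 0
    for block in trace:
        total += 1
        if block in cache:
            hits += 1
            cache.move_to_end(block)  # refresh recency on a hit
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = True
    return hits / total if total else 0.0
```

A cyclic reference pattern that is slightly larger than the cache, such as `[1, 2, 3, 1, 2, 3]` with a capacity of 2, never hits under LRU. Weaknesses of this kind are what recency-gap-based schemes such as LIRS aim to address.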

CPC: A File I/O Cache Management Policy for Compute-Bound Workloads

  • Bahn, Hyokyung
    • International journal of advanced smart convergence / v.11 no.2 / pp.1-6 / 2022
  • With the emergence of the 4th industrial revolution, compute-bound workloads with large memory footprints, such as big data processing, are increasing dramatically. Even in such compute-bound workloads, however, we observe bulky I/Os while big data is loaded from storage into memory. Although the file I/O cache accelerates storage I/O, we found that the cache hit rate in such environments does not improve even when the file I/O cache capacity is increased, because of some special I/O references generated by compute-bound workloads. To cope with this situation, we propose a new file I/O cache management policy that significantly improves the cache hit rate for compute-bound workloads. Trace-driven simulations that replay file I/O reference logs of compute-bound workloads show that the proposed cache management policy improves the cache hit rate by a large margin compared to the well-acknowledged CLOCK algorithm.
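The CLOCK baseline mentioned in this abstract can be sketched in Python as follows. This is the textbook second-chance policy rather than the proposed CPC policy, which the abstract does not describe. The function name `clock_hit_ratio` is hypothetical, and setting the reference bit when a block is first loaded is one common variant, chosen here as an assumption.

```python
def clock_hit_ratio(trace, capacity):
    """Replay a block reference trace through a CLOCK cache; return the hit ratio."""
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    frames = [None] * capacity  # block held in each frame
    ref = [False] * capacity    # per-frame reference bit
    where = {}                  # block id -> frame index
    hand = 0                    # clock hand position
    hits = 0
    total = 0
    for block in trace:
        total += 1
        if block in where:
            hits += 1
            ref[where[block]] = True  # a hit only sets the reference bit
            continue
        # Miss: sweep the hand, clearing reference bits, until a frame
        # whose bit is already clear is found (empty frames qualify).
        while ref[hand]:
            ref[hand] = False
            hand = (hand + 1) % capacity
        victim = frames[hand]
        if victim is not None:
            del where[victim]
        frames[hand] = block
        ref[hand] = True  # assumption: a newly loaded block counts as referenced
        where[block] = hand
        hand = (hand + 1) % capacity
    return hits / total if total else 0.0
```

Replaying the same trace at several capacities, for example `[clock_hit_ratio(trace, c) for c in (64, 128, 256)]` with a materialized list `trace`, is one way to check whether the hit rate actually responds to a larger cache.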

Performance Analysis of n-way Associative Cache and Fully Associative Cache (n-way Set Associative Cache와 Fully Associative Cache성능 분석)

  • Jo, Yong-Hun;Kim, Jeong-Seon
    • The Transactions of the Korea Information Processing Society / v.4 no.3 / pp.802-810 / 1997
  • In this paper, the performance of direct-mapped caches, 2-, 4-, 8-, ..., 4096-way set associative caches, and fully associative caches is analyzed by trace simulation to verify their effectiveness. In general, it is well known that as n, the number of main memory lines that can be stored under one cache line number of a direct-mapped cache, increases, the performance of the cache memory should rise linearly. According to our analysis, however, this does not hold for all cache organizations. It is shown that as n increases, miss ratios fall only when a small cache (less than 256K) with a large line size is used. It is also shown that fully associative mapping achieves high performance only when a small cache with a large line size is used.


Performance and Energy Optimization for Low-Write Performance Non-volatile Main Memory Systems (낮은 쓰기 성능을 갖는 비휘발성 메인 메모리 시스템을 위한 성능 및 에너지 최적화 기법)

  • Jung, Woo-Soon;Lee, Hyung-Gyu
    • IEMEK Journal of Embedded Systems and Applications / v.13 no.5 / pp.245-252 / 2018
  • Non-volatile RAM devices are increasingly viewed as an alternative to DRAM for main memory systems. However, some technologies, including phase-change memory (PCM), still suffer from relatively poor write performance as well as limited endurance. In this paper, we introduce a proactive last-level cache management method that efficiently hides the low write performance of non-volatile main memory systems. The proposed method significantly reduces the cache miss penalty by proactively evicting some of the cache lines while the non-volatile main memory system is idle. Our trace-driven simulation demonstrates a 24% performance improvement on average compared with conventional LRU cache management.

Bitmap-based Prefix Caching for Fast IP Lookup

  • Kim, Jinsoo;Ko, Myeong-Cheol;Nam, Junghyun;Kim, Junghwan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.3 / pp.873-889 / 2014
  • IP address lookup is crucial to router performance. Several studies have applied prefix caching to speed up IP address lookup. Since a prefix represents a range of IP addresses, a prefix cache performs better than an IP address cache. However, not every prefix is cacheable as is. Caching a non-leaf prefix can cause a false hit, because a longer matching prefix may exist in the routing table. Prefix expansion techniques such as complete prefix tree expansion (CPTE) make it possible to cache non-leaf prefixes in expanded form, but the expanded prefixes are hard to manage and sometimes incur a great deal of update overhead in the routing table. We propose a bitmap-based prefix cache (BMCache) that provides low update overhead as well as a low cache miss ratio. The proposed scheme does not keep any expanded prefixes in the routing table, but it can expand a non-leaf prefix using a bitmap at caching time. Trace-driven simulation shows that BMCache achieves a very low miss ratio despite its low update overhead compared to other schemes.
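The false-hit problem with non-leaf prefixes can be illustrated with a short Python sketch built on the standard `ipaddress` module. The two-entry routing table, the next-hop labels "A" and "B", and the helper names `longest_match` and `cached_lookup` are hypothetical. The sketch shows a naive prefix cache that stores prefixes unmodified; it does not implement BMCache or CPTE.

```python
import ipaddress


def longest_match(table, addr):
    """Return (network, next_hop) for the longest prefix containing addr, or None."""
    ip = ipaddress.ip_address(addr)
    best = None
    for prefix, hop in table.items():
        net = ipaddress.ip_network(prefix)
        if ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, hop)
    return best


def cached_lookup(cache, table, addr):
    """Naive prefix cache: return (next_hop, was_cache_hit)."""
    hit = longest_match(cache, addr)
    if hit is not None:
        return hit[1], True
    match = longest_match(table, addr)
    if match is None:
        return None, False  # no route for this address
    net, hop = match
    cache[str(net)] = hop  # caches the matched prefix as is, even if non-leaf
    return hop, False


# Hypothetical routing table: 10.0.0.0/8 is a non-leaf prefix because the
# longer prefix 10.1.0.0/16 lies inside it.
table = {"10.0.0.0/8": "A", "10.1.0.0/16": "B"}
cache = {}
first = cached_lookup(cache, table, "10.2.3.4")   # miss; caches 10.0.0.0/8
second = cached_lookup(cache, table, "10.1.5.5")  # hits the cached /8 prefix
correct = longest_match(table, "10.1.5.5")[1]     # the routing table says "B"
```

Here `second` reports a cache hit with next hop "A" while the routing table gives "B" for the same address, which is the false hit that prefix expansion schemes are designed to prevent.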

The Effects of Cache Memory on the System Bus Traffic (캐쉬 메모리가 버스 트래픽에 끼치는 영향)

  • 조용훈;김정선
    • The Journal of Korean Institute of Communications and Information Sciences / v.21 no.1 / pp.224-240 / 1996
  • It is common for today's computer systems to use one or more levels of cache memory. In this paper, the impact of the internal cache memory organization on computer performance is investigated using a simulator program, written by the authors and run on a SUN SPARC workstation, with several real execution trace files. 280 cache organizations were simulated using n-way set associative mapping and the LRU (Least Recently Used) replacement algorithm with a write allocation policy. As a result, a 16-way set associative cache is the best configuration, and with a 256KB cache memory and a 64-byte line size, the bus traffic ratio decreased compared to that of the non-cache system, so that a single bus could support almost 7 processors without any delay or degradation of the hit ratio (the hit ratio was 99.21%). Choosing a smaller line size gives a slightly lower hit ratio, but more processors can be supported by a single bus (at most 18 processors). Therefore, a proper cache memory organization enables a single bus structure to support multiple processors without any performance degradation.


An Efficient Caching Algorithm to Minimize Duplicated Disk Blocks in 2-level Disk Cache System (2-레벨 디스크 캐쉬 시스템에서 디스크 블록 중복 저장을 최소화하는 효율적인 캐싱 알고리즘)

  • 류갑상;정수목
    • Journal of the Korea Computer Industry Society / v.5 no.1 / pp.57-64 / 2004
  • The speed gap between processors and disks is a serious problem, so the I/O subsystem limits the performance of a computer system. To overcome the speed gap, caches have been used in computer systems. By using a cache, the access time to disk blocks can be reduced and the performance of the computer system can be improved. In this paper, we propose an efficient cache management algorithm for computer systems that have both a buffer cache and a disk cache. The proposed algorithm minimizes the blocks duplicated between the buffer cache and the disk cache. We evaluate the proposed algorithm by trace-driven simulation. The simulation results show that the proposed algorithm reduces the mean access time to disk blocks.


Performance Improvement of Operand Fetching with the Operand Reference Prediction Cache(ORPC) (오퍼랜드 참조 예측 캐쉬(ORPC)를 활용한 오퍼랜드 페치의 성능 개선)

  • Kim, Heung-Jun;Cho, Kyung-San
    • The Transactions of the Korea Information Processing Society / v.5 no.6 / pp.1652-1659 / 1998
  • To provide performance gains by reducing the operand referencing latency and data cache bandwidth requirements, we present an operand reference prediction cache (ORPC), which predicts the operand value and address translation during the instruction fetch stage. The prediction is verified at an early stage, which minimizes the performance penalty caused by misprediction. Through trace-driven simulation of six benchmark programs, the performance improvement of the three proposed ORPC structures (ORPC1, ORPC2, ORPC3) is analyzed and validated.
