• Title/Summary/Keyword: cache performance

An Index Structure for Main-memory Storage Systems using The Level Pre-fetching

  • Lee, Seok-Jae;Yoon, Jong-Hyun;Song, Seok-Il;Yoo, Jae-Soo
    • International Journal of Contents / v.3 no.1 / pp.19-23 / 2007
  • Recently, several main-memory index structures have been proposed to reduce the impact of secondary cache misses. In main-memory storage systems, secondary cache misses have a substantial effect on the performance of index structures. However, recently proposed structures still suffer from secondary cache misses when visiting each level of the index tree. In this paper, we propose a new index structure that minimizes the total cache miss latency. The proposed index structure prefetches the grandchildren of the current node. Its basic structure is based on that of the CSB+-Tree, which uses the concept of a node group to increase fan-out. However, the insert algorithm of the proposed index structure significantly reduces the cost of a split. The superiority of our algorithm is shown through performance evaluation.

A New trace-driven Simulation Algorithm for Sector Cache Memories with Various Block Sizes (다양한 블럭 크기를 갖는 섹터 캐시 메모리의 Trace-driven 시뮬레이션 알고리즘)

  • Dong Gue Park
    • Journal of the Korean Institute of Telematics and Electronics B / v.32B no.6 / pp.849-861 / 1995
  • In this paper, a new trace-driven simulation algorithm is proposed to evaluate the bus traffic and the miss ratio of various sector cache memories, which differ in sub-block size, block size, associativity, and number of sets, with a single pass through an address trace. Trace-driven simulation is commonly used to evaluate the performance of sector cache memories, but simulating diverse cache configurations with a long address trace takes a great deal of simulation time. The proposed algorithm shortens the simulation time by evaluating all of these sector cache configurations in a single pass through the address trace. Our simulation results show that the run time of the proposed algorithm is considerably shorter than that of existing simulation algorithms when the proposed algorithm is implemented in C and address traces obtained from various sample programs are used as the input to the trace-driven simulation.
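
The idea of evaluating several cache configurations in one pass over a trace can be illustrated with a minimal sketch. It does not reproduce the paper's algorithm: it simply updates one independent cache model per configuration for every address, omits sub-block (sector) handling and bus-traffic accounting, and assumes LRU replacement. The names `LRUSetCache` and `simulate_single_pass` are hypothetical.

```python
from collections import OrderedDict


class LRUSetCache:
    """Set-associative cache model with LRU replacement (illustrative only)."""

    def __init__(self, num_sets, ways, block_size):
        self.num_sets = num_sets
        self.ways = ways
        self.block_size = block_size
        # One ordered dict per set; insertion order tracks recency.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.accesses = 0
        self.misses = 0

    def access(self, address):
        """Record one reference; return True on a hit, False on a miss."""
        block = address // self.block_size
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]
        self.accesses += 1
        if tag in lines:
            lines.move_to_end(tag)  # mark as most recently used
            return True
        self.misses += 1
        if len(lines) >= self.ways:
            lines.popitem(last=False)  # evict the least recently used line
        lines[tag] = True
        return False

    def miss_ratio(self):
        return self.misses / self.accesses if self.accesses else 0.0


def simulate_single_pass(trace, configs):
    """Read the trace once, updating every configuration per address.

    Each config is a (num_sets, ways, block_size) tuple.
    """
    caches = {cfg: LRUSetCache(*cfg) for cfg in configs}
    for address in trace:
        for cache in caches.values():
            cache.access(address)
    return {cfg: cache.miss_ratio() for cfg, cache in caches.items()}
```

The trace is read only once, which matters when it is long or stored on slow media. The work per address still grows with the number of configurations in this naive version.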

Improving Performance of Internet by Using Hierarchical Proxy Cache (계층적 프록시 캐쉬를 이용한 인터넷 성능 향상 기법)

  • 이효일;김종현
    • Journal of the Korea Society for Simulation / v.9 no.2 / pp.1-14 / 2000
  • Recently, as information infrastructure including high-speed communication networks has expanded remarkably, a wider variety of information services has been provided. As a result, the number of Internet users is increasing rapidly, which places a heavy load on Web servers and increases network traffic. These phenomena lengthen response times, which means worse quality of service. To solve such problems, much effort has been made to relieve the bottleneck at Web servers, reduce network traffic, and shorten response times by caching frequently accessed information at a proxy server located near the clients. Internet performance can be improved further by allowing clients to share the information stored in proxy caches. In this paper, we simulate hierarchical proxy caches with a 3-level 4-ary tree structure using real web traces, and analyze the cache hit ratio for various cache replacement policies and cache sizes when the delayed-store scheme is applied. According to the simulation results, the delayed-store scheme increases the remote cache hit ratio, which improves quality of service by shortening the service response time.
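
A 3-level 4-ary proxy hierarchy can be sketched as follows. This is only an illustration under stated assumptions: every node uses LRU replacement, an object is stored at each cache it passes through on the way back to the client, and the delayed-store scheme from the abstract is not modeled because the abstract does not define it. `ProxyCache` and `build_tree` are hypothetical names.

```python
from collections import OrderedDict


class ProxyCache:
    """One proxy in the hierarchy, with LRU replacement (an assumption)."""

    def __init__(self, capacity, parent=None):
        self.capacity = capacity
        self.parent = parent
        self.store = OrderedDict()

    def get(self, url):
        """Return how many levels up the object was found.

        0 means a local hit, 1 a hit at the parent, and so on.
        None means the object had to be fetched from the origin server.
        """
        if url in self.store:
            self.store.move_to_end(url)  # refresh recency
            return 0
        if self.parent is None:
            hops = None  # the root missed: fetch from the origin server
        else:
            up = self.parent.get(url)
            hops = None if up is None else up + 1
        self._insert(url)
        return hops

    def _insert(self, url):
        if len(self.store) >= self.capacity:
            self.store.popitem(last=False)  # evict the least recently used
        self.store[url] = True


def build_tree(levels, fanout, capacity):
    """Build the hierarchy; return the root and the list of leaf proxies."""
    root = ProxyCache(capacity)
    layer = [root]
    for _ in range(levels - 1):
        layer = [ProxyCache(capacity, parent=p) for p in layer for _ in range(fanout)]
    return root, layer
```

With `build_tree(3, 4, capacity)` the hierarchy has one root, 4 intermediate proxies, and 16 leaf proxies. A non-zero return value from `get` corresponds to a remote hit, the quantity the abstract reports as the remote cache hit ratio.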

A New Hybrid Architecture for Cooperative Web Caching

  • Baek, Jin-Suk;Kaur, Gurpreet;Yang, Jung-Hoon
    • Journal of Ubiquitous Convergence Technology / v.2 no.1 / pp.1-11 / 2008
  • An effective solution to the problems caused by the explosive growth of the World Wide Web is web caching, which employs an additional server, called a proxy cache, between the clients and the main server to cache popular web objects near the clients. However, a single proxy cache can easily become a bottleneck. Deploying groups of cooperative caches provides scalability and robustness by eliminating the limitations of a single proxy cache. Two common architectures for cooperative caching are hierarchical and distributed caching systems. Unfortunately, both architectures suffer from performance limitations. We propose an efficient hybrid caching architecture that eliminates these limitations by using both hierarchical and same-level caches. Our simulation-based performance evaluation shows that the proposed architecture offers the best of both existing architectures in terms of cache hit rate, the number of query messages from clients, and response time.

A Register-Based Caching Technique for the Advanced Performance of Multithreaded Models (다중스레드 모델의 성능 향상을 위한 가용 레지스터 기반 캐슁 기법)

  • Go, Hun-Jun;Gwon, Yeong-Pil;Yu, Won-Hui
    • The KIPS Transactions:PartA / v.8A no.2 / pp.107-116 / 2001
  • A multithreaded model is a hybrid that combines the locality of execution of the von Neumann model with the asynchronous data availability and implicit parallelism of the dataflow model. Much of the research on improving the performance of multithreaded models concerns cache memory, which has proved efficient in the von Neumann model. To use an instruction cache or operand cache, a multithreaded model must have cache memories. Adding cache memories to the multithreaded model, however, carries the disadvantage of high implementation cost. To solve this problem, we did not add cache memory but instead performed caching using the available registers of the multithreaded model. The available-register-based caching method uses the registers that are not used during the execution of threads, and it can achieve the same effect as a cache memory. A multithreaded model can compute the number of available registers during register optimization, so the method can be applied to such models easily. Applying this method also removes the access conflicts and the bottleneck of frame memories. When we applied the proposed available-register-based caching method, the performance of the multithreaded model improved. Also, when the available-register-based caching method was compared with the cache-based caching method, the execution overhead was almost the same.

Performance Analysis of Flash Memory SSD with Non-volatile Cache for Log Storage (비휘발성 캐시를 사용하는 플래시 메모리 SSD의 데이터베이스 로깅 성능 분석)

  • Hong, Dae-Yong;Oh, Gi-Hwan;Kang, Woon-Hak;Lee, Sang-Won
    • Journal of KIISE / v.42 no.1 / pp.107-113 / 2015
  • In a database system, updates to pages made by a transaction must be stored in secondary storage before the commit completes. Generic secondary storage devices have volatile DRAM caches to hide the long latency of the non-volatile media. However, because logs written only to the volatile DRAM cache do not ensure durability, logging latency cannot be hidden. Recently, a flash SSD with a capacitor-backed DRAM cache was developed to overcome this shortcoming. Storage devices with such a non-volatile cache increase transaction throughput because transactions can commit as soon as the logs reach the cache. In this paper, we analyze the transaction throughput obtained when an SSD with a capacitor-backed DRAM cache is used as log storage. Transaction throughput improved more than threefold when transactions committed right after the logs were stored in the DRAM cache rather than on the secondary storage medium. We also show that, with proper tuning, the device achieves over 73% of the ideal logging performance.

The Early Write Back Scheme For Write-Back Cache (라이트 백 캐쉬를 위한 빠른 라이트 백 기법)

  • Chung, Young-Jin;Lee, Kil-Whan;Lee, Yong-Surk
    • Journal of the Institute of Electronics Engineers of Korea SD / v.46 no.11 / pp.101-109 / 2009
  • Generally, the depth cache and pixel cache of 3D graphics hardware are designed with a write-back scheme to use memory bandwidth efficiently. In a 3D graphics cache, write-after-read operations on the same address and write-only operations occur frequently. If a cache miss is detected, an access to the external memory for the write-back operation and another access to the memory for handling the cache miss are performed simultaneously. Under frequent cache misses, because the memory access bandwidth is limited, the access time of the external memory increases due to the memory bottleneck. As a result, the overall performance of the processor or the IP decreases, and peak power consumption increases. In this paper, we propose a novel early write-back cache architecture to solve these problems. The proposed architecture controls the point at which the external memory is accessed to copy the valid data block. The architecture improves cache performance with the same hit ratio and the same cache capacity, and it relieves the memory bottleneck by preventing bursts of memory accesses. We evaluated the proposed architecture on a 3D graphics z cache and pixel cache in an SoC environment in which an ARM11, a 3D graphics accelerator, and various IPs are embedded. The simulation results show a performance increase of up to 75% across various simulation vectors.

Data Cache System based on the Selective Bank Algorithm for Embedded System (내장형 시스템을 위한 선택적 뱅크 알고리즘을 이용한 데이터 캐쉬 시스템)

  • Jung, Bo-Sung;Lee, Jung-Hoon
    • The KIPS Transactions:PartA / v.16A no.2 / pp.69-78 / 2009
  • One of the most effective ways to improve cache performance is to exploit both the temporal and spatial locality inherent in a program's execution characteristics. In this paper we present a high-performance, low-power cache structure with a bank selection mechanism that enhances the exploitation of spatial and temporal locality. The proposed cache system consists of two parts: a main direct-mapped cache with a small block size, and a fully associative buffer with a large block size that is a multiple of the small block size. In particular, the main direct-mapped cache is built as two banks for low power consumption, and it stores a small block selected from the fully associative buffer by the proposed bank selection algorithm. Using the bank selection algorithm and three state bits, we selectively extend the lifetime of small blocks with high temporal locality by storing them in the main direct-mapped cache. This approach reduces conflict misses and cache pollution at the same time. According to the simulation results on MiBench applications, the average miss ratio improves by about 23% and 32% compared with victim and STAS caches of the same size, respectively. The average memory access time is reduced by about 14% and 18% compared with the victim and STAS caches, respectively. The energy consumption of the proposed cache is also around 10% lower than that of the other cache systems we examined.

Determination of a Grain Size for Reducing Cache Miss Rate of Direct-Mapped Caches (직접 사상 캐쉬의 캐쉬 실패율을 감소시키기 위한 성김도 정책)

  • Jung, In-Bum;Kong, Ki-Sok;Lee, Joon-Won
    • Journal of KIISE:Computer Systems and Theory / v.27 no.7 / pp.665-674 / 2000
  • In data-parallel programs with high cache locality, the choice of grain size affects cache performance. Even when the chosen grain size provides fair load balance among processors, a grain size that ignores the underlying caching effects causes address interference between the grains allocated to a processor. This address interference harms cache locality because it results in cache conflict misses. To address this problem, we propose a best grain size derived from the cache size and the number of processors, based on the characteristics of direct-mapped caches. Because the proposed method does not map the grains to the same location in the cache, cache conflict misses are reduced. Simulation results show that the proposed best grain size substantially improves the performance of the tested data-parallel programs by reducing cache misses on direct-mapped caches.
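
The interference described here can be made concrete with a small sketch. It does not implement the paper's grain-size formula, which the abstract does not give. It assumes a contiguous array split into equal grains that are dealt to processors round-robin, so a processor owning grain `g` also owns grain `g + num_procs`. The function names are hypothetical.

```python
def lines_touched(start, size, cache_size, line_size):
    """Direct-mapped cache line indices covered by bytes [start, start + size)."""
    num_lines = cache_size // line_size
    first_block = start // line_size
    last_block = (start + size - 1) // line_size
    return {block % num_lines for block in range(first_block, last_block + 1)}


def conflicting_lines(grain_size, num_procs, cache_size, line_size, proc=0):
    """Count cache lines shared by two consecutive grains of one processor.

    Assumes round-robin grain assignment: processor `proc` owns grains
    proc and proc + num_procs.
    """
    first = lines_touched(proc * grain_size, grain_size, cache_size, line_size)
    second = lines_touched((proc + num_procs) * grain_size, grain_size,
                           cache_size, line_size)
    return len(first & second)
```

With an 8 KB cache, 32-byte lines, and 4 processors, a 2048-byte grain places a processor's consecutive grains exactly one cache size apart, so `conflicting_lines(2048, 4, 8192, 32)` reports that all 64 lines collide. A 1536-byte grain gives `conflicting_lines(1536, 4, 8192, 32) == 0`.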

Design of an Asynchronous Data Cache with FIFO Buffer for Write Back Mode (Write Back 모드용 FIFO 버퍼 기능을 갖는 비동기식 데이터 캐시)

  • Park, Jong-Min;Kim, Seok-Man;Oh, Myeong-Hoon;Cho, Kyoung-Rok
    • The Journal of the Korea Contents Association / v.10 no.6 / pp.72-79 / 2010
  • In this paper, we propose a data cache architecture with a write buffer for a 32-bit asynchronous embedded processor. The data cache consists of a CAM and a data memory. It accelerates the data upload cycle between the processor and the main memory, which improves processor performance. The proposed data cache has 8 KB of cache memory. It uses 4-way set-associative mapping with a line size of 4 words (16 bytes) and a pseudo-LRU replacement algorithm. A dirty register and a write buffer implement the cache's write policy. The designed data cache is synthesized to a gate-level design using a $0.13-{\mu}m$ process. Its average hit rate is 94%, and system performance improves by 46.53%. The proposed data cache with a write buffer is well suited to a 32-bit asynchronous processor.
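
The stated geometry (8 KB, 4-way, 16-byte lines) fixes the number of sets and the address field widths, which the sketch below works out. It assumes 8 KB means 8192 bytes, that addresses are 32 bits wide, and that the cache is indexed in the conventional tag/index/offset fashion. The abstract states none of these explicitly, and the function names are hypothetical.

```python
def cache_geometry(cache_bytes, ways, line_bytes, address_bits=32):
    """Return (num_sets, offset_bits, index_bits, tag_bits).

    Assumes power-of-two sizes and a conventional tag/index/offset split.
    """
    num_sets = cache_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1  # log2 of the line size
    index_bits = num_sets.bit_length() - 1     # log2 of the set count
    tag_bits = address_bits - index_bits - offset_bits
    return num_sets, offset_bits, index_bits, tag_bits


def split_address(address, offset_bits, index_bits):
    """Split an address into its (tag, set index, byte offset) fields."""
    offset = address & ((1 << offset_bits) - 1)
    index = (address >> offset_bits) & ((1 << index_bits) - 1)
    tag = address >> (offset_bits + index_bits)
    return tag, index, offset
```

Under these assumptions, `cache_geometry(8 * 1024, 4, 16)` gives 128 sets, a 4-bit byte offset, a 7-bit set index, and a 21-bit tag.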