• Title/Abstract/Keyword: Cache Memory


Cache Architecture Design for the Performance Improvement of OpenRISC Core (OpenRISC 코어의 성능향상을 위한 캐쉬 구조 설계)

  • Jung, Hong-Kyun;Ryoo, Kwang-Ki
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.46 no.1
    • /
    • pp.68-75
    • /
    • 2009
  • As microprocessor performance improves rapidly, caches become increasingly necessary because main-memory access time is growing relatively longer. In a direct-mapped cache, each memory block maps to exactly one cache line. Although the mapping rule is simple, when different blocks map to the same line, the miss ratio is higher than in a set-associative cache due to conflicts. In this paper, a 4-way set-associative cache is proposed to improve the direct-mapped cache of OpenRISC. In the proposed cache, four blocks of main memory map to one cache set, so the miss ratio is lower than in the direct-mapped cache. A pseudo-LRU policy is used as the line-replacement policy to reduce the number of bits needed to store LRU state. The OpenRISC core including the 4-way set-associative cache was verified by FPGA emulation. Performance measurements with a test program show that the core with the 4-way set-associative cache outperforms the original by 50%, with a miss-ratio reduction of more than 15%.
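
The pseudo-LRU policy mentioned above is commonly implemented as a binary decision tree needing only 3 bits per 4-way set, versus the 5 or more bits a true LRU ordering requires. A minimal C sketch of that standard tree-PLRU scheme (my own illustration, not the paper's RTL):

```c
#include <stdint.h>
#include <stdio.h>

/* Tree pseudo-LRU state for one 4-way set: bit0 picks the pair,
 * bit1/bit2 pick a way inside the left/right pair. A bit of 0 means
 * "the victim lies to the left", 1 means "to the right". */
typedef struct { uint8_t bits; } plru4_t;

/* Follow the tree bits to the way that should be evicted next. */
static int plru4_victim(const plru4_t *s)
{
    if (((s->bits >> 0) & 1) == 0)                 /* left pair */
        return ((s->bits >> 1) & 1) ? 1 : 0;
    else                                           /* right pair */
        return ((s->bits >> 2) & 1) ? 3 : 2;
}

/* On an access to way w, flip the bits on w's path to point away from it. */
static void plru4_touch(plru4_t *s, int w)
{
    if (w < 2) {
        s->bits |= 1u;                             /* next victim: right pair */
        s->bits = (uint8_t)((s->bits & ~2u) | (w == 0 ? 2u : 0u));
    } else {
        s->bits &= (uint8_t)~1u;                   /* next victim: left pair */
        s->bits = (uint8_t)((s->bits & ~4u) | (w == 2 ? 4u : 0u));
    }
}

int main(void)
{
    plru4_t s = { 0 };
    for (int w = 0; w < 4; w++) {                  /* touch ways 0..3 in order */
        plru4_touch(&s, w);
        printf("touched way %d -> next victim: way %d\n", w, plru4_victim(&s));
    }
    return 0;                                      /* final victim is way 0 */
}
```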

A Register-Based Caching Technique for the Advanced Performance of Multithreaded Models (다중스레드 모델의 성능 향상을 위한 가용 레지스터 기반 캐슁 기법)

  • Go, Hun-Jun;Gwon, Yeong-Pil;Yu, Won-Hui
    • The KIPS Transactions:PartA
    • /
    • v.8A no.2
    • /
    • pp.107-116
    • /
    • 2001
  • A multithreaded model is a hybrid that combines the execution locality of the von Neumann model with the asynchronous data availability and implicit parallelism of the dataflow model. Much of the research aimed at improving the performance of multithreaded models concerns cache memory, which has proven effective in the von Neumann model. To use an instruction cache or operand cache, a multithreaded model must include cache memories, but adding them carries the disadvantage of high implementation cost. To avoid this problem, we do not add cache memory; instead, we perform caching using the available registers of the multithreaded model. The available-register-based caching method uses the registers that are not occupied during thread execution, and it can achieve the same effect as a cache memory. Because a multithreaded model can compute the number of available registers during register optimization, the method is easy to apply to such models. It also removes access conflicts and the bottleneck at frame memories. Applying the proposed method improved the performance of the multithreaded model, and compared with cache-memory-based caching, it incurred almost the same execution overhead.
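
As a rough illustration of the idea of caching in otherwise-idle registers (the structure and names below are invented; the paper targets a dataflow-hybrid architecture, not C), one can picture the spare registers as a tiny fully-associative cache in front of frame memory:

```c
#include <stdint.h>

#define SPARE_REGS 8             /* registers the optimizer left unused */

static uint32_t frame_mem[1024]; /* stand-in for the slow frame memory */

typedef struct {
    int      valid;
    uint32_t addr;               /* frame-memory address mirrored here */
    uint32_t value;
} reg_slot_t;

static reg_slot_t slot[SPARE_REGS];
static unsigned   next_victim;   /* simple round-robin replacement */

/* Read an operand: serve it from a spare register when possible,
 * otherwise fetch from frame memory and keep a copy in a register. */
static uint32_t operand_read(uint32_t addr)
{
    for (int i = 0; i < SPARE_REGS; i++)
        if (slot[i].valid && slot[i].addr == addr)
            return slot[i].value;          /* register hit: no memory access */

    uint32_t v = frame_mem[addr % 1024];   /* miss: frame-memory access */
    reg_slot_t *s = &slot[next_victim];
    next_victim = (next_victim + 1) % SPARE_REGS;
    s->valid = 1; s->addr = addr; s->value = v;
    return v;
}

int main(void) { return (int)operand_read(42); }
```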


Preventing Fast Wear-out of Flash Cache with An Admission Control Policy

  • Lee, Eunji;Bahn, Hyokyung
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.15 no.5
    • /
    • pp.546-553
    • /
    • 2015
  • Recently, flash cache has been widely adopted as a performance accelerator for legacy storage systems. Unlike other cache media, flash cache must be managed carefully because of its peculiar characteristics, such as long write latency and limited P/E cycles. In particular, we make two prominent observations that can be exploited in managing flash cache. First, a serious wear-out problem occurs when the working set of a system exceeds the capacity of the flash cache, owing to excessively frequent cache replacement. Second, more than 50% of the data never gets a hit in the flash cache, as it is a second-level cache. Based on these observations, we propose a cache admission control policy that does not cache data when it is first accessed, inserting it into the cache only when a second access occurs within a certain time window. This allows the filtering of data that is disruptive to the flash cache in terms of endurance and performance. With this policy, we prolong the lifetime of the flash cache by 2.3 times without any performance degradation.
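
The admission policy described here is easy to express with a small "ghost" table that records first accesses without caching any data. A hypothetical sketch in C (the slot count, window length, and names are mine):

```c
#include <stdint.h>
#include <stdio.h>

#define GHOST_SLOTS 4096   /* first-access history only; holds no data */
#define WINDOW      1000   /* admission window, in logical time units */

typedef struct { int valid; uint64_t block, t_first; } ghost_t;
static ghost_t ghost[GHOST_SLOTS];

/* Should 'block' be admitted into the flash cache at time 'now'?
 * Only on a re-access within WINDOW of its recorded first access. */
static int admit(uint64_t block, uint64_t now)
{
    ghost_t *g = &ghost[block % GHOST_SLOTS];
    if (g->valid && g->block == block && now - g->t_first <= WINDOW) {
        g->valid = 0;              /* consumed: later accesses restart history */
        return 1;                  /* second access inside the window: cache it */
    }
    g->valid = 1;                  /* record (or refresh) the first access */
    g->block = block;
    g->t_first = now;
    return 0;                      /* first or stale access: bypass the cache */
}

int main(void)
{
    printf("%d %d %d\n", admit(7, 10), admit(7, 500), admit(7, 600));
    return 0;                      /* prints: 0 1 0 */
}
```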

BIM Geometry Cache Structure for Data Streaming with Large Volume (대용량 BIM 형상 데이터 스트리밍을 위한 캐쉬 구조)

  • Kang, Tae-Wook
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.18 no.9
    • /
    • pp.1-8
    • /
    • 2017
  • The purpose of this study is to propose a cache structure for processing large-volume building information modeling (BIM) geometry data in situations where sufficient physical memory cannot be allocated. As the number of BIM orders in the public sector has increased, visualizing and computing large-volume BIM geometry data has become more common. Downloading large-volume BIM data over the network can make design and review collaboration very time-consuming, and if the BIM data exceeds the free physical memory, visualization and geometry computation become impossible. To utilize large amounts of BIM data on insufficient physical memory or over a low-bandwidth network, it is advantageous to cache only the data needed at rendering and geometry-calculation time. This study proposes a cache structure for efficiently rendering and computing large-volume BIM geometry data where sufficient physical memory is difficult to allocate.
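
A cache of this kind usually boils down to keeping decoded meshes under a fixed byte budget and evicting by recency. A simplified sketch of the eviction side (flat table and field names are mine; the paper's structure is more elaborate):

```c
#include <stdlib.h>

#define MAX_OBJS 256

typedef struct {
    int           valid;
    unsigned      id;          /* BIM object id */
    size_t        bytes;       /* size of the decoded mesh */
    void         *mesh;
    unsigned long last_use;    /* recency stamp */
} geo_t;

typedef struct {
    geo_t         e[MAX_OBJS];
    size_t        used, budget;   /* cached bytes vs. physical-memory budget */
    unsigned long clock;
} geocache_t;

/* Evict least-recently-used meshes until 'need' more bytes fit. */
static void make_room(geocache_t *c, size_t need)
{
    while (c->used + need > c->budget) {
        int lru = -1;
        for (int i = 0; i < MAX_OBJS; i++)
            if (c->e[i].valid && (lru < 0 || c->e[i].last_use < c->e[lru].last_use))
                lru = i;
        if (lru < 0) return;              /* cache already empty */
        free(c->e[lru].mesh);             /* drop the coldest mesh */
        c->used -= c->e[lru].bytes;
        c->e[lru].valid = 0;
    }
}

int main(void)
{
    geocache_t c = { .budget = 64u * 1024 * 1024 };  /* e.g. 64 MiB budget */
    make_room(&c, 1024);                             /* nothing to evict yet */
    return 0;
}
```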

lpCSB+-tree: An Enhanced Main Memory Index Structure Employing the Level Prefetching Technique (lpCSB+-트리: 레벨 프리페칭 기법을 이용하는 향상된 주기억장치 상주형 색인구조)

  • Hong Hyun Taek;Pee Jun Il;Song Seok Il;Yoo Jae Soo
    • Journal of KIISE:Databases
    • /
    • v.31 no.6
    • /
    • pp.675-683
    • /
    • 2004
  • In main-memory-resident index structures, secondary cache misses have a considerable effect on performance. Recently, several cache-conscious main-memory index structures have been proposed to reduce the impact of such misses. However, they still suffer a full cache miss whenever each level of the index tree is visited. In this paper, we propose a new index structure that eliminates cache misses even when visiting each level of the index tree: the proposed structure prefetches the grandchildren of the current node. Its basic structure derives from the CSB+-tree, which uses the concept of node groups to increase fan-out, but the insert algorithm of the proposed structure reduces the cost of a split significantly. We also show the superiority of our algorithm through various performance evaluations.
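
The mechanism is overlapping memory latency with in-node key search: because a CSB+-tree stores each node group contiguously, one pointer suffices to prefetch a whole next-level group. The compressed sketch below prefetches only the child group (the lpCSB+-tree goes one level further, to grandchildren, which needs extra bookkeeping not modeled here); `__builtin_prefetch` is the GCC/Clang intrinsic:

```c
#include <stddef.h>

#define KEYS 14                   /* keys per node, roughly one line-sized group */

typedef struct node {
    int          nkeys;
    int          key[KEYS];
    struct node *child_group;     /* children stored contiguously, CSB+ style */
} node_t;

/* Descend to the leaf covering 'key', hiding misses on the next level
 * behind the key comparisons done at the current level. */
node_t *descend(node_t *n, int key)
{
    while (n->child_group) {
        for (int i = 0; i <= n->nkeys; i++)        /* issue prefetches first */
            __builtin_prefetch(&n->child_group[i], 0 /* read */, 1);
        int i = 0;
        while (i < n->nkeys && key >= n->key[i])   /* then search this node */
            i++;
        n = &n->child_group[i];                    /* child is (ideally) cached */
    }
    return n;
}
```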

Reduction of Read Access Latency by Invalid Hint in Directory-Based Cache Coherence Scheme (디렉토리를 이용한 캐쉬 일관성 유지 기법에서 무효화 힌트를 이용한 읽기 접근 시간 감소)

  • Oh, Seung-Taek;Rhee, Yun-Seok;Maeng, Seung-Ryoul;Lee, Joon-Won
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.27 no.4
    • /
    • pp.408-415
    • /
    • 2000
  • Large-scale shared-memory multiprocessors suffer from long access latency to shared memory. The latency partially stems from a feature of directory-based cache coherence schemes that requires a shared-memory access to be serviced at the home node of the memory block. The home visit results in a traversal of three or more hops for a memory read access, and the traversal grows much longer as the system scales up. In this paper, we propose a new cache coherence scheme that reduces read-access latency. The proposed scheme exploits the idea of invalid hints: an invalid hint for a cache block records which node invalidated the block before, so a read request can be sent directly to and serviced by that node (called the owner) without the help of the home node. Execution-driven simulation is employed to evaluate the performance of the proposed scheme, and the results show that both read-access latency and execution time are reduced.
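
The routing decision at the heart of the scheme can be shown in a few lines: with a hint, a read miss takes a two-hop path to the last invalidator; without one, it falls back to the usual three-or-more-hop path through the home directory. (A hypothetical sketch; the node mapping and field names are mine.)

```c
typedef struct {
    int has_hint;       /* did some node invalidate this block earlier? */
    int hinted_owner;   /* the node that performed that invalidation */
} hint_t;

/* Simple static block-to-home mapping, as in many CC-NUMA directories. */
int home_node_of(unsigned long block, int nnodes)
{
    return (int)(block % (unsigned long)nnodes);
}

/* First destination of a read miss for 'block'. */
int route_read(unsigned long block, const hint_t *h, int nnodes)
{
    if (h->has_hint)
        return h->hinted_owner;            /* 2 hops: requester -> owner */
    return home_node_of(block, nnodes);    /* 3+ hops via the home node */
}
```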


Hashing Method with Dynamic Server Information for Load Balancing on a Scalable Cluster of Cache Servers (확장성 있는 캐시 서버 클러스터에서의 부하 분산을 위한 동적 서버 정보 기반의 해싱 기법)

  • Hwak, Hu-Keun;Chung, Kyu-Sik
    • The KIPS Transactions:PartA
    • /
    • v.14A no.5
    • /
    • pp.269-278
    • /
    • 2007
  • Caching in a cache server cluster environment has the advantage of minimizing the request and response time of Internet traffic for web users. One method of increasing the cache hit ratio is to use a hash function with cooperative caching, which keeps the total cache memory at a fixed size regardless of the number of cache servers. Without cooperative caching, by contrast, the total cache memory grows in proportion to the number of cache servers, since each server must keep all the cached data. The disadvantage of the hashing method is that clients' requests concentrate on a few of the cache servers due to the characteristics of hashing, so the overall performance of the cluster depends on those few servers. In this paper, we propose a method that distributes client requests uniformly across cache servers using dynamic server information. We performed experiments using 16 PCs, and the results show a uniform distribution of client requests among the servers.
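
One way to combine a hash function's hit-ratio benefit with dynamic server information is to keep the hashed choice while the target server is healthy and divert to the least-loaded server otherwise. A hypothetical sketch of that general idea (not the paper's exact algorithm):

```c
#include <stdint.h>

/* FNV-1a hash of the requested URL. */
static uint32_t fnv1a(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s) { h ^= (uint8_t)*s++; h *= 16777619u; }
    return h;
}

/* Pick a cache server: hashing keeps each URL on one server (good hit
 * ratio); the dynamically refreshed 'load' array lets overloaded servers
 * shed requests to the least-loaded one (good balance). */
int pick_server(const char *url, const int *load, int n, int threshold)
{
    int s = (int)(fnv1a(url) % (uint32_t)n);
    if (load[s] <= threshold)
        return s;
    int best = 0;
    for (int i = 1; i < n; i++)
        if (load[i] < load[best])
            best = i;
    return best;
}
```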

A Review of Data Management Techniques for Scratchpad Memory (스크래치패드 메모리를 위한 데이터 관리 기법 리뷰)

  • DOOSAN CHO
    • The Journal of the Convergence on Culture Technology
    • /
    • v.9 no.1
    • /
    • pp.771-776
    • /
    • 2023
  • Scratchpad memory is a software-controlled on-chip memory designed to mitigate the disadvantages of conventional cache memories. Conventional caches carry tag-related hardware control logic, so users cannot directly control cache misses, and they are relatively large and their energy consumption is comparatively high. Scratchpad memory eliminates this hardware overhead and therefore has advantages in size and energy consumption, but it places the burden of data management on software. In this study, data management techniques for scratchpad memory are classified and examined, and ways to maximize their advantages are discussed.
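
What "the burden on software" looks like in practice is explicit tiling: the program, not hardware, stages data in and out of the scratchpad. A minimal sketch (a plain array stands in for the SPM region, which on a real SoC is a fixed address range reached via a linker section or DMA):

```c
#include <string.h>
#include <stddef.h>

#define SPM_BYTES 4096
static float spm[SPM_BYTES / sizeof(float)];  /* stand-in for the SPM region */

/* Software-managed tiling: copy a tile in, compute at SPM speed, copy
 * the results back out. On hardware the memcpy calls would be DMA. */
void scale_tiled(float *data, size_t n, float k)
{
    const size_t tile = SPM_BYTES / sizeof(float);
    for (size_t off = 0; off < n; off += tile) {
        size_t len = (n - off < tile) ? (n - off) : tile;
        memcpy(spm, data + off, len * sizeof(float));   /* copy-in */
        for (size_t i = 0; i < len; i++)
            spm[i] *= k;                                /* compute in SPM */
        memcpy(data + off, spm, len * sizeof(float));   /* copy-out */
    }
}
```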

Automatic Detection of Memory Subsystem Parameters for Embedded Systems (임베디드 시스템을 위한 메모리 서브시스템 파라미터의 자동 검출)

  • Ha, Tae-Jun;Seo, Sang-Min;Chun, Po-Sung;Lee, Jae-Jin
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.15 no.5
    • /
    • pp.350-354
    • /
    • 2009
  • To optimize the performance of software, it is important to know certain hardware parameters such as the CPU speed, the cache size, the number of TLB entries, and the parameters of the memory subsystem. There are several ways to obtain these values. First, they can be taken from the hardware manual. Second, they can be obtained by calling functions provided by the operating system. Finally, hardware-detection programs can find the desired values. Such programs are usually executed on PCs or servers and report the CPU speed, the cache size, the number of TLB entries, and so on; however, they do not adequately detect the parameters of one of the most performance-critical parts of the computer, namely the memory bank layout of the memory subsystem. In this paper, we present an algorithm to detect the memory bank parameters. We ran an implementation of our algorithm on various embedded systems and compared the detected values with the real hardware parameters. The results show that the presented algorithm detects the cache size, the number of TLB entries, and the memory bank layout with high accuracy.
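
Detectors of this kind rest on timing microbenchmarks: walk arrays of growing size with dependent loads and watch where the per-access latency jumps. The sketch below shows only the cache-size side of the idea (bank-layout detection needs further stride experiments); it assumes 64-byte lines, POSIX `clock_gettime`, and a sequential ring that a hardware prefetcher may partly hide, which real detectors defeat by randomizing the chain:

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static void *sink;                 /* keeps the chase from being optimized out */

/* Average ns per dependent load over a pointer ring of 'bytes' bytes. */
static double chase_ns(size_t bytes)
{
    size_t n = bytes / sizeof(void *);
    size_t stride = 64 / sizeof(void *);        /* assume 64-byte cache lines */
    void **ring = malloc(n * sizeof(void *));
    for (size_t i = 0; i < n; i++)              /* one pointer per cache line */
        ring[i] = &ring[(i + stride) % n];

    struct timespec t0, t1;
    void **p = ring;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < 10000000L; i++)
        p = (void **)*p;                        /* serialized, dependent loads */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    sink = p;
    free(ring);
    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / 1e7;
}

int main(void)
{
    for (size_t kb = 4; kb <= 8192; kb *= 2)    /* latency steps mark L1/L2/... */
        printf("%6zu KiB : %.2f ns/access\n", kb, chase_ns(kb * 1024));
    return 0;
}
```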

An Efficient Cache Management Scheme for Load Balancing in Distributed Environments with Different Memory Sizes (상이한 메모리 크기를 가지는 분산 환경에서 부하 분산을 위한 캐시 관리 기법)

  • Choi, Kitae;Yoon, Sangwon;Park, Jaeyeol;Lim, Jongtae;Lee, Seokhee;Bok, Kyoungsoo;Yoo, Jaesoo
    • KIISE Transactions on Computing Practices
    • /
    • v.21 no.8
    • /
    • pp.543-548
    • /
    • 2015
  • Recently, the volume of data has grown dramatically along with the growth of social media and digital devices. However, existing disk-based distributed file systems are limited in data-processing and data-access performance by I/O costs and bottlenecks. To solve this problem, caching techniques are used to manage data in memory. In this paper, we propose a cache management scheme that handles load balancing in a distributed memory environment. The proposed scheme distributes data according to each node's memory size in distributed environments with different memory sizes, and when overloaded nodes occur, it redistributes the cached data according to their access times. To show the superiority of the proposed scheme, we compare it with an existing distributed cache management scheme through performance evaluation.
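
Distributing data "according to the memory size" can be pictured as hashing each key into a cumulative-capacity range, so a node with twice the memory receives roughly twice the cached data. A hypothetical sketch (the paper's scheme also rebalances hot data, which is not shown):

```c
#include <stdint.h>
#include <stdio.h>

/* Map a key hash to a node with probability proportional to its memory. */
int place(uint64_t key_hash, const uint64_t *mem_bytes, int n)
{
    uint64_t total = 0;
    for (int i = 0; i < n; i++)
        total += mem_bytes[i];
    uint64_t point = key_hash % total;       /* position in [0, total) */
    for (int i = 0; i < n; i++) {
        if (point < mem_bytes[i])
            return i;                        /* falls inside node i's share */
        point -= mem_bytes[i];
    }
    return n - 1;                            /* not reached */
}

int main(void)
{
    /* three nodes with a 2 : 1 : 1 memory ratio */
    const uint64_t mem[3] = { 2u << 30, 1u << 30, 1u << 30 };
    int count[3] = { 0 };
    for (uint64_t h = 0; h < 100000; h++)
        count[place(h * 2654435761u, mem, 3)]++;
    printf("%d %d %d\n", count[0], count[1], count[2]); /* roughly 2:1:1 */
    return 0;
}
```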