Search | Korea Science

An Efficient Instruction Prefetching Scheme Based on the Page Access Information (페이지 접근 정보에 기반한 효율적인 명령어 캐쉬 선인출 기법)

Shin Soong-Hyun;Kim Cheol-Hong;Jhon Chu-Shik
- Journal of KIISE:Computer Systems and Theory / v.33 no.5 / pp.306-315 / 2006
In general, the hit ratio of the first-level cache is one of the most important factors in determining the performance of computer systems. Prefetching from lower-level memory structures is one of the most useful techniques for improving the hit ratio of the first-level cache. In this paper, we propose a prefetch on continuous same page access (CSPA) scheme which improves the prefetch efficiency of the instruction cache and reduces prefetch cost at the same time. The proposed CSPA scheme traces the page addresses of executed instructions to count how many times the same memory page is accessed continuously. To increase the prefetch efficiency, the CSPA scheme initiates a prefetch only if the number of accesses to the same page exceeds a threshold value. Generally, an L1 cache block is smaller than an L2 cache block, so one L2 cache block contains a number of L1 cache blocks. To reduce the number of unnecessary accesses to the L2 cache due to prefetching, the CSPA scheme enables a prefetch only when the missed L1 block and the L1 block to be prefetched are in the same L2 cache block, which reduces prefetch cost. According to our simulations, the proposed prefetching scheme improves performance by up to 6.7%.
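The two conditions in this abstract (a same-page access count above a threshold, and a prefetch candidate inside the same L2 block as the miss) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the page size, block sizes, threshold value, the class name `CspaPrefetcher`, and the choice of the next sequential L1 block as the prefetch candidate are all assumptions that the abstract does not specify.

```python
from dataclasses import dataclass

# All constants below are illustrative assumptions, not values from the paper.
PAGE_SIZE = 4096   # bytes per memory page
L1_BLOCK = 32      # bytes per L1 cache block
L2_BLOCK = 128     # bytes per L2 cache block (holds several L1 blocks)
THRESHOLD = 4      # same-page accesses required before prefetching


@dataclass
class CspaPrefetcher:
    """Hypothetical sketch of a 'continuous same page access' prefetch filter."""
    last_page: int = -1
    same_page_count: int = 0

    def observe(self, addr: int) -> None:
        """Count how many consecutive instruction fetches hit the same page."""
        page = addr // PAGE_SIZE
        if page == self.last_page:
            self.same_page_count += 1
        else:
            self.last_page = page
            self.same_page_count = 1

    def prefetch_target(self, miss_addr: int):
        """On an L1 miss, return the block address to prefetch, or None."""
        # Condition 1: the same page must have been accessed more than
        # THRESHOLD times in a row.
        if self.same_page_count <= THRESHOLD:
            return None
        # Assumption: the candidate is the next sequential L1 block.
        candidate = (miss_addr // L1_BLOCK + 1) * L1_BLOCK
        # Condition 2: the candidate must lie in the same L2 block as the miss.
        if candidate // L2_BLOCK != miss_addr // L2_BLOCK:
            return None
        return candidate
```

In this sketch, `observe` would be called for every fetched instruction address and `prefetch_target` only when the L1 instruction cache misses.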

A Page Cache Mechanism for JFFS2 Flash File System in Embedded Systems (내장형 시스템에서 JFFS2 플래쉬 파일 시스템에 적합한 페이지 캐쉬 구조)

송형근;차호정
- Proceedings of the Korean Information Science Society Conference / 2003.10a / pp.271-273 / 2003
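This paper proposes a page cache structure suited to the JFFS2 flash file system. Because JFFS2 stores data in compressed form to improve space utilization, the existing Linux page cache can be used effectively. However, for data that is uncompressed and read sequentially, such as multimedia files, the existing page cache is problematic because of the fast read speed of flash memory and the low cache hit ratio. We describe the Linux page cache as used by the JFFS2 flash file system and analyze its problems. Then, building on a page cache structure for low power consumption proposed in earlier work, we present and evaluate a page cache usage scheme for improving throughput.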

An Energy-Delay Efficient System with Adaptive Victim Caches (선택적 희생 캐쉬를 이용한 저전력 고성능 시스템 설계 방안)

Kim Cheol Hong;Shim Sunghoon;Jhon Chu Shik;Jhang Seong Tae
- Journal of KIISE:Computer Systems and Theory / v.32 no.11_12 / pp.663-674 / 2005
We propose a system aimed at achieving high energy-delay efficiency by using adaptive victim caches. In particular, we investigate methods to improve the hit rate in the first level of the memory hierarchy, which reduces the number of accesses to more power-consuming memory structures such as the L2 cache. A victim cache is a memory element for reducing conflict misses in a direct-mapped L1 cache. We present two techniques to fill the victim cache with the blocks that have a higher probability of being re-requested by the processor. The hit-based victim cache is filled with blocks that were referenced frequently by the processor. The replacement-based victim cache is filled with blocks that were evicted from sets where block replacements had happened frequently. According to our simulations, the replacement-based victim cache scheme outperforms the conventional victim cache scheme by about 2% on average and reduces power consumption by up to 8%.
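As a rough illustration of the replacement-based idea, the sketch below admits an evicted block into the victim cache only when its L1 set has seen enough replacements. It is a sketch under stated assumptions, not the paper's design: the class name `ReplacementBasedVictimCache`, the per-set counter, the fixed `threshold`, the cache sizes, and LRU eviction inside the victim cache are all hypothetical choices.

```python
from collections import OrderedDict


class ReplacementBasedVictimCache:
    """Direct-mapped L1 plus a small victim cache with a replacement filter."""

    def __init__(self, num_sets=4, victim_entries=2, threshold=2):
        self.num_sets = num_sets
        self.l1 = [None] * num_sets          # one block per set (direct-mapped)
        self.replacements = [0] * num_sets   # assumed per-set replacement counters
        self.victim = OrderedDict()          # block -> None, oldest entry first
        self.victim_entries = victim_entries
        self.threshold = threshold           # assumed admission threshold

    def access(self, block: int) -> str:
        """Access a block number; return 'l1', 'victim', or 'miss'."""
        idx = block % self.num_sets
        if self.l1[idx] == block:
            return "l1"
        where = "miss"
        if block in self.victim:
            # Victim hit: the block moves back into L1.
            del self.victim[block]
            where = "victim"
        evicted = self.l1[idx]
        self.l1[idx] = block
        if evicted is not None:
            self.replacements[idx] += 1
            # Only sets with frequent replacements may fill the victim cache.
            if self.replacements[idx] >= self.threshold:
                self.victim[evicted] = None
                if len(self.victim) > self.victim_entries:
                    self.victim.popitem(last=False)  # drop the oldest entry
        return where
```

A conventional victim cache would correspond to `threshold=1` in this sketch, where every evicted block is admitted.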

Analysis of GPGPU Performance by dedicating L2 Cache for Texture Data (텍스쳐 데이터를 위한 2차 캐쉬 구조를 가지는 그래픽 처리 장치의 성능 분석)

Kim, Gwang Bok;Kim, Cheol Hong
- Proceedings of the Korean Society of Computer Information Conference / 2017.01a / pp.143-144 / 2017
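Recent graphics processing units use multiple memory levels to reduce accesses to DRAM. Unlike the L1 memories, which are accessed separately according to the type of the requested data, the L2 cache of a GPGPU is accessible to all data types before the long-latency DRAM is accessed. In this paper, we restrict the L2 cache, which normally allows accesses and fills for the various data types specified by an application, to texture data only, and analyze how performance changes. For this experiment, we modify the architecture so that all data types other than texture data bypass the L2 cache and access DRAM directly. The analysis shows that allowing only texture data degrades performance on most benchmarks, by 5.58% on average compared with the baseline architecture. In contrast, needle, an application with a low L2 hit rate in our experimental environment, is found to gain overall performance by bypassing unnecessary L2 accesses.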
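The texture-only L2 configuration described in this abstract can be pictured with a small model in which every non-texture request skips the L2 cache and goes straight to DRAM. This is only a sketch: the names `DataType` and `TextureOnlyL2`, the set of data types, and the unbounded L2 capacity are assumptions made for illustration, and the model ignores latency entirely.

```python
from enum import Enum


class DataType(Enum):
    # Hypothetical set of request types; the abstract only singles out texture.
    TEXTURE = "texture"
    GLOBAL = "global"
    CONSTANT = "constant"


class TextureOnlyL2:
    """Toy L2 model that caches texture data and bypasses everything else."""

    def __init__(self):
        self.lines = set()       # cached addresses (unbounded, for simplicity)
        self.dram_accesses = 0   # number of requests that reached DRAM

    def read(self, addr: int, dtype: DataType) -> str:
        """Return 'bypass', 'l2_hit', or 'l2_miss' for one read request."""
        if dtype is not DataType.TEXTURE:
            # Non-texture data never touches the L2 cache.
            self.dram_accesses += 1
            return "bypass"
        if addr in self.lines:
            return "l2_hit"
        # Texture miss: fetch from DRAM and fill the L2 cache.
        self.lines.add(addr)
        self.dram_accesses += 1
        return "l2_miss"
```

In this model, repeated non-texture reads to the same address always reach DRAM, which is consistent with the reported slowdown on most benchmarks, while an application that rarely hits in L2 loses little by bypassing it.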

lpCSB+-tree: An Enhanced Main Memory Index Structure Employing Level Prefetching Technique (lpCSB+-트리: 레벨 프리페칭 기법을 이용하는 향상된 주기억장치 상주형 색인구조)

Hong, Hyun-Taek;Pee, Jun-Il;Song, Seok-Il;Yoo, Jae-Soo
- Proceedings of the Korea Information Processing Society Conference / 2002.11c / pp.1753-1756 / 2002
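In main-memory-resident index structures, secondary cache misses have a very large effect on performance. Previously proposed main-memory-resident index structures take secondary cache misses into account, but a secondary cache miss still occurs whenever each level of the tree is accessed. Recognizing this problem, this paper proposes a main-memory index structure that incurs no cache miss even when visiting each level during tree traversal. The proposed index structure prefetches the nodes that may be visited at the next level, so that no cache miss occurs when the next level is visited. Its basic structure is based on the CSB+-tree, which uses the node-group concept to increase node fan-out, but we also propose a method to address the increased split cost that is a drawback of the CSB+-tree. Through simulation, we show the superiority of the proposed index structure over existing index structures.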

An Efficient Caching Algorithm to Minimize Duplicated Disk Blocks in 2-level Disk Cache System (2-레벨 디스크 캐쉬 시스템에서 디스크 블록 중복 저장을 최소화하는 효율적인 캐싱 알고리즘)

류갑상;정수목
- Journal of the Korea Computer Industry Society / v.5 no.1 / pp.57-64 / 2004
The speed gap between processors and disks is a serious problem, so the I/O subsystem limits the performance of computer systems. To overcome the speed gap, caches have been used in computer systems. By using a cache, the access time to disk blocks can be reduced and the performance of the computer system can be improved. In this paper, we propose an efficient cache management algorithm for computer systems that have both a buffer cache and a disk cache. The proposed algorithm can minimize the duplicated blocks between the buffer cache and the disk cache. We evaluate the proposed algorithm by trace-driven simulation. The simulation results show that the proposed algorithm can reduce the mean access time to disk blocks.

A New Cache Replacement Policy for Improving Last Level Cache Performance (라스트 레벨 캐쉬 성능 향상을 위한 캐쉬 교체 기법 연구)

Do, Cong Thuan;Son, Dong Oh;Kim, Jong Myon;Kim, Cheol Hong
- Journal of KIISE / v.41 no.11 / pp.871-877 / 2014
Cache replacement algorithms have been developed in order to reduce miss counts. In modern processors, the performance gap between the processor and main memory has been increasing, creating a more important role for cache replacement policies. The Least Recently Used (LRU) policy is one of the most common policies used in modern processors. However, recent research has shown that the performance gap between LRU and the theoretical optimal replacement algorithm (OPT) is large. Although LRU replacement has been proven to be adequate over and over again, the OPT/LRU performance gap widens continuously as the cache associativity becomes large. In this study, we observed that there is a potential chance to improve cache performance based on existing LRU mechanisms. We propose a method that enhances the performance of the LRU replacement algorithm based on the access proportion among the lines in a cache set during the period between two successive replacement actions, which is used to make the final replacement decision. Our experimental results reveal that the proposed method reduced the average miss rate of the baseline 512KB L2 cache by 15 percent when compared to conventional LRU. In addition, the performance of the processor that applied our proposed cache replacement policy improved by 4.7 percent over LRU, on average.
https://doi.org/10.5626/JOK.2014.41.11.871

lpCSB+- tree : An Enhanced Main Memory Index Structure Employing the Level Prefetching Technique (lpCSB+-트리 : 레벨 프리페칭 기법을 이용하는 향상된 주기억장치 상주형 색인구조)

Hong Hyun Taek;Pee Jun Il;Song Seok Il;Yoo Jae Soo
- Journal of KIISE:Databases / v.31 no.6 / pp.675-683 / 2004
In main-memory resident index structures, secondary cache misses have a considerable effect on the performance of index structures. Recently, several main-memory resident index structures that consider the cache have been proposed to reduce the impact of secondary cache misses. However, they still suffer from full secondary cache misses whenever visiting each level of an index tree. In this paper, we propose a new index structure that eliminates cache misses even when visiting each level of the index tree. The proposed index structure prefetches the grandchildren of the current node. The basic structure of the proposed index structure is derived from the CSB+-tree, which uses the concept of the node group to increase fan-out. However, the insert algorithm of the proposed index structure reduces the cost of a split significantly. We also show the superiority of our algorithm through various performance evaluations.

Performance Analysis of Multicore Processor Architectures Based On Cache Size Effects (캐쉬 용량 효과에 대한 멀티코어 프로세서의 성능 연구)

Lee, Jongbok
- The Journal of the Institute of Internet, Broadcasting and Communication / v.12 no.6 / pp.175-180 / 2012
In order to overcome the complexity and performance limits of superscalar processors, multicore architectures have recently become prevalent. The configuration and size of the instruction and data caches greatly affect the performance of multicore processors. Using SPEC 2000 benchmarks as input, extensive trace-driven simulation was performed for 2-core to 16-core architectures with different cache sizes. As a result, 2-way set-associative instruction and data caches of 64KB delivered the most cost-effective performance.
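For readers unfamiliar with trace-driven cache simulation, the following sketch measures the hit rate of a set-associative cache over an address trace. It is not the simulator used in the paper: the function name `simulate`, the 64-byte block size, and LRU replacement within each set are assumptions, and only the 64KB, 2-way default configuration is taken from the abstract.

```python
from collections import OrderedDict


def simulate(trace, size_bytes=64 * 1024, ways=2, block_bytes=64):
    """Return the hit rate of a set-associative cache over a list of addresses.

    Assumes LRU replacement within each set and a 64-byte block by default;
    neither choice is stated in the abstract.
    """
    num_sets = size_bytes // (ways * block_bytes)
    # One ordered dict per set: keys are block numbers, oldest entry first.
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = 0
    for addr in trace:
        block = addr // block_bytes
        lines = sets[block % num_sets]
        if block in lines:
            lines.move_to_end(block)       # mark as most recently used
            hits += 1
        else:
            if len(lines) >= ways:
                lines.popitem(last=False)  # evict the least recently used block
            lines[block] = None
    return hits / len(trace) if trace else 0.0
```

Running the same trace through `simulate` with different `size_bytes` and `ways` values gives the kind of size and associativity comparison the abstract summarizes.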
https://doi.org/10.7236/JIWIT.2012.12.6.175

lpCSB+-tree: An Enhanced Main Memory Index Structure Employing the Level Prefetching Technique (레벨 프리페칭 기법을 이용한 향상된 주기억장치 상주형 색인구조)

Hong Hyun-Taek;Kang Tae-Ho;Yoo Jae-Soo
- Journal of Internet Computing and Services / v.4 no.6 / pp.75-86 / 2003
In main-memory resident index structures, secondary cache misses have a considerable effect on the performance of index structures. Recently, several main-memory resident index structures that consider the cache have been proposed to reduce the impact of secondary cache misses. However, they still suffer from full secondary cache misses whenever visiting each level of an index tree. In this paper, we propose a new index structure that eliminates cache misses even when visiting each level of the index tree. The proposed index structure prefetches the grandchildren of the current node. The basic structure of the proposed index structure is derived from the CSB+-tree, which uses the concept of the node group to increase fan-out. However, the insert algorithm of the proposed index structure reduces the cost of a split significantly. We also show the superiority of our algorithm through various performance evaluations.
