
An Efficient Instruction Prefetching Scheme Based on the Page Access Information  

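Shin Soong-Hyun (School of Electrical Engineering and Computer Science, Seoul National University)
Kim Cheol-Hong (School of Electrical Engineering and Computer Science, Seoul National University)
Jhon Chu-Shik (School of Electrical Engineering and Computer Science, Seoul National University)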
Abstract
The hit ratio of the first-level cache is one of the most important factors in determining the performance of a computer system, and prefetching from lower-level memory is one of the most useful techniques for improving it. In this paper, we propose a prefetch on continuous same page access (CSPA) scheme, which improves the prefetch efficiency of the instruction cache and reduces prefetch cost at the same time. The CSPA scheme tracks the page addresses of executed instructions and counts how many times the same memory page is accessed consecutively. To increase prefetch efficiency, it initiates a prefetch only if the number of consecutive accesses to the same page exceeds a threshold value. An L1 cache block is generally smaller than an L2 cache block, so one L2 cache block contains several L1 cache blocks. To reduce the number of unnecessary L2 cache accesses caused by prefetching, the CSPA scheme enables a prefetch only when the missed L1 block and the L1 block to be prefetched lie in the same L2 cache block, which lowers the prefetch cost. According to our simulations, the proposed prefetching scheme improves performance by up to 6.7%.
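The two conditions described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The page size, block sizes, threshold value, and all names below are assumptions chosen for illustration. The abstract also does not say which L1 block is prefetched, so the sketch assumes the next sequential block.

```python
from dataclasses import dataclass

# All sizes and the threshold are illustrative assumptions, not values from the paper.
PAGE_SIZE = 4096   # bytes per memory page (assumed)
L1_BLOCK = 32      # bytes per L1 cache block (assumed)
L2_BLOCK = 128     # bytes per L2 cache block (assumed), i.e. 4 L1 blocks
THRESHOLD = 4      # consecutive same-page accesses required (assumed)


@dataclass
class CSPAState:
    """Hypothetical tracker for consecutive same-page accesses."""
    last_page: int = -1
    same_page_count: int = 0


def record_fetch(state: CSPAState, addr: int) -> None:
    """Update the consecutive same-page counter for one instruction fetch."""
    page = addr // PAGE_SIZE
    if page == state.last_page:
        state.same_page_count += 1
    else:
        # A different page was touched, so the run starts over.
        state.last_page = page
        state.same_page_count = 1


def prefetch_target(state: CSPAState, miss_addr: int):
    """On an L1 miss, return the block address to prefetch, or None.

    Condition 1: the same-page count must exceed the threshold.
    Condition 2: the candidate L1 block must lie in the same L2 block
    as the missed L1 block. The candidate is assumed to be the next
    sequential L1 block.
    """
    if state.same_page_count <= THRESHOLD:
        return None
    miss_block = miss_addr - (miss_addr % L1_BLOCK)
    candidate = miss_block + L1_BLOCK
    if candidate // L2_BLOCK != miss_block // L2_BLOCK:
        return None
    return candidate
```

With these assumed sizes, a miss in the last L1 block of an L2 block never triggers a prefetch, because the next sequential block belongs to a different L2 block.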
Keywords
Computer architecture; Instruction cache; Cache prefetch;