A Cache Controller to Maximize Effectiveness of Hierarchical Memory Architecture

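  • 어봉용 (Ubiquitous Bio-Information Technology Center)
  • 주영관 (Department of Computer Science, Chungbuk National University)
  • 전중남 (School of Electrical and Computer Engineering, Chungbuk National University)
  • 김석일 (School of Electrical and Computer Engineering, Chungbuk National University)
  • Published: 2005.12.01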

Abstract

This paper proposes a hierarchical cache architecture that triggers prefetching on a level 1 cache miss, whereas existing architectures prefetch only on a level 2 cache miss. In the proposed architecture, when a level 1 cache miss occurs, the demand-fetch block and the prefetch block are selected from the level 2 cache and stored in the level 1 cache and the prefetch cache, respectively. In experiments with 11 benchmark programs, the hierarchical cache architecture that employs both a level 1 cache prefetcher and a level 2 cache prefetcher improved performance by up to $19\%$ compared with a conventional architecture that employs only a level 2 cache prefetcher.
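The control flow described in the abstract can be illustrated with a small simulation. This is only a minimal sketch under assumptions that the abstract does not state: both prefetchers are assumed to pick the next sequential block, all caches are modeled as fully associative with LRU replacement, and the names `LRUCache`, `Hierarchy`, and `run` are hypothetical. It is not the authors' controller design.

```python
from collections import OrderedDict


class LRUCache:
    """Fully associative block store with LRU replacement (a simplification)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def lookup(self, block):
        # A hit refreshes the block's LRU position.
        if block in self.blocks:
            self.blocks.move_to_end(block)
            return True
        return False

    def insert(self, block):
        self.blocks[block] = True
        self.blocks.move_to_end(block)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block

    def remove(self, block):
        self.blocks.pop(block, None)


class Hierarchy:
    """Two-level cache with an optional prefetch cache beside level 1."""

    def __init__(self, l1_size=4, pf_size=2, l2_size=16, l1_prefetch=True):
        self.l1 = LRUCache(l1_size)
        self.pf = LRUCache(pf_size)
        self.l2 = LRUCache(l2_size)
        self.l1_prefetch = l1_prefetch

    def _fill_l2(self, block):
        """Return True on a level 2 hit; on a miss, fetch from memory."""
        if self.l2.lookup(block):
            return True
        self.l2.insert(block)
        self.l2.insert(block + 1)  # level 2 prefetcher (assumed next-block policy)
        return False

    def access(self, block):
        """Return the place that served the request."""
        if self.l1.lookup(block):
            return "l1"
        if self.pf.lookup(block):
            # Promote the prefetched block into level 1.
            self.pf.remove(block)
            self.l1.insert(block)
            return "prefetch"
        # Level 1 miss: the demand block goes to the level 1 cache ...
        l2_hit = self._fill_l2(block)
        self.l1.insert(block)
        # ... and the prefetch block, if present in level 2, goes to the prefetch cache.
        if self.l1_prefetch and self.l2.lookup(block + 1):
            self.pf.insert(block + 1)
        return "l2" if l2_hit else "memory"


def run(trace, l1_prefetch):
    hierarchy = Hierarchy(l1_prefetch=l1_prefetch)
    return [hierarchy.access(block) for block in trace]


if __name__ == "__main__":
    print(run(range(8), l1_prefetch=True))
    print(run(range(8), l1_prefetch=False))
```

On a sequential trace of eight blocks, the version with the level 1 prefetcher serves every second access from the prefetch cache, while the version without it serves those same accesses from the level 2 cache. The sketch shows only where each request is served; it does not model latencies and says nothing about the reported $19\%$ figure.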
