
Remote Cache Replacement Policy using Processor Locality in Multi-Processor System  

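Han Sang Yoon (School of Electrical and Computer Engineering, Seoul National University)
Kwak Jong Wook (School of Electrical and Computer Engineering, Seoul National University)
Jhang Seong Tae (School of Computer Engineering, University of Suwon)
Jhon Chu Shik (School of Electrical and Computer Engineering, Seoul National University)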
Abstract
Memory access latency has long been a primary cause of performance degradation in both single-processor and multi-processor systems. In distributed shared-memory systems in particular, a remote memory access incurs far more overhead than a local memory access. To address this problem, multi-level cache architectures that include a remote cache in the multi-processor system have been proposed. In this paper, we propose a new cache replacement policy that improves the performance of multi-processor systems with a remote cache. If a multi-level cache maintains the multi-level inclusion (MLI) property and uses the LRU (Least Recently Used) replacement policy, the LRU information of the higher-level cache (the processor cache) can differ from that of the lower-level cache (the remote cache). In this situation, replacing a remote cache line can force the replacement of a processor cache line that the processor is still using, which is a major cause of performance degradation for the whole system. To alleviate this drawback of the LRU replacement policy, the new policy analyzes the processor's remote memory access pattern in each node and uses this information to reduce the number of invalidations of useful cache lines in the higher-level cache. Compared with the conventional LRU replacement policy, the new remote cache replacement policy improves performance by up to $3.5\%$ and by $2.5\%$ on average on the SPLASH-2 benchmarks.
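The inclusion problem described in the abstract can be illustrated with a toy model. The sketch below is not the policy proposed in the paper, whose details the abstract does not give. It assumes two fully associative LRU caches, and the names `simulate`, `l1_aware`, and `trace` are hypothetical. Processor-cache hits never reach the remote cache, so the remote cache's LRU order goes stale and plain LRU can evict a line the processor is still using. The `l1_aware` option is a simplified stand-in that prefers victims not resident in the processor cache.

```python
from collections import OrderedDict

def simulate(trace, l1_size, l2_size, l1_aware=False):
    """Count remote-cache evictions that invalidate a processor-cache line.

    l1 models the processor cache, l2 the inclusive remote cache.
    Both are fully associative and keep their oldest entry first.
    """
    l1 = OrderedDict()
    l2 = OrderedDict()
    back_invalidations = 0
    for addr in trace:
        if addr in l1:
            # Processor-cache hit: the remote cache never sees this access,
            # so its LRU order is not refreshed.
            l1.move_to_end(addr)
            continue
        if addr in l2:
            l2.move_to_end(addr)
        else:
            if len(l2) >= l2_size:
                victim = next(iter(l2))  # plain LRU victim
                if l1_aware:
                    # Hypothetical variant: take the oldest remote-cache line
                    # that is not held by the processor cache, if one exists.
                    victim = next((a for a in l2 if a not in l1), victim)
                del l2[victim]
                if victim in l1:
                    # Inclusion forces the processor-cache copy out as well.
                    del l1[victim]
                    back_invalidations += 1
            l2[addr] = True
        if len(l1) >= l1_size:
            l1.popitem(last=False)  # evict the processor cache's LRU line
        l1[addr] = True
    return back_invalidations

# Line "A" is reused constantly while other lines stream past it.
trace = ["A", "B", "A", "C", "A", "D", "A", "E", "A", "F", "A", "G", "A"]
lru_count = simulate(trace, l1_size=2, l2_size=3)
aware_count = simulate(trace, l1_size=2, l2_size=3, l1_aware=True)
print(lru_count, aware_count)
```

On this trace, plain LRU in the remote cache invalidates the hot line "A" in the processor cache twice, while the inclusion-aware variant never does. A real remote cache is set-associative and shared by several processors in a node, so the numbers here only illustrate the mechanism.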
Keywords
Multi-processor system; Remote cache; LRU cache replacement policy; Multi-level cache inclusion property; Remote memory access pattern; Processor locality