Efficient Cache Architecture for Transactional Memory

  • Choi, Dong-Min (Department of Electrical and Electronic Engineering, Yonsei University) ;
  • Kim, Seung-Hun (Department of Electrical and Electronic Engineering, Yonsei University) ;
  • Ro, Won-Woo (Department of Electrical and Electronic Engineering, Yonsei University)
  • Received : 2011.05.26
  • Accepted : 2011.06.30
  • Published : 2011.07.25

Abstract

Traditional transactional memory systems cannot guarantee good performance for applications whose transactions overflow, because tracking the data that must be logged is difficult. In particular, sustaining the state needed to detect conflicts on transactions that overflow the first-level cache increases communication delay. To address this problem, we focus on the cache architecture of these systems to reduce the overhead caused by overflows and cache misses. In this paper, we present the Supportive Cache, which reduces the additional overhead incurred during transactions. The Supportive Cache is looked up in parallel with the private L1 cache and uses the same replacement policy as the private L1 cache. We evaluate the proposed design by comparing LogTM-SE with and without the Supportive Cache. The simulation results show that our system improves performance by 37% on average compared to the original LogTM-SE given the same hardware resources.

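In transactional memory systems, logging the data needed to handle an overflow is complex, which makes it a major cause of overall performance degradation. In particular, detecting the conflicts that overflowed data can cause requires additional state in the cache coherence protocol, and this delays communication between transactions. To address this problem, we studied an efficient cache architecture that reduces the overhead caused by overflows in transactional memory systems. The supportive cache proposed in this paper uses the same replacement policy as the L1 cache and is designed so that it can be looked up in parallel with it. We evaluated the supportive cache on LogTM-SE, a hardware transactional memory system, and the simulation results show an average performance improvement of 37%.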
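The abstract states only that the supportive cache is probed in parallel with the private L1 cache and shares its replacement policy. The following is a minimal sketch of that idea, not the authors' design. It assumes LRU replacement, fully associative structures, and that lines evicted from L1 spill into the supportive cache; none of these details are given in the abstract. The names `LRUCache`, `L1WithSupportiveCache`, and `overflows` are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Fully associative cache with LRU replacement (capacity in lines)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> data, least recently used first

    def lookup(self, addr):
        """Return True on a hit and mark the line as most recently used."""
        if addr in self.lines:
            self.lines.move_to_end(addr)
            return True
        return False

    def insert(self, addr, data=None):
        """Insert a line; return the evicted (addr, data) pair, or None."""
        victim = None
        if addr not in self.lines and len(self.lines) >= self.capacity:
            victim = self.lines.popitem(last=False)  # evict the LRU line
        self.lines[addr] = data
        self.lines.move_to_end(addr)
        return victim

    def remove(self, addr):
        return self.lines.pop(addr, None)


class L1WithSupportiveCache:
    """L1 plus a supportive cache that share one replacement policy (assumed LRU)."""

    def __init__(self, l1_lines, sc_lines):
        self.l1 = LRUCache(l1_lines)
        self.sc = LRUCache(sc_lines)
        self.overflows = 0  # lines that were pushed out of both structures

    def access(self, addr):
        """Return 'l1', 'sc', or 'miss' for one access."""
        # Hardware would probe both structures in the same cycle;
        # here the two lookups are simply independent of each other.
        hit_l1 = self.l1.lookup(addr)
        hit_sc = self.sc.lookup(addr)
        if hit_l1:
            return "l1"
        if hit_sc:
            self.sc.remove(addr)  # assumption: promote the line back into L1
        self._fill_l1(addr)
        return "sc" if hit_sc else "miss"

    def _fill_l1(self, addr):
        victim = self.l1.insert(addr)
        if victim is not None:
            # Assumption: the L1 victim spills into the supportive cache.
            spilled = self.sc.insert(victim[0], victim[1])
            if spilled is not None:
                self.overflows += 1  # would need the slower overflow handling
```

In this sketch, a line evicted from L1 is still found by the parallel probe as long as it remains in the supportive cache, so only lines that leave both structures count as overflows.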
