Providing scalable single-operating-system NUMA abstraction of physically discrete resources

  • Baik Song An (Artificial Intelligence Research Laboratory, Electronics and Telecommunications Research Institute) ;
  • Myung Hoon Cha (Artificial Intelligence Research Laboratory, Electronics and Telecommunications Research Institute) ;
  • Sang-Min Lee (SMEs and Commercialization Division, Electronics and Telecommunications Research Institute) ;
  • Won Hyuk Yang (Department of Computer Science and Engineering, POSTECH) ;
  • Hong Yeon Kim (Artificial Intelligence Research Laboratory, Electronics and Telecommunications Research Institute)
  • Received : 2023.02.16
  • Accepted : 2023.10.12
  • Published : 2024.06.20

Abstract

With the explosive increase in the amount of data produced annually, researchers have been attempting to develop solutions for systems that can effectively handle large volumes of data. Single-operating-system (OS) non-uniform memory access (NUMA) abstraction is an important technology that preserves the compatibility of single-node programming interfaces across multiple nodes while offering higher cost efficiency than scale-up systems. However, existing technologies have not succeeded in optimizing user-level performance. In this paper, we introduce a single-OS NUMA abstraction technology that ensures full compatibility with the existing OS while improving performance at both the hypervisor and guest levels. Benchmark results show that the proposed technique improves performance by up to 4.74× on average in terms of execution time compared with the existing state-of-the-art open-source technology.
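The single-node interface compatibility claimed above can be illustrated with the standard Linux NUMA API. The following is a minimal sketch, not taken from the paper, assuming a Linux guest with libnuma installed and the aggregated cluster exposed to the OS as ordinary NUMA nodes; the chosen node id and allocation size are illustrative only. The point is that an unmodified application binds memory to a (possibly physically remote) node with exactly the same calls it would use on a single machine.

```c
/*
 * Minimal sketch: allocate memory on the highest-numbered NUMA node visible
 * to the OS. Under a single-OS NUMA abstraction, that node may reside on a
 * different physical machine, but the API calls are unchanged.
 * Assumptions: Linux guest, libnuma available (link with -lnuma).
 */
#include <numa.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not available on this system\n");
        return 1;
    }

    int last_node = numa_max_node();   /* highest node id visible to the OS */
    printf("visible NUMA nodes: 0..%d\n", last_node);

    /* Bind a 64 MiB buffer to the highest-numbered node; the size and node
     * choice are illustrative. */
    size_t len = 64UL << 20;
    void *buf = numa_alloc_onnode(len, last_node);
    if (buf == NULL) {
        fprintf(stderr, "allocation on node %d failed\n", last_node);
        return 1;
    }

    memset(buf, 0, len);               /* touch pages so they are backed */
    numa_free(buf, len);
    return 0;
}
```

In such a setup, whether the node is local DRAM or memory contributed by another machine is decided by the abstraction layer; the application code and the NUMA policy tools built on this API (for example, numactl) require no changes.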

Acknowledgement

This research was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP), Government of Korea (MSIT), Grant/Award Number: 2022-0-00498.
