http://dx.doi.org/10.3745/KTCCS.2021.10.5.123

GPU Memory Management Technique to Improve the Performance of GPGPU Task of Virtual Machines in RPC-Based GPU Virtualization Environments  

Kang, Jihun (4th-Stage BK21 Computer Science Education Research Group, College of Informatics, Korea University)
Publication Information
KIPS Transactions on Computer and Communication Systems / v.10, no.5, 2021, pp. 123-136
Abstract
RPC (Remote Procedure Call)-based Graphics Processing Unit (GPU) virtualization is one of the technologies for sharing a GPU among multiple user virtual machines. However, in a cloud environment, general GPUs, unlike CPUs or memory, do not provide a resource isolation mechanism that can limit the resource usage of each virtual machine. In particular, in an RPC-based virtualization environment, the GPU tasks issued by the virtual machines run as multiple processes, so the lack of resource isolation causes performance degradation due to resource contention. In addition, contention for GPU memory worsens this degradation as the resource demand of the virtual machines increases, and fairness suffers because equal performance across virtual machines cannot be guaranteed. This paper analyzes the performance degradation caused by resource contention in an RPC-based GPU virtualization environment when the GPU memory requirement of the virtual machines exceeds the available GPU memory capacity, and proposes a GPU memory management technique to solve this problem. Experiments show that the proposed GPU memory management technique can improve the performance of GPGPU tasks.
Keywords
GPU Virtualization; GPU Memory; Resource Management; Cloud Computing; HPC Cloud