GPGPU Task Management Technique to Mitigate Performance Degradation of Virtual Machines due to GPU Operation in Cloud Environments

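  • Jihun Kang (강지훈) (Research Institute for Information and Creativity Education, Korea University)
  • Joon-Min Gil (길준민) (School of Computer Software, Daegu Catholic University)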
  • Received : 2020.07.06
  • Accepted : 2020.07.23
  • Published : 2020.09.30

Abstract

GPU cloud computing, in which GPU (Graphics Processing Unit) devices are assigned to virtual machines, is now widely used in cloud environments. A GPU assigned to a virtual machine can perform operations faster than a CPU through massively parallel processing, which benefits high-performance computing services in a variety of fields. However, although a GPU can improve the performance of the virtual machine that uses it, the virtual machine scheduler operates on the CPU usage time of each virtual machine and does not take GPU usage time into account, so the GPU work of one virtual machine affects the performance of other virtual machines. In this paper, we test and analyze the performance degradation that a virtual machine performing GPGPU (General-Purpose computing on Graphics Processing Units) tasks causes in other virtual machines in a direct pass-through based GPU virtualization environment, which is often used to assign GPUs to virtual machines in cloud environments. To solve this problem, we then propose a GPGPU task management method for virtual machines.
