1 |
Docker, Docker CLI [Internet], https://docs.docker.com/engine/reference/commandline/pause/.
|
2 |
NVIDIA, Multi-Process Service (MPS) [Internet], https://docs.nvidia.com/deploy/mps/index.html.
|
3 |
H. H. Chen, E. T. Lin, Y. M. Chou, and J. Chou, "Gemini: Enabling multi-tenant GPU sharing based on kernel burst estimation," IEEE Transactions on Cloud Computing (Early Access), 2021.
|
4 |
W. Xiao, S. Ren, Y. Li, Y. Zhang, P. Hou, Z. Li, Y. Feng, W. Lin, and Y. Jia, "AntMan: Dynamic scaling on GPU clusters for deep learning," In Proceedings of the 14th USENIX Conference on Operating Systems Design and Implementation, pp.533-548, 2020.
|
5 |
Google Brain, TensorFlow [Internet], https://www.tensorflow.org/.
|
6 |
Facebook AI Research, PyTorch [Internet], https://pytorch.org/.
|
7 |
T. A. Yeh, H. H. Chen, and J. Chou, "KubeShare: A framework to manage GPUs as first-class and shared resources in container cloud," In Proceedings of the 29th International Symposium on High-Performance Parallel and Distributed Computing, pp.173-184, 2020.
|
8 |
NVIDIA, Compute Unified Device Architecture (CUDA) [Internet], https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html.
|
9 |
NVIDIA, NVIDIA Docker Wiki [Internet], https://github.com/NVIDIA/nvidia-docker/wiki.
|
10 |
Docker, Docker [Internet], https://www.docker.com/.
|
11 |
NVIDIA, CUDA C++ Programming Guide [Internet], https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html.
|
12 |
NVIDIA, NVIDIA Docker [Internet], https://github.com/NVIDIA/nvidia-docker.
|
13 |
Docker, docker ps [Internet], https://docs.docker.com/engine/reference/commandline/ps/.
|
14 |
Docker, docker top [Internet], https://docs.docker.com/engine/reference/commandline/top/.
|
15 |
P. Thinakaran, J. R. Gunasekaran, B. Sharma, M. T. Kandemir, and C. R. Das, "Kube-Knots: Resource harvesting through dynamic container orchestration in GPU-based datacenters," In 2019 IEEE International Conference on Cluster Computing (CLUSTER), pp.1-13, 2019.
|
16 |
Q. Chen, J. Oh, S. Kim, and Y. Kim, "Design of an adaptive GPU sharing and scheduling scheme in container-based cluster," Cluster Computing, Vol.23, No.3, pp.2179-2191, 2020.
|
17 |
M. Lee, H. Ahn, C. H. Hong, and D. S. Nikolopoulos, "gShare: A centralized GPU memory management framework to enable GPU memory sharing for containers," Future Generation Computer Systems, Vol.130, pp.181-192, 2022.
|
18 |
J. Gu, S. Song, Y. Li, and H. Luo, "GaiaGPU: sharing GPUs in container clouds," In 2018 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Ubiquitous Computing & Communications, Big Data & Cloud Computing, Social Computing & Networking, Sustainable Computing & Communications (ISPA/IUCC/BDCloud/SocialCom/SustainCom), pp.469-476, 2018.
|
19 |
Linux Foundation, Kubernetes [Internet], https://kubernetes.io/ko/.
|
20 |
Y. Peng, Y. Bao, Y. Chen, C. Wu, and C. Guo, "Optimus: an efficient dynamic resource scheduler for deep learning clusters," In Proceedings of the Thirteenth EuroSys Conference, pp.1-14, 2018.
|
21 |
Z. Liu, C. Chen, J. Li, Y. Cheng, Y. Kou, and D. Zhang, "KubFBS: A fine-grained and balance-aware scheduling system for deep learning tasks based on Kubernetes," Concurrency and Computation: Practice and Experience, Vol.34, No.11, pp.e6836, 2022.
|
22 |
NVIDIA, NVIDIA System Management Interface [Internet], https://developer.nvidia.com/nvidia-system-management-interface.
|
23 |
J. Shao, J. Ma, Y. Li, B. An, and D. Cao, "GPU scheduling for short tasks in private cloud," In 2019 IEEE International Conference on Service-Oriented System Engineering (SOSE), pp.215-2155, 2019.
|