Container-based Cluster Management System for User-driven Distributed Computing

Korean title: 사용자 맞춤형 분산 컴퓨팅을 위한 컨테이너 기반 클러스터 관리 시스템

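  • 박주원 (Supercomputing Convergence Research Center, Korea Institute of Science and Technology Information)
  • 함재균 (Supercomputing Service Center, Korea Institute of Science and Technology Information)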
  • Received : 2015.04.07
  • Accepted : 2015.06.10
  • Published : 2015.09.15

Abstract

Several fields of science have traditionally demanded large-scale workflow support, which requires thousands of central processing unit (CPU) cores. To support such large-scale scientific workflows, large-capacity cluster systems such as supercomputers are widely used. However, because users require diverse software packages and configurations, system administrators have difficulty preparing a service environment in real time. In this paper, we present a container-based cluster management platform and introduce an implementation case that minimizes performance degradation and dynamically provides the distributed computing environment that users want. This paper makes the following contributions. First, container-based virtualization technology is integrated with a resource and job management system to extend its applicability to large-scale scientific workflows. Second, an implementation case in which Docker and HTCondor are integrated is introduced. Lastly, performance comparison results between Docker and native execution are presented, obtained with two widely known benchmark tools and with Monte Carlo simulations implemented in various programming languages.

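Traditionally, many fields of science, including high-energy physics, oceanography, meteorology, and astronomy and space science, have demanded support for large-scale workflows that use thousands of CPU cores or more, and most of them rely on large-capacity cluster-based systems such as supercomputers. These systems are shared by many users and institutions, and the users' diverse requirements make system operation and management difficult. In this paper, we present a container-based cluster management platform that minimizes the performance degradation caused by virtualization and dynamically provides the environment a user wants, and we introduce an implementation case. The paper makes three contributions. First, by integrating container-based virtualization with scheduler functions, it presents a way to configure and manage clusters that supports large-scale scientific workflows without significant performance loss. Second, it introduces a case in which the proposed approach was readily built using Docker and HTCondor. Third, it verifies Docker's performance using widely used benchmark tools and presents an example of scientific workflow support through Monte Carlo simulations implemented in several programming languages.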
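The abstract does not show the Monte Carlo code used in the comparison, so the following is only a minimal Python sketch of the kind of workload it describes: a simulation split into independent tasks whose results are combined afterwards. The pi-estimation problem, the function names `count_hits` and `estimate_pi`, and the per-task seeding scheme are all assumptions made for illustration and are not taken from the paper.

```python
import random


def count_hits(samples: int, seed: int) -> int:
    """Count random points in the unit square that fall inside the quarter circle."""
    rng = random.Random(seed)  # independent, reproducible stream per task (assumed scheme)
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(tasks: int, samples_per_task: int) -> float:
    """Combine the hit counts of independent tasks into one estimate of pi."""
    # Each call to count_hits stands in for one job; here they run sequentially.
    total_hits = sum(count_hits(samples_per_task, seed) for seed in range(tasks))
    return 4.0 * total_hits / (tasks * samples_per_task)


if __name__ == "__main__":
    print(estimate_pi(tasks=8, samples_per_task=50_000))
```

Because each task depends only on its own seed and sample count, the tasks could in principle be submitted as separate jobs to a scheduler and run either natively or inside a container, which is the kind of side-by-side timing the abstract refers to.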

Acknowledgement

Supported by: Korea Astronomy and Space Science Institute (KASI), KISTI
