• Title/Summary/Keyword: HTCondor

Dynamic Resource Scheduling for HTCondor Cluster (HTCondor 클러스터를 위한 동적 자원 스케줄링)

  • Lee, Jungha;Yeom, Jaekeun;Jeong, Ki-Moon;Cho, Hyeyoung;Jung, Daeyong
    • Proceedings of the Korea Information Processing Society Conference / 2015.10a / pp.250-252 / 2015
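  • Big data, which is actively studied in many fields, and deep learning, a recent hot topic, are attracting growing interest not only within computer engineering but also through their application to many other disciplines. Large-scale clusters can quickly process computation-intensive workloads such as big data and deep learning. However, frequent idle periods greatly lower the utilization of a large-scale cluster. In this paper, we propose a dynamic resource scheduling technique for HTCondor clusters that improves job execution time and cluster utilization efficiency. To allocate resources dynamically, we built the HTCondor cluster environment from virtual machines and used OpenStack to manage them. In this OpenStack-based HTCondor cluster environment, we implemented the proposed dynamic resource scheduling technique using the HTCondor Python API and the OpenStack Python API, and we confirmed its performance and feasibility through experiments.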

Container-based Cluster Management System for User-driven Distributed Computing (사용자 맞춤형 분산 컴퓨팅을 위한 컨테이너 기반 클러스터 관리 시스템)

  • Park, Ju-Won;Hahm, Jaegyoon
    • KIISE Transactions on Computing Practices / v.21 no.9 / pp.587-595 / 2015
  • Several fields of science have traditionally demanded large-scale workflow support that requires thousands of central processing unit (CPU) cores. To support such large-scale scientific workflows, large-capacity cluster systems such as supercomputers are widely used. However, because users require diverse software packages and configurations, system administrators have difficulty providing a suitable service environment in real time. In this paper, we present a container-based cluster management platform and an implementation case that dynamically provides the distributed computing environment users want while minimizing performance loss. This paper makes the following contributions. First, container-based virtualization technology is integrated with a resource and job management system to broaden its applicability to large-scale scientific workflows. Second, an implementation case that integrates Docker with HTCondor is introduced. Lastly, performance comparisons between Docker and native execution are presented, using two widely known benchmark tools and a Monte Carlo simulation implemented in various programming languages.

A Workflow Execution System for Analyzing Large-scale Astronomy Data on Virtualized Computing Environments

  • Yu, Jung-Lok;Jin, Du-Seok;Yeo, Il-Yeon;Yoon, Hee-Jun
    • International Journal of Contents / v.16 no.4 / pp.16-25 / 2020
  • The size of observation data in astronomy has been increasing exponentially with the advent of wide-field optical telescopes. This calls for changes in the way large-scale astronomy data are analyzed. The complexity of analysis tools and the limited extensibility of computing environments, however, make handling the huge volume of observation data difficult and inefficient. To address this problem, this paper proposes a workflow execution system for analyzing large-scale astronomy data efficiently. The proposed system is composed of two parts: 1) a workflow execution manager and its RESTful endpoints, which automate and control data analysis tasks based on workflow templates, and 2) an elastic resource manager, an underlying mechanism that dynamically adds or removes virtualized computing resources (i.e., virtual machines) according to the analysis requests. To realize our workflow execution system, we implement it on a testbed using the OpenStack IaaS (Infrastructure as a Service) toolkit and the HTCondor workload manager. We also perform a broad range of experiments with different resource allocation patterns, system loads, and other factors to show the effectiveness of the proposed system. The results show that the resource allocation mechanism works properly according to the number of queued and running tasks, which improves resource utilization, and that the workflow execution manager can handle more than 1,000 concurrent requests within a second with reasonable average response times. Finally, we describe a case study of a data reduction system as an example application of our workflow execution system.
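
The abstract above says that virtual machines are added or removed according to the number of queued and running tasks. A minimal sketch of what such a scaling decision could look like is given below. It is not the authors' algorithm: the `PoolState` type, the `vm_delta` function, and the `tasks_per_vm`, `min_vms`, and `max_vms` parameters are all hypothetical names and values chosen for illustration, and the sketch makes no calls to HTCondor or OpenStack.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolState:
    """Hypothetical snapshot of a worker pool (not an HTCondor type)."""
    queued_tasks: int    # tasks waiting for a slot
    running_tasks: int   # tasks currently executing
    active_vms: int      # worker VMs currently in the pool


def vm_delta(state, tasks_per_vm=4, min_vms=1, max_vms=16):
    """Return how many VMs to add (positive) or remove (negative).

    Assumed policy: size the pool so that every queued or running task
    has a slot, clamped to the range [min_vms, max_vms].
    """
    demand = state.queued_tasks + state.running_tasks
    wanted = math.ceil(demand / tasks_per_vm)
    target = max(min_vms, min(max_vms, wanted))
    return target - state.active_vms


if __name__ == "__main__":
    # 16 tasks at 4 tasks per VM need 4 VMs; 2 are active, so add 2.
    print(vm_delta(PoolState(queued_tasks=10, running_tasks=6, active_vms=2)))
```

In a real deployment the inputs would come from the workload manager's queue and the result would drive the IaaS layer; keeping the decision as a pure function, as assumed here, makes the policy easy to test separately from those systems.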