• Title/Summary/Keyword: File provisioning

Provisioning Scheme of Large Volume File for Efficient Job Execution in Grid Environment (그리드 환경에서 효율적인 작업 처리를 위한 대용량 파일 프로비저닝 방안)

  • Kim, Eun-Sung; Yeom, Beon-Y.
    • Journal of KIISE:Computing Practices and Letters / v.15 no.8 / pp.525-533 / 2009
  • Staging is used to provide files for a job in the Grid. If a staged file is large, the start of the job is delayed and job throughput in the Grid may decrease. Removing the staging overhead therefore helps the Grid operate more efficiently. In this paper, we present two methods for efficient file provisioning that eliminate this overhead. First, we propose RA-RFT, which extends the RFT of the Globus Toolkit and enables it to use replica information from RLS. RA-RFT can reduce file transfer time by transferring a part of the file from each replica in parallel. Second, we suggest Remote Link, which uses remote I/O instead of file transfer. Remote Link saves storage on computational nodes and enables fast file provisioning via prefetching. Through various experiments, we argue that our two methods have an advantage over existing staging techniques.
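
The idea of fetching a different part of the file from each replica in parallel can be illustrated with a minimal sketch. This is not the RA-RFT implementation and does not use any Globus Toolkit, RFT, or RLS API. The names `plan_ranges` and `fetch_parallel` are hypothetical, and each replica is assumed to be a callable `read(start, end)` that returns the bytes in that range. At least one replica is assumed.

```python
from concurrent.futures import ThreadPoolExecutor


def plan_ranges(file_size, num_replicas):
    """Split [0, file_size) into contiguous byte ranges, one per replica."""
    base, extra = divmod(file_size, num_replicas)
    ranges, start = [], 0
    for i in range(num_replicas):
        # The first `extra` replicas each take one additional byte.
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges


def fetch_parallel(replicas, file_size):
    """Fetch one range from each replica concurrently and reassemble the file.

    `replicas` is a list of hypothetical callables read(start, end) -> bytes.
    """
    ranges = plan_ranges(file_size, len(replicas))
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        # pool.map yields results in input order, so the parts join correctly.
        parts = pool.map(lambda job: job[0](*job[1]), zip(replicas, ranges))
        return b"".join(parts)
```

An equal split is the simplest choice; a real scheduler would more likely size each range by the measured bandwidth of its replica, which this sketch does not attempt.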

A Study on the Design of Ambari Service for Lustre Parallel File System Auto Provisioning (Lustre 병렬파일시스템 오토 프로비저닝을 위한 Ambari 서비스 설계에 관한 연구)

  • Kwak, Jae-Hyuck; Kim, Sangwan; Byun, Eunkyu; Nam, Dukyun
    • Proceedings of the Korea Information Processing Society Conference / 2017.11a / pp.45-47 / 2017
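  • Hadoop is widely used as a representative big data processing framework, but in high-performance computing environments Hadoop applications can also run on the Lustre parallel file system rather than on the Hadoop Distributed File System. However, additionally building and managing a Lustre parallel file system for this purpose can be a time-consuming task. This study proposes a design for an Ambari service for auto provisioning of the Lustre parallel file system. Ambari is an operations management framework for provisioning, managing, and monitoring Hadoop clusters, and it provides a service framework that operators can extend as needed. In this study, we designed an extension service for auto provisioning and managing the Lustre parallel file system through Ambari, and we discuss the components of the service and the key functional requirements of each component.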

Data Processing Architecture for Cloud and Big Data Services in Terms of Cost Saving (비용절감 측면에서 클라우드, 빅데이터 서비스를 위한 대용량 데이터 처리 아키텍쳐)

  • Lee, Byoung-Yup; Park, Jae-Yeol; Yoo, Jae-Soo
    • The Journal of the Korea Contents Association / v.15 no.5 / pp.570-581 / 2015
  • In recent years, many institutions have predicted that cloud services and big data will be popular IT trends in the near future. A number of leading IT vendors are focusing on practical solutions and services for cloud and big data. In addition, the cloud offers unrestricted selection of resources for business models based on a variety of internet-based technologies, which is why provisioning and virtualization technologies for active resource expansion have been attracting attention as leading technologies. Big data took data prediction models to another level by providing the basis for analyzing unstructured data that could not be analyzed in the past. Since what cloud services and big data have in common is service and analysis based on massive amounts of data, efficient operation and design of mass data has become a critical issue from the early stages of development. Thus, in this paper, I establish a data processing architecture based on the technological requirements of mass data for cloud and big data services. In particular, I introduce the requirements that a distributed file system must meet to be used in cloud computing, the requirements for efficient compression of mass data for big data and cloud computing in terms of cost saving, and the technological requirements of open-source systems available in cloud computing, such as the Hadoop ecosystem's distributed file system and in-memory databases.