Provisioning Scheme of Large Volume File for Efficient Job Execution in Grid Environment  

Kim, Eun-Sung (School of Computer Science and Engineering, Seoul National University)
Yeom, Heon-Y. (School of Computer Science and Engineering, Seoul National University)
Abstract
Staging is the technique used to provide the files a job needs in the Grid. When a staged file is large, the job's start time is delayed and the job throughput of the Grid may decrease, so removing staging overhead helps the Grid operate more efficiently. In this paper, we present two methods for efficient file provisioning that remove this overhead. First, we propose RA-RFT, which extends the RFT of the Globus Toolkit so that it can use the replica information held in RLS. RA-RFT reduces file transfer time by fetching a different part of the file from each replica in parallel. Second, we propose Remote Link, which uses remote I/O instead of file transfer. Remote Link saves storage on computational nodes and enables fast file provisioning via prefetching. Through various experiments, we argue that both methods have an advantage over existing staging techniques.
Keywords
Grid; File provisioning; Replica; Remote I/O; Staging
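
The idea of partial transfer from several replicas in parallel, as described in the abstract, can be illustrated with a minimal sketch. This is not the authors' RA-RFT implementation and it uses no RFT, RLS, or GridFTP calls. The names `split_ranges`, `fetch_range`, and `parallel_fetch` are hypothetical, replicas are modelled as in-memory byte strings, and the equal-size split is an assumption, since the abstract does not say how the parts are sized.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size, n_parts):
    """Split [0, size) into n_parts contiguous (offset, length) ranges."""
    if n_parts <= 0:
        raise ValueError("at least one replica is required")
    base, extra = divmod(size, n_parts)
    ranges, offset = [], 0
    for i in range(n_parts):
        # The first `extra` ranges take one additional byte each.
        length = base + (1 if i < extra else 0)
        ranges.append((offset, length))
        offset += length
    return ranges


def fetch_range(replica, offset, length):
    """Hypothetical stand-in for a partial transfer from one replica.

    A real system would issue a ranged read against a remote server;
    here the replica is simply a bytes object.
    """
    return replica[offset:offset + length]


def parallel_fetch(replicas, size):
    """Fetch one byte range per replica concurrently and reassemble the file."""
    ranges = split_ranges(size, len(replicas))
    jobs = [(rep, off, ln) for rep, (off, ln) in zip(replicas, ranges)]
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        # map() yields results in submission order, so the parts join correctly.
        parts = pool.map(lambda job: fetch_range(*job), jobs)
        return b"".join(parts)


if __name__ == "__main__":
    data = bytes(range(256)) * 10
    replicas = [data, data, data]  # three identical copies of the file
    print(parallel_fetch(replicas, len(data)) == data)
```

With equal ranges the slowest replica bounds the total time, so a practical scheme would presumably size each range according to the measured bandwidth of its replica; that refinement is left out here.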