Proceedings of the Korea Information Processing Society Conference (한국정보처리학회:학술대회논문집)
- 2013.05a
- /
- Pages.476-477
- /
- 2013
- /
- 2005-0011(pISSN)
- /
- 2671-7298(eISSN)
DOI QR Code
Sim-Hadoop : Leveraging Hadoop Distributed File System and Parallel I/O for Reliable and Efficient N-body Simulations
Sim-Hadoop : 신뢰성 있고 효율적인 N-body 시뮬레이션을 위한 Hadoop 분산 파일 시스템과 병렬 I / O
- Awan, Ammar Ahmad (Dept. of Computer Engineering, Kyung Hee University) ;
- Lee, Sungyoung (Dept. of Computer Engineering, Kyung Hee University) ;
- Chung, Tae Choong (Dept. of Computer Engineering, Kyung Hee University)
- Published : 2013.05.10
Abstract
Gadget-2 is a scientific simulation code has been used for many different types of simulations like, Colliding Galaxies, Cluster Formation and the popular Millennium Simulation. The code is parallelized with Message Passing Interface (MPI) and is written in C language. There is also a Java adaptation of the original code written using MPJ Express called Java Gadget. Java Gadget writes a lot of checkpoint data which may or may not use the HDF-5 file format. Since, HDF-5 is MPI-IO compliant, we can use our MPJ-IO library to perform parallel reading and writing of the checkpoint files and improve I/O performance. Additionally, to add reliability to the code execution, we propose the usage of Hadoop Distributed File System (HDFS) for writing the intermediate (checkpoint files) and final data (output files). The current code writes and reads the input, output and checkpoint files sequentially which can easily become bottleneck for large scale simulations. In this paper, we propose Sim-Hadoop, a framework to leverage HDFS and MPJ-IO for improving the I/O performance of Java Gadget code.
Keywords