http://dx.doi.org/10.5626/KTCP.2015.21.8.561

Implementation and Performance Analysis of Hadoop MapReduce over Lustre Filesystem  

Kwak, Jae-Hyuck (KISTI)
Kim, Sangwan (KISTI)
Huh, Taesang (KISTI)
Hwang, Soonwook (KISTI)
Publication Information
KIISE Transactions on Computing Practices / v.21, no.8, 2015, pp. 561-566
Abstract
Hadoop, an open-source distributed data processing framework, is being widely adopted in scientific and commercial fields. Recently, there have been attempts to apply high-performance computing technologies to Hadoop for real-time data processing and analysis. In this paper, we extend the Hadoop Filesystem library to support Lustre, a popular high-performance parallel distributed filesystem, and implement a Hadoop MapReduce execution environment over the Lustre filesystem. We evaluate Hadoop MapReduce over Lustre using standard Hadoop benchmark tools and find that it performs 2 to 13 times better than a typical Hadoop MapReduce execution.
Keywords
Lustre; Hadoop; MapReduce; high performance computing