
Design and Implementation of a Monitor for Hadoop Cluster  

Keum, Tae-Hoon (LG Electronics, Mobile Communications Company)
Lee, Won-Joo (Department of Computer Science, Inha Technical College)
Jeon, Chang-Ho (Department of Computer Science & Engineering, Hanyang University ERICA Campus)
Abstract
In this paper, we propose a new monitor that collects job information from Hadoop clusters in real time. The monitor consists of two programs, Collector and Agent. Agent collects node information and job information from the Hadoop cluster, and Collector analyzes the collected information and saves it in a database. Collector is placed on a separate node outside the Hadoop cluster so that it neither interferes with Hadoop's work nor adds load to the cluster. When the proposed monitor was implemented and applied to a testbed cluster, it detected dead nodes immediately. In addition, it allowed us to identify inefficient Hadoop jobs, which we then modified to further improve Hadoop's performance.
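The Agent/Collector split described above can be illustrated with a minimal sketch. The abstract does not specify the report format, the database schema, or how dead nodes are recognized, so everything below is an assumption made for illustration: the `AgentReport` fields, the `reports` table, the use of SQLite, and the rule that a node counts as dead once it has been silent for longer than `dead_after` seconds are all hypothetical and are not the authors' implementation.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class AgentReport:
    """One report from a hypothetical Agent running on a cluster node."""
    node: str           # node name (assumed field)
    timestamp: float    # time the report was produced, in seconds
    running_jobs: int   # number of jobs seen on the node (assumed field)


class Collector:
    """Hypothetical Collector that runs outside the cluster.

    It stores every report in a database and flags nodes that have
    stopped reporting. The timeout rule is an assumption.
    """

    def __init__(self, db_path=":memory:", dead_after=30.0):
        self.dead_after = dead_after  # assumed silence threshold in seconds
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS reports "
            "(node TEXT, timestamp REAL, running_jobs INTEGER)"
        )

    def receive(self, report):
        """Save one Agent report in the database."""
        self.db.execute(
            "INSERT INTO reports VALUES (?, ?, ?)",
            (report.node, report.timestamp, report.running_jobs),
        )
        self.db.commit()

    def dead_nodes(self, now):
        """Return nodes whose latest report is older than dead_after."""
        rows = self.db.execute(
            "SELECT node, MAX(timestamp) FROM reports GROUP BY node"
        ).fetchall()
        return sorted(n for n, last in rows if now - last > self.dead_after)


if __name__ == "__main__":
    collector = Collector()
    collector.receive(AgentReport("node1", 0.0, 2))
    collector.receive(AgentReport("node2", 0.0, 1))
    collector.receive(AgentReport("node1", 40.0, 2))
    print(collector.dead_nodes(now=50.0))  # node2 has been silent for 50 s
```

Keeping the storage and analysis in `Collector` mirrors the design choice stated in the abstract: the cluster nodes only produce reports, while the work of persisting and analyzing them happens on a machine outside the cluster.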
Keywords
Cloud computing; Cluster monitoring; Hadoop