http://dx.doi.org/10.9708/jksci.2010.15.5.049

A Performance Analysis Based on Hadoop Application's Characteristics in Cloud Computing  

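Keum, Tae-Hoon (Dept. of Computer Engineering, Hanyang University)
Lee, Won-Joo (Dept. of Computer Information, Inha Technical College)
Jeon, Chang-Ho (School of Electronic and Computer Engineering, Hanyang University)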
Abstract
In this paper, we implement a Hadoop-based cluster for cloud computing and evaluate its performance with respect to application characteristics by running three applications: RandomTextWriter, WordCount, and PI. RandomTextWriter generates a given amount of random words and stores them in HDFS (Hadoop Distributed File System). WordCount reads an input file and counts the frequency of each word, one block at a time. PI estimates the value of π using the Monte Carlo method. In the simulation, we investigate the effect of the data block size and the number of replications on application execution time. The results confirm that the execution time of RandomTextWriter is proportional to the number of replications, whereas the execution times of WordCount and PI are not affected by the number of replications. Moreover, the execution time of WordCount is shortest when the block size is between 64 MB and 256 MB. These results show that the performance of a cloud computing system can be improved by a scheduling scheme that takes application characteristics into account.
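As a rough illustration of the two compute-oriented workloads named in the abstract, the following single-machine Python sketch estimates π by Monte Carlo sampling and counts words block by block before merging the partial counts. It is not the Hadoop example code and is not taken from the paper; the function names `estimate_pi` and `word_count`, the use of pseudo-random sampling, and the definition of a block as a fixed number of words are assumptions made here for illustration only.

```python
import math
import random
from collections import Counter


def estimate_pi(num_samples, seed=0):
    """Estimate pi from the fraction of random points that fall
    inside the quarter circle of radius 1 (hypothetical helper)."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # Area of quarter circle / area of unit square = pi / 4.
    return 4.0 * inside / num_samples


def word_count(words, block_size):
    """Count words one block at a time (map step), then merge the
    per-block counts into a total (reduce step).
    Assumption: a 'block' is simply block_size consecutive words."""
    total = Counter()
    for start in range(0, len(words), block_size):
        block = words[start:start + block_size]
        total.update(Counter(block))
    return total


if __name__ == "__main__":
    print(estimate_pi(200_000))
    print(word_count("a b a c b a".split(), block_size=2))
```

Neither function touches a distributed file system, so the sketch says nothing about replication or HDFS block size effects; it only shows the kind of computation each application performs.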
Keywords
Cloud Computing; HDFS (Hadoop Distributed File System); Cluster; Hadoop