http://dx.doi.org/10.13089/JKIISC.2013.23.5.803

A Study on Security Improvement in Hadoop Distributed File System Based on Kerberos  

Park, So Hyeon (Graduate School of Information Security, Korea University)
Jeong, Ik Rae (Graduate School of Information Security, Korea University)
Abstract
With the spread of smart devices and social network services, the amount of data has been growing explosively, and the world is entering the Big Data era. As a result, Big Data processing technologies that can handle such data have attracted much attention. One of the most representative of these is Hadoop. The Hadoop Distributed File System (HDFS), designed to run on commodity Linux servers, is an open source framework that can store many terabytes of data. The initial version of Hadoop did not consider security because it focused only on efficient Big Data processing. As the number of users rapidly increased, a large amount of sensitive data, including personal information, came to be stored on HDFS. In 2009, Hadoop therefore announced a new version that introduced Kerberos and a token system. However, this system is vulnerable to replay attacks, impersonation attacks, and other attacks. In this paper, we analyze these vulnerabilities in HDFS security and propose a new protocol that addresses them while maintaining the performance of Hadoop.
Keywords
Hadoop distributed file system (HDFS); Hadoop; Big Data; Cloud computing; Authentication; Kerberos