http://dx.doi.org/10.3745/KTSDE.2015.4.2.77

Analysis of the Influence Factors of Data Loading Performance Using Apache Sqoop  

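Chen, Liu (Department of Computer Engineering, Pukyong National University)
Ko, Junghyun (Department of Computer Engineering, Pukyong National University)
Yeo, Jeongmo (Department of Computer Engineering, Pukyong National University)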
Publication Information
KIPS Transactions on Software and Data Engineering, v.4, no.2, 2015, pp. 77-82
Abstract
Big Data technology has attracted much attention for its fast data processing. Research is also under way on applying Big Data technology to process large-scale structured data held in relational databases (RDBs) much faster. Although many studies measure analysis performance, studies of structured data loading performance, the step that precedes analysis, are very rare. In this study, we therefore test the performance of loading structured data from an RDB into the distributed processing platform Hadoop using Apache Sqoop. To analyze the factors that influence data loading, we repeat the tests with different loading options and compare the results with data loading performance between RDB-based servers. Although the data loading performance of Apache Sqoop was low in our test environment, much better performance can be expected in a large-scale Hadoop cluster, where more hardware resources are available. We expect this work to serve as a basis for studies on improving data loading performance and on analyzing the performance of every step of structured data processing on the Hadoop platform.
Keywords
Apache Sqoop; Hadoop; Relational Database; Data Loading Performance
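
The repeated tests with different loading options described in the abstract could be organized roughly as in the sketch below. It is only an illustration under stated assumptions: the abstract does not say which options were varied, so the choice of `--num-mappers`, `--split-by`, and `--direct` is ours, and the JDBC URL, table name, column name, and HDFS path are hypothetical placeholders. The helper names (`build_sqoop_import`, `option_grid`, `time_run`) are also invented for this example, and credentials are left out for brevity.

```python
import shlex
import subprocess
import time
from itertools import product


def build_sqoop_import(jdbc_url, table, target_dir, num_mappers,
                       split_by=None, direct=False):
    """Return the argument list for one `sqoop import` run."""
    cmd = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
        "--delete-target-dir",              # clear the directory so runs can repeat
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]
    if split_by:
        cmd += ["--split-by", split_by]     # column used to partition the work
    if direct:
        cmd.append("--direct")              # vendor-specific fast path, if supported
    return cmd


def option_grid(mapper_counts, direct_modes):
    """Enumerate every combination of the options under test."""
    return [
        {"num_mappers": m, "direct": d}
        for m, d in product(mapper_counts, direct_modes)
    ]


def time_run(cmd, runner=subprocess.run):
    """Run one command and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    runner(cmd, check=True)
    return time.perf_counter() - start


# Print the commands that would be timed; nothing is executed here.
# All connection details below are hypothetical placeholders.
for opts in option_grid([1, 2, 4, 8], [False, True]):
    cmd = build_sqoop_import(
        "jdbc:mysql://db.example.com/testdb",  # assumed source database
        "orders",                              # assumed table name
        "/user/test/orders",                   # assumed HDFS target directory
        split_by="order_id",                   # assumed numeric key column
        **opts,
    )
    print(shlex.join(cmd))
```

To collect measurements, each command would be passed to `time_run` several times and the durations averaged per option combination, which makes it possible to see which option has the largest effect on loading time.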