http://dx.doi.org/10.30693/SMJ.2019.8.4.46

Data Transmitting and Storing Scheme based on Bandwidth in Hadoop Cluster  

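Kim, Youngmin (Department of Computer Science, Soongsil University)
Kim, Heejin (Department of Computer Science, Soongsil University)
Kim, Younggwan (Department of Computer Science, Soongsil University)
Hong, Jiman (School of Computer Science, Soongsil University)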
Publication Information
Smart Media Journal, v.8, no.4, 2019, pp. 46-52
Abstract
The volume of data generated and collected at industrial sites and public institutions is growing rapidly. Conventional data processing servers usually cope with this growth by scaling up, that is, by increasing the performance of a single server. In the big data era, however, data is generated at an explosive rate, and there is a limit to how much a conventional server can process. To overcome this limitation, distributed cluster computing systems have been introduced that spread data across nodes in a scale-out manner. Because such systems move data between nodes, inefficient use of network bandwidth can degrade the performance of the cluster as a whole. In this paper, we propose a scheme that compresses data before transmitting it within a Hadoop cluster, taking network bandwidth into account. The proposed scheme considers both the available network bandwidth and the characteristics of each compression algorithm, and selects the optimal compressed-transmission method before transmission begins. Experimental results show that the proposed scheme reduces both data transfer time and transferred data size.
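The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify; it assumes a simple sequential cost model (compress, then send, then decompress, with no pipelining) and picks whichever option has the lowest estimated end-to-end time. The names `Codec`, `transfer_time`, and `choose_codec`, and all throughput and ratio figures, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Codec:
    """Hypothetical compressor profile; the numbers are illustrative, not measured."""
    name: str
    compress_mb_s: float    # compression throughput (MB/s)
    decompress_mb_s: float  # decompression throughput (MB/s)
    ratio: float            # original size / compressed size


def transfer_time(size_mb: float, bandwidth_mb_s: float,
                  codec: Optional[Codec] = None) -> float:
    """Estimated seconds to move size_mb, assuming the stages run one after another."""
    if codec is None:
        # No compression: the cost is the raw transfer only.
        return size_mb / bandwidth_mb_s
    return (size_mb / codec.compress_mb_s
            + (size_mb / codec.ratio) / bandwidth_mb_s
            + size_mb / codec.decompress_mb_s)


def choose_codec(size_mb: float, bandwidth_mb_s: float,
                 codecs: List[Codec]) -> Tuple[Optional[Codec], float]:
    """Return the option with the lowest estimated time; None means send uncompressed."""
    best: Optional[Codec] = None
    best_time = transfer_time(size_mb, bandwidth_mb_s)
    for codec in codecs:
        t = transfer_time(size_mb, bandwidth_mb_s, codec)
        if t < best_time:
            best, best_time = codec, t
    return best, best_time


# Two made-up profiles: a fast, light compressor and a slow, dense one.
CODECS = [
    Codec("fast", compress_mb_s=500.0, decompress_mb_s=2000.0, ratio=2.0),
    Codec("dense", compress_mb_s=50.0, decompress_mb_s=400.0, ratio=4.0),
]

if __name__ == "__main__":
    for bw in (10.0, 100.0, 1000.0):
        codec, seconds = choose_codec(1000.0, bw, CODECS)
        label = codec.name if codec else "uncompressed"
        print(f"{bw:>6.0f} MB/s -> {label} ({seconds:.1f} s)")
```

Under these assumed profiles, a slow link favours the denser compressor, a moderate link favours the faster one, and a very fast link favours sending the data uncompressed, which is the kind of trade-off a bandwidth-aware scheme has to weigh.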
Keywords
Hadoop; Hadoop Cluster; Network Bandwidth; Data Transmission; Compression