http://dx.doi.org/10.5392/JKCA.2016.16.02.163

In-Memory Based Incremental Processing Method for Stream Query Processing in Big Data Environments  

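Bok, Kyoungsoo (Department of Information and Communication Engineering, Chungbuk National University)
Yook, Misun (Department of Information and Communication Engineering, Chungbuk National University)
Noh, Yeonwoo (Department of Information and Communication Engineering, Chungbuk National University)
Han, Jieun (Department of Information and Communication Engineering, Chungbuk National University)
Kim, Yeonwoo (Department of Information and Communication Engineering, Chungbuk National University)
Lim, Jongtae (Department of Information and Communication Engineering, Chungbuk National University)
Yoo, Jaesoo (Department of Information and Communication Engineering, Chungbuk National University)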
Abstract
Recently, the distributed processing of massive amounts of stream data has been widely studied. In this paper, we propose an in-memory-based incremental stream data processing method for big data environments. The proposed method stores input data in a temporary queue and compares it with the data recorded in a master node. If the data is found in the master node, the proposed method reuses the previous processing results located in the node chosen by the master node. If no previous results for the data exist in that node, the proposed method processes the data and stores the result in a separate node. We also propose a job scheduling technique that considers the load and performance of each node. To demonstrate the superiority of the proposed method, we compare it with the existing method in terms of query processing time. The experimental results show that the proposed method outperforms the existing method in query processing time.
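The reuse-or-compute flow described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: the class names `Worker` and `Master`, the `cost` scoring rule (load divided by performance), and the use of the record itself as the lookup key are all assumptions made for illustration only.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, Hashable


@dataclass
class Worker:
    """Hypothetical worker node holding results in memory."""
    name: str
    performance: float            # assumed relative speed; higher is faster
    load: int = 0                 # number of records computed on this worker
    results: Dict[Hashable, object] = field(default_factory=dict)

    def cost(self) -> float:
        # Assumed scheduling score: lower is better.
        return (self.load + 1) / self.performance


class Master:
    """Hypothetical master node that tracks where each result is stored."""

    def __init__(self, workers, compute):
        self.workers = list(workers)
        self.compute = compute    # function applied to records not seen before
        self.index = {}           # record -> worker holding its result
        self.queue = deque()      # temporary queue for incoming stream data
        self.reused = 0
        self.computed = 0

    def submit(self, record):
        self.queue.append(record)

    def run(self):
        out = []
        while self.queue:
            record = self.queue.popleft()
            worker = self.index.get(record)
            if worker is not None:
                # Known record: reuse the stored result.
                out.append(worker.results[record])
                self.reused += 1
            else:
                # New record: pick the cheapest worker, compute, and store.
                worker = min(self.workers, key=Worker.cost)
                result = self.compute(record)
                worker.results[record] = result
                worker.load += 1
                self.index[record] = worker
                out.append(result)
                self.computed += 1
        return out


master = Master([Worker("w1", 1.0), Worker("w2", 2.0)], compute=lambda x: x * x)
for r in [3, 4, 3, 5, 4]:
    master.submit(r)
results = master.run()
print(results)                          # [9, 16, 9, 25, 16]
print(master.reused, master.computed)   # 2 3
```

In this toy run, the repeated records 3 and 4 are answered from stored results instead of being recomputed, which is the effect the abstract attributes to incremental processing. A real deployment would distribute the workers across machines and would need its own policy for result expiry, which the abstract does not describe.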
Keywords
Big Data; In-memory; Distributed Processing; Real-time Processing; Streaming Data