http://dx.doi.org/10.5392/IJoC.2019.15.3.032

Design and Implementation of Incremental Learning Technology for Big Data Mining  

Min, Byung-Won (Division of Information and Communication Convergence Engineering, Mokwon University)
Oh, Yong-Sun (Division of Information and Communication Convergence Engineering, Mokwon University)
Abstract
Traditional mining techniques struggle to handle the Big Data generated by diverse digital media and sensors. As large volumes of text continuously accumulate, memory shortages and a heavier learning burden follow, because the entire data set, including data already collected and analyzed, is analyzed again each time. In this paper, we propose a general-purpose classifier and its structure to solve these problems. We depart from current feature-reduction methods and introduce a new scheme that adopts only the changed elements when new features are partially accumulated in this free-style learning environment. Whereas traditional methods re-learn the entire data set whenever data are added or changed, the incremental learning module, built up progressively, learns only the changed parts of the data without re-processing what has already been accumulated. In addition, users can freely merge new data with previous data through the resource management procedure whenever re-learning is needed. Our analysis confirms that, owing to its learning efficiency, the method performs well when processing data in a Big Data environment. In a comparison with Naive Bayes (NB) and support vector machine (SVM) classifiers, all three models achieve an accuracy of approximately 95%. We expect that our method, running in a PC cluster environment, will be a viable high-performance, high-accuracy substitute for large computing systems in Big Data analysis.
Keywords
Incremental Learning; Classifier; Classification Scheme; Big Data Mining; Re-learn; Feature(s)
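
The abstract's central idea, learning only the changed parts and merging new data with previous data on demand, can be illustrated with a minimal sketch. The abstract does not specify the proposed classifier's internals, so the sketch below substitutes a count-based multinomial Naive Bayes model, chosen only because its statistics are additive. The class name `IncrementalNB` and its methods `update`, `merge`, and `predict` are hypothetical names introduced for illustration and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


class IncrementalNB:
    """Illustrative stand-in (not the paper's classifier): multinomial
    Naive Bayes stored as raw counts so it can be updated in place."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # Laplace smoothing constant
        self.class_docs = Counter()              # documents seen per class
        self.term_counts = defaultdict(Counter)  # term frequencies per class
        self.vocab = set()                       # all terms seen so far

    def update(self, docs, labels):
        """Fold a new batch into the counts; earlier batches are not revisited."""
        for tokens, label in zip(docs, labels):
            self.class_docs[label] += 1
            self.term_counts[label].update(tokens)
            self.vocab.update(tokens)

    def merge(self, other):
        """Add the counts of a model that was trained separately."""
        self.class_docs.update(other.class_docs)
        for label, counts in other.term_counts.items():
            self.term_counts[label].update(counts)
        self.vocab |= other.vocab

    def predict(self, tokens):
        """Return the class with the highest smoothed log-probability."""
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            counts = self.term_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)
            for term in tokens:
                score += math.log((counts[term] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy data, invented for this example only.
model = IncrementalNB()
model.update([["cheap", "pills", "offer"], ["meeting", "agenda", "notes"]],
             ["spam", "ham"])
model.update([["offer", "discount", "cheap"]], ["spam"])  # only the new batch
print(model.predict(["cheap", "offer"]))
```

Because the model keeps only additive counts, updating with a new batch or merging two separately trained models yields the same counts as training once on all the data, which is the property that makes re-processing unnecessary in this sketch. Whether the paper's classifier relies on the same property is not stated in the abstract.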