Concept Drift Based on CNN Probability Vector in Data Stream Environment

  • Kim, Tae Yeun (National Program of Excellence in Software center, Chosun University)
  • Bae, Sang Hyun (Department of Computer Science & Statistics, Chosun University)
  • Received : 2020.10.02
  • Accepted : 2020.11.28
  • Published : 2020.12.31

Abstract

In this paper, we propose a method for detecting concept drift in a data stream environment using a Convolutional Neural Network (CNN). Conventional methods compare only the final output value of the CNN and report a concept drift whenever that value differs. As a result, they react sensitively to the input values of the data stream and incorrectly report a concept drift even when there is no significant difference. To reduce such errors, the proposed method uses not only the output value of the CNN but also its probability vector. First, the data entering the data stream is converted into patterns and learned by the neural network model. Then, the output values and probability vectors that the trained model produces for the current data and for the historical data are compared to detect concept drift. The proposed method was confirmed to reduce detection errors compared with detecting concept drift from the CNN output values alone.
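
The contrast between the two detection rules described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how the probability vectors are compared, so the L1 distance, the `threshold` value, the function names, and the example vectors are all assumptions made for illustration. The vectors stand in for the softmax outputs of a trained CNN.

```python
def predicted_class(probs):
    """Return the index of the most probable class (the CNN 'output value')."""
    return max(range(len(probs)), key=lambda i: probs[i])


def l1_distance(p, q):
    """Sum of absolute differences between two probability vectors.

    Assumption: the abstract does not name a distance measure; L1 is used
    here only because it is simple to read.
    """
    return sum(abs(a - b) for a, b in zip(p, q))


def drift_output_only(hist_probs, curr_probs):
    """Conventional rule: report drift whenever the predicted class changes."""
    return predicted_class(hist_probs) != predicted_class(curr_probs)


def drift_with_probability_vector(hist_probs, curr_probs, threshold=0.5):
    """Assumed combined rule: report drift only if the predicted class changes
    AND the probability vectors differ by more than `threshold` (hypothetical).
    """
    if predicted_class(hist_probs) == predicted_class(curr_probs):
        return False
    return l1_distance(hist_probs, curr_probs) > threshold


# Hypothetical probability vectors for a two-class problem.
historical = [0.51, 0.49]
near_boundary = [0.49, 0.51]  # class flips, but the distribution barely moved
shifted = [0.05, 0.95]        # class flips and the distribution moved a lot

print(drift_output_only(historical, near_boundary))
print(drift_with_probability_vector(historical, near_boundary))
print(drift_with_probability_vector(historical, shifted))
```

In this sketch the output-only rule flags `near_boundary` as drift because the predicted class flips, while the combined rule does not, since the two probability vectors are almost identical. That is the kind of false detection the abstract says the probability vector is meant to suppress.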
