Dimension Reduction Methods on High Dimensional Streaming Data with Concept Drift

  • Cheong Hee Park (Department of Computer Engineering, Chungnam National University)
  • Received : 2016.01.08
  • Accepted : 2016.04.06
  • Published : 2016.08.31

Abstract

Dimension reduction for high-dimensional data has been widely studied, but research on dimension reduction methods applicable to high-dimensional streaming data with concept drift remains limited. In this paper, we review incremental dimension reduction methods that can be applied to streaming data and propose a way to apply dimension reduction efficiently so as to improve classification performance on high-dimensional streaming data with concept drift.
