Semi-supervised classification with LS-SVM formulation

  • Kyungha Seok (Department of Data Science, Inje University)
  • Received : 2010.04.08
  • Accepted : 2010.05.19
  • Published : 2010.05.31

Abstract

Semi-supervised classification, which uses both labeled and unlabeled data, has received considerable attention in recent years. Among the various approaches, graph-based manifold regularization has proved to be an attractive method. The least squares support vector machine (LS-SVM) is gaining popularity for analyzing nonlinear data. We propose a semi-supervised classification algorithm that uses the LS-SVM and is based on manifold regularization. In this paper we show that the proposed method can use unlabeled data efficiently.

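When the labeled data are not sufficient to build a classification rule, or when unlabeled data can help build one, semi-supervised classification, which uses both labeled and unlabeled data, is more effective. Among semi-supervised classification methods, graph-based manifold regularization has been developed and has recently been studied extensively. In this study we propose a method for applying the least squares support vector machine, which performs well in statistical learning, to semi-supervised classification. Simulation studies showed that the proposed method makes good use of unlabeled data.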
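
The idea behind graph-based manifold regularization with a least-squares loss can be sketched in a few lines. The example below is an illustration only and is not the algorithm proposed in this paper: it implements Laplacian regularized least squares, a standard least-squares instance of manifold regularization, and omits the bias term and equality constraints of an LS-SVM. The function names, the dense RBF similarity graph, and the parameters `gamma_a`, `gamma_i`, and `sigma` are assumptions made for the example.

```python
import numpy as np


def rbf_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))


def fit_laprls(X_lab, y_lab, X_unl, gamma_a=1e-2, gamma_i=1e-1, sigma=1.0):
    """Laplacian regularized least squares (illustrative, no bias term).

    Labels in y_lab are assumed to be coded as -1 / +1.
    Returns the stacked training points and the expansion coefficients.
    """
    X = np.vstack([X_lab, X_unl])
    l, n = len(X_lab), len(X)

    K = rbf_kernel(X, X, sigma)

    # Assumption: a dense graph whose edge weights are the RBF similarities.
    W = K.copy()
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W  # unnormalized graph Laplacian

    # J selects the labeled points; unlabeled targets are set to zero.
    J = np.diag(np.r_[np.ones(l), np.zeros(n - l)])
    y = np.r_[y_lab, np.zeros(n - l)]

    # Linear system from squared loss + RKHS penalty + graph smoothness penalty.
    A = J @ K + gamma_a * l * np.eye(n) + (gamma_i * l / n**2) * (L @ K)
    alpha = np.linalg.solve(A, y)
    return X, alpha


def predict(model, X_new, sigma=1.0):
    """Sign of the kernel expansion f(x) = sum_i alpha_i k(x_i, x)."""
    X, alpha = model
    return np.sign(rbf_kernel(X_new, X, sigma) @ alpha)
```

Calling `fit_laprls` with a handful of labeled points and a larger set of unlabeled points returns a model that `predict` can apply to new inputs; setting `gamma_i=0` removes the graph term and reduces the sketch to ordinary kernel regularized least squares on the labeled data, which makes the contribution of the unlabeled data easy to compare.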
