The use of support vector machines in semi-supervised classification

  • Received : 2021.08.21
  • Accepted : 2021.10.28
  • Published : 2022.03.31

Abstract

Semi-supervised learning has gained significant attention in recent applications. In this article, we provide a selective overview of popular semi-supervised methods and then propose a simple but effective algorithm for semi-supervised classification using the support vector machine (SVM), one of the most popular binary classifiers in the machine learning community. The idea is simple: we first apply dimension reduction to the unlabeled observations and cluster them in the reduced space to assign labels. An SVM is then fitted to the combined set of labeled and newly labeled observations to construct a classification rule. The use of the SVM allows a natural extension to nonlinear classification via the kernel trick. Our numerical experiments under various scenarios demonstrate that the proposed method is promising for semi-supervised classification.
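The abstract fixes the pipeline (reduce, cluster, pseudo-label, fit an SVM) but not the particular dimension reduction, clustering method, or cluster-to-label rule. The following is a minimal sketch of that pipeline, assuming PCA for the reduction, k-means for the clustering, an RBF-kernel SVM, and a nearest-labeled-centroid rule for turning clusters into pseudo-labels; the function name `semi_supervised_svm` and all of these concrete choices are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.svm import SVC


def semi_supervised_svm(X_lab, y_lab, X_unlab, n_components=2, C=1.0):
    """Sketch: pseudo-label unlabeled data by clustering in a reduced
    space, then fit a kernel SVM on the combined sample.
    Inputs are NumPy arrays; all method choices below are assumptions."""
    classes = np.unique(y_lab)

    # Step 1 (assumed: PCA): reduce dimension, fit on the unlabeled data
    # as the abstract applies the reduction to the unlabeled observations.
    pca = PCA(n_components=n_components).fit(X_unlab)
    Z_unlab = pca.transform(X_unlab)
    Z_lab = pca.transform(X_lab)

    # Step 2 (assumed: k-means): one cluster per class in the reduced space.
    km = KMeans(n_clusters=len(classes), n_init=10, random_state=0).fit(Z_unlab)

    # Assign each cluster the label of the nearest labeled class centroid
    # (our own heuristic; two clusters may map to the same class).
    class_centroids = np.vstack([Z_lab[y_lab == c].mean(axis=0) for c in classes])
    y_pseudo = np.empty(len(X_unlab), dtype=y_lab.dtype)
    for k, center in enumerate(km.cluster_centers_):
        nearest = np.argmin(np.linalg.norm(class_centroids - center, axis=1))
        y_pseudo[km.labels_ == k] = classes[nearest]

    # Step 3: fit an SVM on labeled plus pseudo-labeled observations;
    # the RBF kernel gives the nonlinear variant via the kernel trick.
    X_all = np.vstack([X_lab, X_unlab])
    y_all = np.concatenate([y_lab, y_pseudo])
    return SVC(kernel="rbf", C=C).fit(X_all, y_all)
```

Called as `clf = semi_supervised_svm(X_lab, y_lab, X_unlab)`, the returned `SVC` object classifies new observations with `clf.predict(X_new)`; setting `kernel="linear"` instead recovers the linear classifier the abstract extends through the kernel trick.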

Acknowledgement

This work was partially funded by the National Research Foundation of Korea (NRF) grants 2018R1D1A1B07043034 and 2019R1A4A1028134, and by Korea University grant K2105791.
