http://dx.doi.org/10.9708/jksci.2022.27.05.077

Unsupervised feature selection using orthogonal decomposition and low-rank approximation  

Lim, Hyunki (Div. of AI Computer Science and Engineering, Kyonggi University)
Abstract
In this paper, we propose a novel unsupervised feature selection method. Conventional unsupervised feature selection methods define virtual labels and use regression analysis to project the given data onto these labels. However, because the virtual labels are generated from the data, they can end up similar to one another in the space. As a result, conventional methods can select features only within a restricted space. To solve this problem, we select features using orthogonal projections and low-rank approximations. Specifically, the virtual labels are projected onto an orthogonal space, and the given data set is projected onto the same space, so that effective features can be selected. In addition, the projection matrix is constrained to be low-rank so that more effective features can be selected in a low-dimensional space. To achieve these objectives, we design a cost function and propose an efficient optimization method. Experimental results on six data sets demonstrate that the proposed method outperforms existing unsupervised feature selection methods in most cases.
Keywords
Feature selection; Unsupervised learning; Low-rank approximation; Orthogonal projection; Regularization
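
The general pipeline the abstract refers to (virtual labels, regression onto those labels, and a low-rank projection matrix) can be illustrated with a minimal NumPy sketch. This is not the paper's cost function or optimization method, which the abstract does not spell out. As stand-ins, it assumes spectral-embedding virtual labels built from a k-nearest-neighbor graph (their columns are orthonormal by construction), plain ridge regression, and a post-hoc SVD truncation of the projection matrix. The function names `virtual_labels`, `low_rank_regression`, and `rank_features`, and all parameter values, are hypothetical.

```python
import numpy as np


def virtual_labels(X, n_labels, n_neighbors=5):
    """Assumed stand-in: orthonormal virtual labels from a kNN-graph Laplacian."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances; self-distances are masked out.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)
    # Symmetric binary kNN affinity matrix.
    idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), n_neighbors), idx.ravel()] = 1.0
    A = np.maximum(A, A.T)
    # Unnormalized graph Laplacian; eigh returns orthonormal eigenvectors
    # sorted by ascending eigenvalue.
    L = np.diag(A.sum(axis=1)) - A
    _, vecs = np.linalg.eigh(L)
    # Skip the first eigenvector (smallest eigenvalue, constant on a connected graph).
    return vecs[:, 1:n_labels + 1]


def low_rank_regression(X, Y, rank, ridge=1e-2):
    """Assumed stand-in: ridge regression X W ~ Y, then truncate W to `rank`."""
    d = X.shape[1]
    W = np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ Y)
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]


def rank_features(W):
    """Order features by the Euclidean norm of their row in W, largest first."""
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(scores)[::-1]


# Toy usage on random data (illustrative only; not one of the paper's data sets).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
X = X - X.mean(axis=0)                  # center the features
Y = virtual_labels(X, n_labels=3)       # 60 x 3, orthonormal columns
W = low_rank_regression(X, Y, rank=2)   # 8 x 3 projection matrix of rank <= 2
order = rank_features(W)                # feature indices, most important first
```

Ranking features by the row norms of the projection matrix is a common convention in regression-based feature selection. The SVD truncation here only hints at the effect of a low-rank constraint; a method that builds the constraint into its cost function, as the abstract describes, would optimize the rank and the fit jointly rather than in two steps.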