Feature Selection via Embedded Learning Based on Tangent Space Alignment for Microarray Data

  • Ye, Xiucai (Department of Computer Science, University of Tsukuba)
  • Sakurai, Tetsuya (Department of Computer Science, University of Tsukuba)
  • Received : 2017.05.25
  • Accepted : 2017.11.25
  • Published : 2017.12.30

Abstract

Feature selection is well established as an efficient technique for microarray data analysis. It aims to identify the most important subset of features/genes in a given dataset according to their relevance to the target task. Unsupervised feature selection is considered particularly challenging due to the lack of label information. In this paper, we propose a novel unsupervised feature selection method that incorporates embedded learning and $l_{2,1}$-norm sparse regression into a single framework for selecting genes in microarray data analysis. Local tangent space alignment is applied during embedded learning to preserve the local data structure. The $l_{2,1}$-norm sparse regression acts as a constraint that learns the gene weights jointly across genes, so that the proposed method selects informative genes that better capture the natural classes of samples. We provide an effective algorithm to solve the resulting optimization problem. Finally, to validate its efficacy, we evaluate the proposed method on real microarray gene expression datasets. The experimental results demonstrate that the proposed method achieves promising performance.
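To make the framework described in the abstract concrete, the snippet below is a minimal sketch of the general idea in Python: embed the samples with local tangent space alignment (here via scikit-learn's LTSA implementation), then fit an $l_{2,1}$-norm regularized regression from gene expression profiles to the embedding and rank genes by the row norms of the weight matrix. This is a two-step simplification for illustration, not the authors' joint optimization algorithm; the function names and parameters (e.g., `select_genes`, `gamma`, the neighbor and dimension settings) are assumptions, and the $l_{2,1}$ solver follows the standard iteratively reweighted least-squares scheme of Nie et al. (2010).

```python
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding


def l21_feature_weights(X, Y, gamma=1.0, n_iter=30, eps=1e-8):
    """Solve min_W ||X W - Y||_F^2 + gamma * ||W||_{2,1}
    by iteratively reweighted least squares (cf. Nie et al., 2010)."""
    n_samples, n_features = X.shape
    d = np.ones(n_features)                      # reweighting term per feature (gene)
    for _ in range(n_iter):
        # Closed-form update: W = (X^T X + gamma * D)^{-1} X^T Y, D = diag(d)
        A = X.T @ X + gamma * np.diag(d)
        W = np.linalg.solve(A, X.T @ Y)
        row_norms = np.sqrt((W ** 2).sum(axis=1)) + eps
        d = 1.0 / (2.0 * row_norms)              # update D from current row norms
    return W


def select_genes(X, n_genes=50, n_components=5, n_neighbors=10, gamma=1.0):
    """Rank genes by the l2,1 row norms of the matrix mapping expression
    profiles onto an LTSA embedding of the samples (illustrative sketch)."""
    # Low-dimensional embedding that preserves local tangent space structure
    ltsa = LocallyLinearEmbedding(n_neighbors=n_neighbors,
                                  n_components=n_components, method='ltsa')
    Y = ltsa.fit_transform(X)
    W = l21_feature_weights(X, Y, gamma=gamma)
    scores = np.linalg.norm(W, axis=1)           # one importance score per gene
    return np.argsort(scores)[::-1][:n_genes]    # indices of top-ranked genes


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((60, 200))           # 60 samples, 200 synthetic "genes"
    print(select_genes(X, n_genes=10))
```

The row-wise $l_{2,1}$ penalty drives entire rows of the weight matrix toward zero, which is why the row norms can be read directly as gene importance scores; the proposed method couples this regression with the embedding step inside one optimization rather than running them in sequence as above.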

Keywords

Acknowledgement

Supported by: MEXT KAKENHI
