Truncated Kernel Projection Machine for Link Prediction

  • Huang, Liang (School of Mathematics and Computer Science, Wuhan Polytechnic University)
  • Li, Ruixuan (School of Computer Science and Technology, Huazhong University of Science and Technology)
  • Chen, Hong (College of Science, Huazhong Agricultural University)
  • Received: 2015.08.25
  • Reviewed: 2016.05.21
  • Published: 2016.06.30

Abstract

As large volumes of complex network data become available on the Web, link prediction has become a popular research field in data mining. This paper focuses on a link-prediction task that can be formulated as a binary classification problem in complex networks. To solve this problem, a sparse classification algorithm based on empirical-feature selection, called the "Truncated Kernel Projection Machine," is proposed. The algorithm offers a novel realization of sparse empirical-feature-based learning that differs from the regularized kernel projection machines. It is more appealing than previous learning machines because it can be computed efficiently and is easy to implement stably for the link-prediction task. The algorithm is applied to link-prediction tasks in different complex networks and compared against several other classification algorithms. The experimental results show that the proposed algorithm outperformed the compared algorithms on several key indices, with fewer test errors and greater stability.
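The binary-classification formulation mentioned in the abstract can be illustrated with a minimal sketch. It does not reproduce the Truncated Kernel Projection Machine itself. It only shows how unlinked node pairs might be turned into labeled examples. The function name `pair_examples`, the toy graph, and the single common-neighbour feature are assumptions made for illustration and are not taken from the paper.

```python
from itertools import combinations


def pair_examples(nodes, observed_edges, future_edges):
    """Turn link prediction into binary classification over node pairs.

    Each pair (u, v) that is unlinked in the observed graph becomes one
    example: the feature is the number of common neighbours in the observed
    graph (an illustrative choice), and the label is +1 if the pair is linked
    in future_edges, otherwise -1.
    """
    observed = {frozenset(e) for e in observed_edges}
    future = {frozenset(e) for e in future_edges}

    # Adjacency sets built from the observed (training) graph only.
    neighbours = {n: set() for n in nodes}
    for u, v in observed_edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    examples = []
    for u, v in combinations(sorted(nodes), 2):
        pair = frozenset((u, v))
        if pair in observed:
            continue  # already linked, nothing to predict
        common = len(neighbours[u] & neighbours[v])
        label = 1 if pair in future else -1
        examples.append(((u, v), common, label))
    return examples


# Hypothetical toy graph: a path a-b-c-d, with the link a-c appearing later.
nodes = ["a", "b", "c", "d"]
observed_edges = [("a", "b"), ("b", "c"), ("c", "d")]
future_edges = [("a", "c")]
examples = pair_examples(nodes, observed_edges, future_edges)
```

Any binary classifier, including a kernel-based one, could then be trained on the resulting feature and label pairs.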

Keywords
