http://dx.doi.org/10.6109/jkiice.2014.18.10.2562

Prediction of Protein Subcellular Localization using Label Power-set Classification and Multi-class Probability Estimates  

Chi, Sang-Mun (Department of Computer Science and Engineering, Kyungsung University)
Abstract
Knowledge of a protein's subcellular localization is an important clue for inferring the function of uncharacterized proteins. Recently, considerable research has addressed the prediction of subcellular localization for proteins that exist at multiple subcellular locations simultaneously. This paper improves label power-set classification for the accurate prediction of multiple subcellular localizations. The multi-labels predicted by the label power-set classifier are combined with their prediction probabilities to give the final result. To obtain accurate multi-class probability estimates, the paper employs the pair-wise comparison and error-correcting output codes frameworks. Prediction experiments on protein subcellular localization show a significant performance improvement.
Keywords
Protein subcellular localization; label power-set classification; multi-class probability estimates; pair-wise comparison; error-correcting output codes
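The label power-set idea named in the abstract can be illustrated with a minimal sketch. It treats each distinct set of locations as one class of a multi-class problem, then shows one plausible way to combine class probabilities with the predicted label sets: summing the probability of every label set that contains a given location. That combination rule, the 0.5 threshold, the function names, and the toy data are all assumptions for illustration; the abstract does not specify the paper's actual rule, and the probabilities here are hard-coded rather than estimated by pair-wise comparison or error-correcting output codes.

```python
from collections import defaultdict


def fit_label_powerset(label_sets):
    """Assign an integer class id to each distinct label set (label power-set)."""
    class_of = {}
    for labels in label_sets:
        key = frozenset(labels)
        if key not in class_of:
            class_of[key] = len(class_of)
    return class_of


def encode(label_sets, class_of):
    """Turn multi-label targets into single multi-class targets."""
    return [class_of[frozenset(labels)] for labels in label_sets]


def label_marginals(class_probs, class_of):
    """Hypothetical combination rule: for each label, sum the probabilities
    of all label-set classes that contain it."""
    set_of = {cid: key for key, cid in class_of.items()}
    scores = defaultdict(float)
    for cid, p in enumerate(class_probs):
        for label in set_of[cid]:
            scores[label] += p
    return dict(scores)


# Toy training targets: each protein has one or more locations.
train_labels = [{"nucleus"}, {"cytoplasm"}, {"nucleus", "cytoplasm"}, {"nucleus"}]
class_of = fit_label_powerset(train_labels)
y = encode(train_labels, class_of)

# Assumed multi-class probability estimates for one test protein,
# indexed by class id ({nucleus}, {cytoplasm}, {nucleus, cytoplasm}).
probs = [0.4, 0.25, 0.35]

# Plain label power-set prediction: the single most probable label set.
best_class = max(range(len(probs)), key=probs.__getitem__)
plain_prediction = {cid: key for key, cid in class_of.items()}[best_class]

# Probability-aware prediction: keep labels whose marginal score passes 0.5.
marginals = label_marginals(probs, class_of)
combined_prediction = sorted(l for l, s in marginals.items() if s >= 0.5)
```

In this toy case the plain label power-set decision picks only the nucleus, while the marginal scores support both locations, which shows why the quality of the multi-class probability estimates matters for the final multi-label output.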