Comparison of External Information Performance Predicting Subcellular Localization of Proteins  

Chi, Sang-Mun (School of Computer Science, Kyungsung University)
Abstract
Since protein subcellular location and biological function are highly correlated, predicting a protein's subcellular localization can provide information about its function. To improve prediction performance, many studies actively exploit external information beyond the amino acid sequence itself. This paper compares the predictive capability of amino acid sequence similarity, protein profiles, gene ontology, motifs, and textual information. In experiments on the PLOC dataset, whose proteins have less than 80% sequence similarity, sequence similarity information and gene ontology are effective, achieving a classification accuracy of 94.8%. In experiments on the BaCelLo IDS dataset, which has low sequence similarity (less than 30%), gene ontology gives the best prediction accuracies: 93.2% for animals and 86.6% for fungi.
Keywords
Protein Subcellular Localization Prediction; Amino Acid Sequence Similarity; Protein Profile; Gene Ontology