
Estimating Amino Acid Composition of Protein Sequences Using Position-Dependent Similarity Spectrum  

Chi, Sang-Mun (Department of Computer Science, Kyungsung University)
Abstract
The amino acid composition of a protein provides basic information for solving many problems in bioinformatics. We propose a new method that determines the amino acid composition using biologically relevant similarity between amino acids, with the BLOSUM matrix used to define the similarity measure. Furthermore, to extract more information from a protein sequence than conventional composition methods do, we borrow concepts from the spectral analysis of signals such as radar and speech: time-dependent analysis, time resolution, and frequency resolution. The proposed method was applied to the prediction of protein subcellular localization and showed significantly improved performance over previous methods for amino acid composition estimation.
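The idea of a similarity-weighted, position-dependent composition can be sketched as follows. This is only an illustrative sketch, not the method of the paper: the grouping-based `toy_similarity` is a hypothetical stand-in for a BLOSUM-derived similarity measure, and the function names, the window length, and the step size are all assumptions made for the example. Each residue in a window contributes to every amino acid in proportion to their similarity, and sliding the window along the sequence gives one composition vector per position, in the spirit of short-time spectral analysis.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical physicochemical groups; a stand-in for a BLOSUM-derived
# similarity, used here only to keep the example self-contained.
GROUPS = ["AVLIMC", "FWY", "ST", "NQ", "DE", "KRH", "G", "P"]
GROUP_OF = {aa: i for i, group in enumerate(GROUPS) for aa in group}


def toy_similarity(a, b):
    """Return 1.0 for identical residues, 0.5 within a group, else 0.0."""
    if a == b:
        return 1.0
    return 0.5 if GROUP_OF[a] == GROUP_OF[b] else 0.0


def soft_composition(segment, sim=toy_similarity):
    """Similarity-weighted composition of a segment, normalized to sum to 1.

    Each residue contributes to every amino acid in proportion to sim(),
    instead of counting only toward its own identity.
    """
    scores = [sum(sim(a, r) for r in segment) for a in AMINO_ACIDS]
    total = sum(scores)
    return [s / total for s in scores] if total else scores


def position_dependent_spectrum(seq, window=20, step=10):
    """Compute one soft composition per sliding window along the sequence.

    A shorter window gives finer positional resolution but a noisier
    composition estimate, analogous to the time/frequency trade-off.
    """
    if len(seq) <= window:
        return [soft_composition(seq)]
    return [
        soft_composition(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]
```

For a real experiment, `toy_similarity` would be replaced by a measure derived from a BLOSUM matrix, and the resulting frames would be concatenated or pooled into a feature vector for a classifier.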
Keywords
Amino Acid Composition; Similarity Measure; Spectral Analysis; Protein Subcellular Localization Prediction
References
1 V. Goder and M. Spiess, "Molecular mechanism of signal sequence orientation in the endoplasmic reticulum," The EMBO Journal, 22, pp.3645-3653, 2003.
2 E. Granseth, G. von Heijne, and A. Elofsson, "A study of the membrane-water interface region of membrane proteins," J. Mol. Biol., 346, pp.377-385, 2005.
3 W.-W. Yang, B.-L. Lu, and Y. Yang, "A comparative study on feature extraction from protein sequences for subcellular localization prediction," IEEE Symposium on CIBCB, pp.201-208, Toronto, Canada, 2006.
4 S. Matsuda, J.-P. Vert, H. Saigo, N. Ueda, H. Toh, and T. Akutsu, "A novel representation of protein sequences for prediction of subcellular location using support vector machines," Protein Sci., 14(11), pp.2804-2813, 2005.
5 M. A. Andrade, S. I. O'Donoghue, and B. Rost, "Adaptation of protein surfaces to subcellular location," J. Mol. Biol., 276, pp.517-525, 1998.
6 K.-J. Park and M. Kanehisa, "Prediction of protein subcellular location by support vector machines using compositions of amino acids and amino acid pairs," Bioinformatics, 19, pp.1656-1663, 2003.
7 H. Shatkay, A. Hoglund, S. Brady, T. Blum, P. Donnes, and O. Kohlbacher, "SherLoc: high-accuracy prediction of protein subcellular localization by integrating text and protein sequence data," Bioinformatics, 23, pp.1410-1417, 2007.
8 V. Vapnik, Statistical Learning Theory, John Wiley & Sons, 1998.
9 S. Henikoff and J. G. Henikoff, "Amino acid substitution matrices from protein blocks," Proc. Natl. Acad. Sci., 89, pp.10915-10919, 1992.
10 A. V. Oppenheim and R. W. Schafer, Discrete-Time Signal Processing, Prentice-Hall, New Jersey, 1989.
11 K. Gupta, D. Thomas, S. Vidya, K. Venkatesh, and S. Ramakumar, "Detailed protein sequence alignment based on Spectral Similarity Score (SSS)," BMC Bioinformatics, 6(105), 2005.
12 A. Hoglund, P. Donnes, T. Blum, H.-W. Adolph, and O. Kohlbacher, "MultiLoc: prediction of protein localization using N-terminal targeting sequences, sequence motifs and amino acid compositions," Bioinformatics, 22, pp.1158-1165, 2006.
13 C.-C. Chang and C.-J. Lin, LIBSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
14 M. Paetzel, A. Karla, N. C. Strynadka, and R. E. Dalbey, "Signal peptidases," Chem. Rev., 102, pp.4549-4580, 2002.