
Protein Structure Alignment Based on Maximum of Residue Pair Distance and Similarity Graph  

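Kim, Woo-Cheol (Department of Computer Science, Yonsei University)
Park, Sang-Hyun (Department of Computer Science, Yonsei University)
Won, Jung-Im (Hanyang University, College of Information and Communications, Computer)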
Abstract
Since the Human Genome Project completed the sequencing of human DNA, interest in protein function has been growing. Because protein structures are conserved through divergent evolution, protein function is determined by structure rather than by amino acid sequence. Therefore, if two protein structures are similar, we can expect them to share biological functions. Much research on protein structure alignment has been carried out. However, most existing methods use RMSD (Root Mean Square Deviation) as the similarity measure, which makes it hard to judge intuitively how similar two protein structures are. In addition, they return only the single result with the highest alignment score, which cannot satisfy users with different purposes. To overcome these limitations, we propose a novel protein structure alignment algorithm based on MRPD (Maximum of Residue Pair Distance) and SG (Similarity Graph). MRPD is a more intuitive similarity measure that enables fast filtering of unpromising protein pairs. SG is a compact representation of multiple alignment results that lets users choose the alignment most plausible for their needs, without increasing the time required to align the protein structures.
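The contrast between RMSD and a maximum-distance measure can be illustrated with a small sketch. The abstract does not give the formal definition of MRPD, so this example assumes it is the largest Euclidean distance among aligned residue pairs after the two structures have been superposed. The function names, the toy coordinates, and the threshold filter are hypothetical illustrations, not the authors' implementation.

```python
import math

def pair_distances(coords_a, coords_b):
    """Distance between each aligned residue pair (structures assumed already superposed)."""
    return [math.dist(p, q) for p, q in zip(coords_a, coords_b)]

def rmsd(coords_a, coords_b):
    """Root mean square deviation over all aligned pairs."""
    d = pair_distances(coords_a, coords_b)
    return math.sqrt(sum(x * x for x in d) / len(d))

def max_pair_distance(coords_a, coords_b):
    """Assumed reading of MRPD: the largest distance among aligned pairs."""
    return max(pair_distances(coords_a, coords_b))

def passes_filter(coords_a, coords_b, threshold):
    """Hypothetical filter: keep a pair only if no aligned residue pair exceeds the threshold."""
    return max_pair_distance(coords_a, coords_b) <= threshold

# Toy data: three residue pairs coincide, one is 4 units apart.
a = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
b = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 4)]

print(rmsd(a, b))               # 2.0: the outlier is averaged down
print(max_pair_distance(a, b))  # 4.0: the worst pair is reported directly
print(passes_filter(a, b, 3.0)) # False
```

Under this assumption, a maximum-distance bound reads as "no aligned residue pair is farther apart than the threshold", whereas an RMSD of the same magnitude can hide a single badly matched pair behind many well matched ones.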
Keywords
Protein structure; Structure alignment; Structure similarity measure