
A Protein Structure Comparison System based on PSAML  

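Kim Jin-Hong (School of Computer Engineering and Information Technology, University of Ulsan)
Ahn Geon-Tae (School of Computer Engineering and Information Technology, University of Ulsan)
Byun Sang-Hee (School of Computer Engineering and Information Technology, University of Ulsan)
Lee Su-Hyun (Department of Computer Engineering, Changwon National University)
Lee Myung-Joon (School of Computer Engineering and Information Technology, University of Ulsan)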
Abstract
Because understanding the similarities and differences among protein structures is essential for studying the relationship between structure and function, many protein structure comparison systems have been developed. Unfortunately, each of these systems introduces its own protein data, derived from the PDB (Protein Data Bank), as required by its algorithm for comparing protein structures. In addition, as the PDB grows rapidly, these systems require ever more computation to search their databases for common substructures. In this paper, we introduce WS4E (A Web-Based Searching Substructures of Secondary Structure Elements), a protein structure comparison system based on a PSAML database that stores PSAML documents in the eXist open-source XML DBMS. PSAML (Protein Structure Abstraction Markup Language) is an XML representation of protein data that describes a protein structure in terms of its secondary structures and the relationships among them. Using the PSAML database, WS4E provides web services that search for common substructures among proteins represented in PSAML. To reduce the number of candidate protein structures to be compared in the PSAML database, we use topology strings, which contain the spatial information of the secondary structures in a protein.
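The candidate-reduction idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how topology strings are encoded or matched, so everything here is an assumption made for illustration: each secondary structure element is written as one character (`H` for helix, `E` for strand), a candidate is kept when the query string is a subsequence of its topology string, and the names `is_subsequence`, `filter_candidates`, and the toy identifiers are hypothetical.

```python
from typing import Dict, List


def is_subsequence(query: str, target: str) -> bool:
    """Return True if the characters of query occur in target in the same order."""
    remaining = iter(target)
    # 'ch in remaining' advances the iterator until ch is found (or exhausted),
    # so later characters can only match later positions in target.
    return all(ch in remaining for ch in query)


def filter_candidates(query_topology: str, database: Dict[str, str]) -> List[str]:
    """Keep only proteins whose (assumed) topology string can contain the query."""
    return sorted(
        protein_id
        for protein_id, topology in database.items()
        if is_subsequence(query_topology, topology)
    )


# Hypothetical entries: identifiers and topology strings are made up.
toy_db = {
    "protA": "HEEHEH",
    "protB": "EEEE",
    "protC": "HHEEH",
}

print(filter_candidates("HEH", toy_db))  # prints ['protA', 'protC']
```

Under these assumptions, only the proteins that pass this cheap string test would go on to the more expensive substructure comparison. The actual matching rule used by WS4E may differ.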
Keywords
PSAML; topology string; Protein Structure Comparison; XML; WS4E