Integration of Heterogeneous Protein Databases Based on RDF(S) Models  

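Lee, Kang-Pyo (School of Computer Science and Engineering, Seoul National University)
Yoo, Sang-Won (School of Computer Science and Engineering, Seoul National University)
Kim, Hyoung-Joo (School of Computer Science and Engineering, Seoul National University)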
Abstract
In the biological domain, there are many protein analysis databases, each of which interprets the same target protein in its own way. If these scattered, heterogeneous data are integrated efficiently, we can obtain useful information that cannot be found in any single source. As is characteristic of biological data, each data source has its own syntax and semantics. Describing these data with RDF(S) models, one of the Semantic Web standards, makes it possible to achieve not only syntactic but also semantic integration. In this paper, we propose a new concept of an integration layer based on an RDF unified schema. As a conceptual model, we construct a unified schema centered on protein information; as a representational model, we propose a technique by which wrappers aggregate the necessary information from the relevant sources and dynamically generate RDF instances. Two example queries show that our integration layer successfully processes integrated requests from users and displays the appropriate results.
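The wrapper idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the namespace, the property names (`locatedIn`, `interactsWith`), the two toy sources, and all helper functions are hypothetical, and a plain Python set of triples stands in for a real RDF store and query language. The sketch only shows how two sources with different syntax might be mapped onto one unified vocabulary and then queried together.

```python
from typing import Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical namespace for the unified schema (an assumption, not from the paper).
NS = "http://example.org/protein#"
# Standard rdf:type property URI.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Two invented toy sources that describe proteins with different syntax.
LOCALIZATION_SOURCE = [{"acc": "P00001", "loc": "nucleus"}]
INTERACTION_SOURCE = [("P00001", "P00002")]


def wrap_localization(rows: Iterable[dict]) -> List[Triple]:
    """Wrapper 1: map dict records onto unified-schema triples."""
    triples = []
    for row in rows:
        subject = NS + row["acc"]
        triples.append((subject, RDF_TYPE, NS + "Protein"))
        triples.append((subject, NS + "locatedIn", row["loc"]))
    return triples


def wrap_interaction(pairs: Iterable[Tuple[str, str]]) -> List[Triple]:
    """Wrapper 2: map (protein, partner) pairs onto the same schema."""
    triples = []
    for a, b in pairs:
        triples.append((NS + a, RDF_TYPE, NS + "Protein"))
        triples.append((NS + b, RDF_TYPE, NS + "Protein"))
        triples.append((NS + a, NS + "interactsWith", NS + b))
    return triples


def build_graph() -> Set[Triple]:
    """Generate RDF-style instances on demand by running every wrapper."""
    return set(wrap_localization(LOCALIZATION_SOURCE)) | set(
        wrap_interaction(INTERACTION_SOURCE)
    )


def match(
    graph: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t
        for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def partners_of_proteins_in(graph: Set[Triple], location: str) -> List[Tuple[str, str]]:
    """Integrated query: interaction partners of proteins found at a location."""
    results = []
    for subject, _, _ in match(graph, p=NS + "locatedIn", o=location):
        for _, _, partner in match(graph, s=subject, p=NS + "interactsWith"):
            results.append((subject, partner))
    return sorted(results)
```

Calling `partners_of_proteins_in(build_graph(), "nucleus")` combines facts that neither toy source holds alone, which is the kind of integrated request the abstract refers to. A real system would presumably replace the set-based matcher with an RDF store and its query language.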
Keywords
RDF(S); Data Integration; Protein; Bioinformatics;