• Title/Summary/Keyword: XML Databases

Requirement Analysis for Bio-Information Integration Systems

  • Lee, Sean;Lee, Phil-Hyoun;Dokyun Na;Lee, Doheon;Lee, Kwanghyung;Bae, Myung-Nam
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2003.09a / pp.11-15 / 2003
  • The amount of biological data has been increasing exponentially. To cope with this bio-information explosion, it is necessary to construct a biological information integration system. Such a system could provide useful services for bio-application developers by answering general complex queries that require access to heterogeneous bio data sources, and could easily accommodate new databases. In this paper, we analyze the architectures and mechanisms of existing integration systems along with their advantages and disadvantages. Based on this analysis and on user requirement studies, we propose an integration system framework that embraces the advantages of the existing systems. More specifically, we propose an architecture composed of a mediator and wrappers, which can offer a service interface layer for various other applications as well as for independent biologists, thus playing the role of a database management system for biology applications. In other words, the system can help hide the heterogeneous information structures and formats from the application layer. In the system, the wrappers send database-specific queries and report the results to the mediator using XML. The proposed system could facilitate in silico knowledge discovery by allowing numerous discrete biological databases to be combined.
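The mediator and wrapper arrangement described in this abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class names `DictWrapper` and `Mediator`, the XML element names, and the toy in-memory "databases" do not come from the paper.

```python
import xml.etree.ElementTree as ET


class DictWrapper:
    """Hypothetical wrapper: answers a gene lookup against one source
    and reports the result to the mediator as an XML string."""

    def __init__(self, source_name, records):
        self.source_name = source_name
        self.records = records  # stand-in for a real database: {gene: {field: value}}

    def query(self, gene):
        root = ET.Element("result", source=self.source_name)
        fields = self.records.get(gene)
        if fields is not None:
            rec = ET.SubElement(root, "record", gene=gene)
            for name, value in fields.items():
                ET.SubElement(rec, "field", name=name).text = str(value)
        return ET.tostring(root, encoding="unicode")


class Mediator:
    """Sends one query to every wrapper and merges the XML answers."""

    def __init__(self, wrappers):
        self.wrappers = wrappers

    def query(self, gene):
        merged = {}
        for wrapper in self.wrappers:
            root = ET.fromstring(wrapper.query(gene))
            for field in root.iter("field"):
                # Prefix each field with its source so the sources do not collide.
                merged[f"{root.get('source')}.{field.get('name')}"] = field.text
        return merged


# Toy data, invented for this example.
sequence_db = DictWrapper("seqdb", {"geneA": {"length": 1200}})
pathway_db = DictWrapper("pathdb", {"geneA": {"pathway": "toy-pathway"}})
mediator = Mediator([sequence_db, pathway_db])
answer = mediator.query("geneA")
print(answer)
```

The application only talks to `Mediator.query`, so adding a new source in this sketch means writing one more wrapper, with no change to the calling code.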

Building an Integrated Protein Data Management System Using the XPath Query Process

  • Cha Hyo Soung;Jung Kwang Su;Jung Young Jin;Ryu Keun Ho
    • Proceedings of the KSRS Conference / 2004.10a / pp.99-102 / 2004
  • With the recent development of bioinformatics techniques, there has been a great deal of research on large amounts of biological data, and a variety of files and databases are being used to manage these data efficiently. However, because of the lack of standardization, it is difficult to manage the data and to convert between heterogeneous formats. We are interested in integrating, saving, and managing gene and protein sequence data generated through sequencing. Accordingly, the goal of this paper is to implement a system that manages sequence data and converts one sequence file format into another. To satisfy these requirements, we adopt BSML (Bioinformatics Sequence Markup Language) as the standard for managing bioinformatics data, and we integrate and store the heterogeneous flat file formats using a DTD based on the BSML schema. We also developed the system to exploit the characteristics of an object-oriented database and to process XPath queries, an efficient kind of structural query, so that XML documents can be saved and managed easily.
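As a rough illustration of selecting sequence records with an XPath query and converting them to another format, the following Python sketch uses the limited XPath support in the standard library. The element and attribute names (`Sequences`, `Sequence`, `SeqData`, `molecule`) are hypothetical and are not taken from the actual BSML DTD; the data are toy values.

```python
import xml.etree.ElementTree as ET

# Toy document with hypothetical element names (not the real BSML schema).
DOC = """<Sequences>
  <Sequence id="s1" molecule="dna" title="toy gene"><SeqData>ATGGCC</SeqData></Sequence>
  <Sequence id="s2" molecule="aa" title="toy protein"><SeqData>MAK</SeqData></Sequence>
</Sequences>"""


def to_fasta(root, xpath):
    """Select sequence elements with an XPath expression and emit FASTA text."""
    lines = []
    for seq in root.findall(xpath):
        lines.append(f">{seq.get('id')} {seq.get('title')}")
        lines.append(seq.findtext("SeqData"))
    return "\n".join(lines)


root = ET.fromstring(DOC)
# Structural query: only the sequences whose molecule attribute is 'aa'.
protein_fasta = to_fasta(root, ".//Sequence[@molecule='aa']")
print(protein_fasta)
```

The same `to_fasta` helper works with any XPath predicate that `ElementTree` supports, for example `".//Sequence"` to export every record.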

An agent-based integrated database for rice functional genomics (에이전트 기반의 벼 기능 유전자 통합 데이터베이스)

  • Lee Gi-Yeol;Sin Mun-Su;An Su-Yeong;Jeong Dong-Hun;An Jin-Heung;Jeong Mu-Yeong
    • Proceedings of the Korean Operations and Management Science Society Conference / 2006.05a / pp.1702-1706 / 2006
  • In the field of rice research, insertional mutants have become a valuable resource for studies of gene function. However, a well-designed database does not yet exist in the area of rice functional genomics. The relevant data are widely distributed and independently managed by individual research groups. Heterogeneous data formats in the distributed database systems cause many problems related to redundancy and compatibility. In this research, integration of the distributed databases using agent technology is pursued. In particular, a data integration agent, an ontology agent, a comparison agent, and resource agents are designed, whereby the integrated database is maintained. Moreover, a framework is proposed for a web-based information system that provides information to biologists and permits them to add new data to the database. To establish an interoperable data format, an XML-based data model that adopts the ontology concept is also developed.

GWB: An integrated software system for Managing and Analyzing Genomic Sequences (GWB: 유전자 서열 데이터의 관리와 분석을 위한 통합 소프트웨어 시스템)

  • Kim In-Cheol;Jin Hoon
    • Journal of Internet Computing and Services / v.5 no.5 / pp.1-15 / 2004
  • In this paper, we explain the design and implementation of GWB (Gene WorkBench), a web-based, integrated system for efficiently managing and analyzing genomic sequences. Most existing software systems that handle genomic sequences rarely provide both management and analysis facilities. The analysis programs also tend to be unit programs that cover only one or a few of the required functions. Moreover, these programs are widely distributed over the Internet and require different execution environments. Because a great deal of manual work and conversion is required to use these programs together, many life science researchers suffer great inconvenience. To overcome the problems of existing systems and provide a more convenient system that supports genomic research effectively, this paper integrates both management and analysis facilities into a single system called GWB. The most important issues in the design of GWB are how to integrate many different analysis programs into a single software system, and how to provide the data or databases in the different formats required to run these programs. To address these issues, GWB integrates different analysis programs by using common input/output interfaces called wrappers, suggests a common format for genomic sequence data, organizes local databases consisting of a relational database and an indexed sequential file, and provides facilities for converting data among several well-known formats and for exporting local databases to XML files.
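The format conversion and XML export facilities mentioned in this abstract can be sketched as follows. This is a minimal Python illustration under stated assumptions, not GWB's implementation: the function names and the XML layout (`sequences`/`sequence`) are invented, and FASTA is used only as one example of a well-known format.

```python
import xml.etree.ElementTree as ET


def parse_fasta(text):
    """Parse FASTA text into a list of (identifier, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def export_xml(records):
    """Export the common (identifier, sequence) records as an XML string."""
    root = ET.Element("sequences")
    for identifier, residues in records:
        ET.SubElement(root, "sequence", id=identifier).text = residues
    return ET.tostring(root, encoding="unicode")


fasta = ">seq1\nACGT\nTTGA\n>seq2\nGGCC\n"
records = parse_fasta(fasta)
xml_text = export_xml(records)
print(xml_text)
```

Routing every format through one intermediate record shape means each additional format in this sketch needs only one parser and one exporter, instead of a converter for every pair of formats.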

Design and Implementation of GML Transformation System based on Standard Transportation Framework Model of TTA (TTA 표준 교통 프레임워크 데이터 모델 기반 GML 변환 시스템 설계 및 구현)

  • Lee, Ki-Won;Kim, Hak-Hoon
    • Journal of the Korean Association of Geographic Information Studies / v.9 no.3 / pp.25-35 / 2006
  • Standardization and standards-related studies are regarded as main issues in GIS applications. Although several GIS standards and specifications have been released, there are few actual application cases that adopt them. In this study, we designed and implemented a geo-spatial information processing system with editing, storing, and disseminating functions, in which the standard GIS transportation data model by TTA is linked with OGC GML, an XML-based standard for encoding geographic features. The system imports ESRI shapefiles and enables us to edit transportation entities based on the TTA standard and convert them to GML. In the web-based system, GML-based databases are transformed into SVG files for web publishing. The TTA GIS transportation data model is used and tested in this study; however, standard data models from other application fields can also be applied easily because the system provides basic data importing and editing functions. As a practical tool, this system can be used to test the applicability of GIS standard data models and to support the practical operation of standard specifications.

A Design and Implementation of Heterogeneous Metadata Searching System using Ontology (Ontology를 이용한 이종 메타데이터 검색 시스템의 설계 및 구현)

  • Choe, Hyun-Jong;Kim, Tae-Young
    • Journal of The Korean Association of Information Education / v.8 no.3 / pp.353-360 / 2004
  • The World Wide Web is no longer a meaningless sea of information but is becoming the Semantic Web, which provides many users with meaningful information. The starting point is XML and metadata; RDF is a stopover that provides a technique for relating arbitrary web resources. Now the semantics and logic of web resources can be captured in an ontology. Many educational multimedia web resources in Korea have produced their metadata with KERIS's KEM (Korea Educational Metadata). Therefore, Korea has to begin studying the semantics and logic of web resources. However, many researchers in Korea are more eager to study Dublin Core's DC and SCORM's LOM metadata specifications than KEM. Thus, a method for sharing and integrating these three metadata specifications should be studied first. We design an ontology to integrate these three metadata specifications and implement a prototype system using this ontology. The three specifications have some elements with the same labels and meanings, and other elements with different labels but the same meanings. To match the different labels that have the same meanings, we adopted a one-to-one mapping technique in designing our ontology. The designed ontology was imported as an "integrated schema" in our prototype search system to integrate the three different kinds of metadata in databases. Moreover, we found that a more specific design of class properties in the ontology is needed in order to provide users with more informative search results, such as synonyms, antonyms, hierarchies, and associations.
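The one-to-one label mapping described in this abstract can be sketched in a few lines of Python. The label tables below are purely illustrative assumptions: they are not copied from the DC, LOM, or KEM specifications, and the paper's ontology is replaced here by plain dictionaries acting as the "integrated schema".

```python
# Illustrative label tables: each source label maps to exactly one integrated label.
# The source labels are invented for this example.
MAPPINGS = {
    "DC": {"title": "title", "creator": "author"},
    "LOM": {"general.title": "title", "lifecycle.contribute": "author"},
    "KEM": {"kem_title": "title", "kem_author": "author"},
}


def to_integrated(spec, record):
    """Rename a record's native labels to the integrated labels."""
    mapping = MAPPINGS[spec]
    return {mapping[label]: value for label, value in record.items() if label in mapping}


def search(records, field, keyword):
    """Search (spec, record) pairs by an integrated field name."""
    hits = []
    for spec, record in records:
        unified = to_integrated(spec, record)
        if keyword.lower() in unified.get(field, "").lower():
            hits.append(unified)
    return hits


# Toy records in three different native vocabularies.
records = [
    ("DC", {"title": "Intro to XML", "creator": "Kim"}),
    ("LOM", {"general.title": "XML Basics"}),
    ("KEM", {"kem_title": "Algebra"}),
]
hits = search(records, "title", "xml")
print(hits)
```

A single query against the integrated field `title` reaches records from all three vocabularies, which is the effect the abstract attributes to the integrated schema.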

Analysis of a Compound-Target Network of Oryeong-san (오령산 구성성분-타겟 네트워크 분석)

  • Kim, Sang-Kyun
    • Journal of the Korea Knowledge Information Technology Society / v.13 no.5 / pp.607-614 / 2018
  • Oryeong-san is a prescription widely used for diseases involving stagnant water, because it circulates water in the body and releases it into the urine. To investigate the mechanisms of oryeong-san, in this paper we construct and analyze the compound-target network of the medicinal materials constituting oryeong-san, based on a systems pharmacology approach. First, the targets related to the 475 chemical compounds of oryeong-san were searched in the STITCH database, and the search results for the interactions between compounds and targets were downloaded as XML files. The compound-target network of oryeong-san was visualized and explored using Gephi 0.8.2, an open-source software package for graphs and networks. In the network, nodes are compounds and targets, and edges are interactions between the nodes. Each edge is weighted according to the reliability of the interaction. To analyze the compound-target network, it was clustered using the MCL algorithm, which can cluster weighted networks. A total of 130 clusters were created, and the largest cluster contained 32 nodes. The clustered network revealed that the active compounds of the medicinal materials were associated with targets that regulate blood pressure in the kidney. In the future, we will clarify the mechanisms of oryeong-san by linking information from disease databases with the network of this research.
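To make the clustering step more concrete, here is a minimal pure-Python sketch of the basic MCL loop (expansion followed by inflation) on a weighted graph. It is not the tool used in the paper: the node names, edge weights, inflation value of 2.0, fixed iteration count, and the 1e-6 threshold are all assumptions chosen for a toy example.

```python
def normalize_columns(m):
    """Scale every column of a square matrix so that it sums to 1 (in place)."""
    n = len(m)
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        for i in range(n):
            m[i][j] /= total
    return m


def mcl(nodes, weighted_edges, inflation=2.0, iterations=50):
    """Cluster an undirected weighted graph with a basic MCL iteration."""
    n = len(nodes)
    index = {name: i for i, name in enumerate(nodes)}
    m = [[0.0] * n for _ in range(n)]
    for a, b, w in weighted_edges:
        m[index[a]][index[b]] = m[index[b]][index[a]] = w
    for i in range(n):
        m[i][i] = 1.0  # self-loops, as commonly added before running MCL
    normalize_columns(m)
    for _ in range(iterations):
        # Expansion: square the matrix (flow spreads along longer walks).
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: raise entries to a power and renormalize (strong flow wins).
        m = normalize_columns([[x ** inflation for x in row] for row in m])
    clusters = set()
    for row in m:
        # Rows that keep non-negligible flow belong to attractors;
        # the columns they attract form one cluster.
        members = frozenset(nodes[j] for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return clusters


# Toy compound-target network; weights stand in for interaction reliability.
nodes = ["c1", "t1", "t2", "c2", "t3"]
edges = [("c1", "t1", 0.9), ("c1", "t2", 0.8), ("c2", "t3", 0.7)]
clusters = mcl(nodes, edges)
print(clusters)
```

On this toy input the two connected groups come out as separate clusters. For a network of the size reported in the abstract, a dedicated MCL implementation would be the practical choice, since this sketch uses dense matrices and a fixed number of iterations.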

A Study on Web Services for Sequence Similarity search in the Workflow Environment (워크플로우 환경에서의 대규모 서열 유사성 검색 웹 서비스에 관한 연구)

  • Jun, Jin-Young
    • Journal of the Korea Society of Computer and Information / v.13 no.6 / pp.41-49 / 2008
  • In recent years, the study of life phenomena using workflow management tools has been actively pursued in bioinformatics. A workflow management tool is a foundation that enables researchers to collaborate through the reuse and sharing of services, and a variety of such tools, including the MyGrid project's Taverna, Kepler, and BioWMS, have been developed and used as open source. Based on web service technology, these tools can model and automate services located in geographically distant places within one working space. Many tools and databases used in bioinformatics are provided as web services and are used from workflow management tools. In this situation, developing web services for sequence similarity search, a basic operation in bioinformatics, and offering them stably can be essential to the field. In this paper, the speed of similarity search over biological sequence data was improved using a Linux cluster, and by developing the search as a web service and linking it with a workflow management tool, sequence similarity searches could be completed in a short time.

The Application of Geography Markup Language(GML) to the Maritime Information

  • Oh, Se-Woong;Park, Jong-Min;Suh, Sang-Hyun
    • Proceedings of the Korean Institute of Navigation and Port Research Conference / v.1 / pp.519-524 / 2006
  • This paper describes an application of map-based information presentation to maritime information, including navigation information. The work is motivated by the need to prepare maritime information representation and distribution for future-generation Web network technology. The work consists of map generation using GML and its application to maritime information. GML 3.0 became an adopted specification of the Open Geospatial Consortium (OGC) in January 2003, and is rapidly emerging as the world standard for the encoding, transport, and storage of all forms of geographic information. This paper looks at the application of GML to one of the more challenging areas of maritime information. Specific features of GML that are of interest to maritime information providers are discussed and then illustrated through a series of maritime information case studies. The first phase of the work consists of the construction of a GML application schema for use as a base map of maritime information. Maritime information is acquired from multiple sources, including standards documents, database schemas, lexicons, and collections of symbol definitions. The sources of GML ontological knowledge and the contribution of each source to the overall ontology are described in this paper. In the second phase, the prepared GML is used to create a prototype of the mixed maritime information as a base map for tagging documents within the maritime domain. An overview of this prototype is included. One application area for the information elements described here is the integrated retrieval of maritime information from diverse sources, ranging from Web sites to nautical chart databases and text documents.

An Efficient ROLAP Cube Generation Scheme (효율적인 ROLAP 큐브 생성 방법)

  • Kim, Myung;Song, Ji-Sook
    • Journal of KIISE:Databases / v.29 no.2 / pp.99-109 / 2002
  • ROLAP (Relational Online Analytical Processing) is a process and methodology for multidimensional data analysis that is essential for extracting desired data and deriving value-added information from an enterprise data warehouse. In order to speed up query processing, most ROLAP systems pre-compute summary tables. This process is called 'cube generation', and it mostly involves intensive table sorting stages. Reference (1) showed that it is much faster to generate ROLAP summary tables indirectly using a MOLAP (multidimensional OLAP) cube generation algorithm. In this paper, we present such an indirect ROLAP cube generation algorithm that is fast and scalable. High memory utilization is achieved by slicing the input fact table along one or more dimensions before generating summary tables. High speed is achieved by producing summary tables from their smallest parents. We show the efficiency of our algorithm through experiments.
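The "smallest parent" idea in this abstract can be illustrated with a short Python sketch: every summary table (group-by) is aggregated from the smallest already computed table that has one more dimension. This is only a sketch of that single idea under stated assumptions. It uses in-memory dictionaries and a SUM measure, it omits the fact-table slicing and the MOLAP-based indirect generation the paper describes, and the function names and toy data are invented.

```python
from itertools import combinations


def aggregate(table, dims, keep):
    """Roll a summary table keyed by `dims` up to the dimensions in `keep`."""
    positions = [dims.index(d) for d in keep]
    out = {}
    for key, measure in table.items():
        sub = tuple(key[p] for p in positions)
        out[sub] = out.get(sub, 0) + measure
    return out


def build_cube(fact_rows, dims):
    """fact_rows: list of (dimension-value tuple, measure) pairs.
    Returns {tuple of group-by dimensions: summary table}."""
    base = {}
    for key, measure in fact_rows:
        base[key] = base.get(key, 0) + measure
    cube = {tuple(dims): base}
    # Work from the most detailed group-bys down to the grand total.
    for size in range(len(dims) - 1, -1, -1):
        for keep in combinations(dims, size):
            parents = [p for p in cube if len(p) == size + 1 and set(keep) <= set(p)]
            # Smallest parent: the candidate table with the fewest rows.
            parent = min(parents, key=lambda p: len(cube[p]))
            cube[keep] = aggregate(cube[parent], list(parent), keep)
    return cube


# Toy fact table: (product, region, year) -> sales.
dims = ["product", "region", "year"]
rows = [
    (("p1", "east", 2001), 10),
    (("p1", "west", 2001), 5),
    (("p2", "east", 2002), 7),
]
cube = build_cube(rows, dims)
print(cube[("product",)])
```

Aggregating from the smallest parent reduces the number of rows scanned per summary table, because any parent yields the same result for a distributive measure such as SUM.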