• Title/Abstract/Keywords: Biological Sequence Database

DEVELOPMENT OF XML BASED PERSONALIZED DATABASE MANAGEMENT SYSTEM FOR BIOLOGISTS

  • Cho Kyung Hwan;Jung Kwang Su;Kim Sun Shin;Ryu Keun Ho
    • Korean Society of Remote Sensing: Conference Proceedings / Korean Society of Remote Sensing, Proceedings of ISRS 2005 / pp.770-773 / 2005
  • In most biological laboratories, sequences produced by sequencing machines are stored on disk as simple files. Storing and managing sequence data this way makes it hard to maintain consistency and integrity, for example because redundant files accumulate. A system that integrates and manages genome data with consistency and integrity is needed for accurate sequence analysis. Therefore, in this paper, we not only store the gene and protein sequence data obtained through sequencing but also manage them. We also build an integrated schema for transforming between file formats and design a database system that uses it. Because the integrated schema is designed in BSML, an XSL style language can be applied to it, which lets us convert among heterogeneous sequence formats.

Building an Integrated Protein Data Management System Using the XPath Query Process

  • Cha Hyo Soung;Jung Kwang Su;Jung Young Jin;Ryu Keun Ho
    • Korean Society of Remote Sensing: Conference Proceedings / Korean Society of Remote Sensing, Proceedings of ISRS 2004 / pp.99-102 / 2004
  • Recently, with the development of bioinformatics techniques, there has been a great deal of research on large amounts of biological data, and a variety of files and databases are being used to manage these data efficiently. However, because of the lack of standardization, it is difficult to manage the data and to transform one heterogeneous format into another. We are interested in integrating, saving, and managing gene and protein sequence data generated through sequencing. Accordingly, the goal of this research is to implement a system that manages sequence data and transforms one sequence file format into another. To satisfy these requirements, we adopt BSML (Bioinformatics Sequence Markup Language) as the standard for managing bioinformatics data, and we integrate and store the heterogeneous flat file formats using a DTD based on the BSML schema. We developed a system that applies the characteristics of an object-oriented database and processes XPath queries, an efficient kind of structural query, so that XML documents can be saved and managed easily.
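
The XPath-style querying of BSML documents mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' system: the document below is a hypothetical, heavily simplified BSML-like fragment, the helper name `find_sequences` is invented for illustration, and Python's standard `xml.etree.ElementTree` supports only a limited XPath subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified BSML-like document (assumption);
# the real BSML DTD defines many more elements and attributes.
BSML_DOC = """
<Bsml>
  <Definitions>
    <Sequences>
      <Sequence id="seq1" title="example protein" molecule="aa" length="12">
        <Seq-data>MKTAYIAKQRQI</Seq-data>
      </Sequence>
      <Sequence id="seq2" title="example gene" molecule="dna" length="8">
        <Seq-data>ATGCGTAA</Seq-data>
      </Sequence>
    </Sequences>
  </Definitions>
</Bsml>
"""


def find_sequences(xml_text, molecule):
    """Return {sequence id: residues} for sequences of one molecule type."""
    root = ET.fromstring(xml_text)
    # Attribute predicates are part of ElementTree's limited XPath subset.
    hits = root.findall(f".//Sequence[@molecule='{molecule}']")
    return {seq.get("id"): seq.findtext("Seq-data") for seq in hits}


proteins = find_sequences(BSML_DOC, "aa")
print(proteins)
```

A structural query of this kind selects records by their position and attributes in the XML tree, so the same stored document can serve both protein and nucleotide lookups without a separate table per format.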

Sequence Validation for the Identification of the White-Rot Fungi Bjerkandera in Public Sequence Databases

  • Jung, Paul Eunil;Fong, Jonathan J.;Park, Myung Soo;Oh, Seung-Yoon;Kim, Changmu;Lim, Young Woon
    • Journal of Microbiology and Biotechnology / Vol. 24, No. 10 / pp.1301-1307 / 2014
  • White-rot fungi of the genus Bjerkandera are cosmopolitan and have shown potential for industrial application and bioremediation. When distinguishing morphological characters are no longer present (e.g., cultures or dried specimen fragments), characterizing true sequences of Bjerkandera is crucial for accurate identification and application of the species. To build a framework for molecular identification of Bjerkandera, we carefully identified specimens of B. adusta and B. fumosa from Korea based on morphological characters, followed by sequencing the internal transcribed spacer region and 28S nuclear ribosomal large subunit. The phylogenetic analysis of Korean Bjerkandera specimens showed clear genetic differentiation between the two species. Using this phylogeny as a framework, we examined the identification accuracy of sequences available in GenBank. Analyses revealed that many Bjerkandera sequences in the database are either misidentified or unidentified. This study provides robust reference sequences for sequence-based identification of Bjerkandera, and further demonstrates the presence and dangers of incorrect sequences in GenBank.

PC-Based Hybrid Grid Computing for Huge Biological Data Processing

  • Cho, Wan-Sup;Kim, Tae-Kyung;Na, Jong-Hwa
    • Journal of the Korean Data and Information Science Society / Vol. 17, No. 2 / pp.569-579 / 2006
  • Recently, the amount of genome sequence data has been increasing rapidly owing to advanced computational techniques and experimental tools in biology. Sequence comparison is a very useful operation for predicting the functions of genes or proteins. However, comparing long sequences takes too much time, and there has been much research on fast sequence comparison. In this paper, we propose a hybrid grid system, based on the LanLinux system, to improve the performance of sequence comparisons. Compared with conventional approaches, the hybrid grid is easy to construct, maintain, and manage because software does not need to be installed on every node. As a real experiment, we constructed an orthologous database for 89 prokaryotes in just one week on the hybrid grid; the same task requires 33 weeks on a single computer.

Construction of a Horse Database (HorseDB; an Integrated Horse Resource and Web Service)

  • 김대수;조운종;허재원;최은상;조병욱;김희수
    • Journal of Life Science / Vol. 16, No. 3 / pp.472-476 / 2006
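  • We constructed a horse database by analyzing biological and genomic data on the horse from public databases. The horse database aims to analyze the horse's biological and genomic data with bioinformatic methods and to provide these data in integrated form. The database consists of the horse's biological data, genome analysis data, and an interface that provides bioinformatic analysis programs. For user convenience, the web menus were organized to be easy to use and to provide a variety of information about the horse. The horse database is available at http://www.primate.or.kr/horse.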

Identification and Phylogenetic Analysis of SINE-R Retroposon Family in cDNA Library of Human Fetal Brain

  • Yi, Joo-Mi;Shin, Kyung-Mi;Lee, Ji-Won;Paik, In-Ho;Jang, Kyung-Lib;Kim, Heui-Soo
    • Animal cells and systems / Vol. 5, No. 3 / pp.231-236 / 2001
  • SINE-R retroposons have been derived from the human endogenous retrovirus HERV-K family and found to be hominoid specific. Both SINE-R retroposons and the HERV-K family are potentially capable of affecting the expression of closely located genes. From a cDNA library of human fetal brain, we identified seven SINE-R retroposons and compared them with sequences derived from the GenBank database. The SINE-R retroposons from human fetal brain showed 85∼97% sequence similarity with the human-specific retroposon SINE-R.C2. They also showed 88∼96% sequence similarity with the sequence of the schizo-cDNA clone that was derived from postmortem frontal cortex tissue of a schizophrenic patient. Phylogenetic analysis using the neighbor-joining method revealed that the seven new SINE-R retroposons from the cDNA library of human fetal brain have proliferated independently during human evolution. The data indicate that such SINE-R retroposons are expressed in human fetal brain and deserve further investigation as potential leads to understanding neuropsychiatric diseases.

Implementation of an Information Management System for Nucleotide Sequences based on BSML using Active Trigger Rules

  • 박성희;정광수;류근호
    • Journal of KIISE: Databases / Vol. 32, No. 1 / pp.24-42 / 2005
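  • Biological information, including genome sequences, changes continuously and is heterogeneous and diverse. Management systems that reflect these characteristics are needed, but most existing biological databases are used only as repositories for biological data. This paper therefore presents a sequence information management system that performs file format conversion, editing, storage, and retrieval of nucleotide sequence data produced by sequencing experiments at the level of a biology laboratory or collected from various public databases. XML-based BSML is used as the common format for converting files between heterogeneous sequence formats. For sequence storage management, sequence versions are defined to store change information about the sequence composition of the same DNA fragment, and active trigger rules are used to detect and generate this change information. We show that trigger functionality allows sequence change information to be stored and managed in the database automatically, and we evaluate its performance.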
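
As a rough illustration of trigger-based sequence versioning, the sketch below uses SQLite through Python's standard `sqlite3` module. It is an assumption-laden toy, not the paper's implementation: the table names `sequence` and `sequence_history`, the trigger name `keep_old_version`, and the sample data are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sequence (
    id       TEXT PRIMARY KEY,
    residues TEXT NOT NULL,
    version  INTEGER NOT NULL DEFAULT 1
);
CREATE TABLE sequence_history (
    id       TEXT,
    version  INTEGER,
    residues TEXT
);
-- Hypothetical trigger: fires only when the residues actually change,
-- archives the old version, then bumps the version counter.
CREATE TRIGGER keep_old_version
AFTER UPDATE OF residues ON sequence
WHEN OLD.residues <> NEW.residues
BEGIN
    INSERT INTO sequence_history VALUES (OLD.id, OLD.version, OLD.residues);
    UPDATE sequence SET version = OLD.version + 1 WHERE id = OLD.id;
END;
""")

# Store a DNA fragment, then edit it; the trigger records the change.
conn.execute("INSERT INTO sequence (id, residues) VALUES ('frag1', 'ATGC')")
conn.execute("UPDATE sequence SET residues = 'ATGCA' WHERE id = 'frag1'")

current = conn.execute(
    "SELECT residues, version FROM sequence WHERE id = 'frag1'"
).fetchone()
history = conn.execute(
    "SELECT id, version, residues FROM sequence_history"
).fetchall()
print(current, history)
```

Because the trigger runs inside the database, the application only issues an ordinary update and the previous version of the fragment is preserved without extra client code.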

Construction of EST Database for Comparative Gene Studies of Acanthamoeba

  • Moon, Eun-Kyung;Kim, Joung-Ok;Xuan, Ying-Hua;Yun, Young-Sun;Kang, Se-Won;Lee, Yong-Seok;Ahn, Tae-In;Hong, Yeon-Chul;Chung, Dong-Il;Kong, Hyun-Hee
    • Parasites, Hosts and Diseases / Vol. 47, No. 2 / pp.103-107 / 2009
  • The genus Acanthamoeba can cause severe infections in humans, such as granulomatous amebic encephalitis and amebic keratitis. However, little genomic information on Acanthamoeba has been reported. Here, we constructed an Acanthamoeba expressed sequence tag (EST) database (Acanthamoeba EST DB) derived from our 4 kinds of Acanthamoeba cDNA libraries. The Acanthamoeba EST DB contains 3,897 ESTs generated from amebae under various conditions (long-term in vitro culture, mouse brain passage, or encystation), together with Acanthamoeba data downloaded from the National Center for Biotechnology Information (NCBI) and the Taxonomically Broad EST Database (TBestDB). Almost all reported cDNA/genomic sequences of Acanthamoeba are provided through a stand-alone BLAST system with nucleotide (BLAST NT) and amino acid (BLAST AA) sequence databases. In the BLAST results, each gene links to relevant information, including sequence data, gene orthology annotations, relevant references, and a BlastX result. This is the first attempt to construct an Acanthamoeba database of genes expressed under diverse conditions. These data were integrated into a database (http://www.amoeba.or.kr).

GEDA: New Knowledge Base of Gene Expression in Drug Addiction

  • Suh, Young-Ju;Yang, Moon-Hee;Yoon, Suk-Joon;Park, Jong-Hoon
    • BMB Reports / Vol. 39, No. 4 / pp.441-447 / 2006
  • Abuse of drugs can elicit compulsive drug seeking behaviors upon repeated administration, and ultimately leads to the phenomenon of addiction. We developed a procedure for the standardization of microarray gene expression data of rat brain in drug addiction and stored them in a single integrated database system, focusing on more effective data processing and interpretation. Another characteristic of the present database is that it has a systematic flexibility for statistical analysis and linking with other databases. Basically, we adopt an intelligent SQL querying system, as the foundation of our DB, in order to set up an interactive module which can automatically read the raw gene expression data in the standardized format. We maximize the usability of this DB, helping users study significant gene expression and identify biological function of the genes through integrated up-to-date gene information such as GO annotation and metabolic pathway. For collecting the latest information of selected gene from the database, we also set up the local BLAST search engine and non-redundant sequence database updated by NCBI server on a daily basis. We find that the present database is a useful query interface and data-mining tool, specifically for finding out the genes related to drug addiction. We apply this system to the identification and characterization of methamphetamine-induced genes' behavior in rat brain.

Construction of PANM Database (Protostome DB) for rapid annotation of NGS data in Mollusks

  • Kang, Se Won;Park, So Young;Patnaik, Bharat Bhusan;Hwang, Hee Ju;Kim, Changmu;Kim, Soonok;Lee, Jun Sang;Han, Yeon Soo;Lee, Yong Seok
    • The Korean Journal of Malacology / Vol. 31, No. 3 / pp.243-247 / 2015
  • A stand-alone BLAST server is available that provides a convenient platform for the analysis of molluscan sequence information, especially EST sequences generated by traditional sequencing methods. However, the server has limitations in annotating molluscan sequences generated on next-generation sequencing (NGS) platforms because of inconsistencies in the molluscan sequences available at NCBI. We constructed a web-based interface for a new stand-alone BLAST, called PANM-DB (Protostome DB), for the analysis of molluscan NGS data. PANM-DB includes the amino acid sequences of the protostome groups Arthropoda, Nematoda, and Mollusca, downloaded from GenBank with the NCBI Taxonomy Browser. The sequences were converted into multi-FASTA format and stored in the database using the formatdb program from NCBI. PANM-DB contains 6% of the NCBInr database sequences (as of 24-06-2015), and for an input of 10,000 RNA-seq sequences, processing was 15 times faster with PANM-DB than with the NCBInr DB. PANM-DB also showed two times more significant hits, with diverse annotation profiles, than the Mollusks DB. Hence, the construction of PANM-DB is a significant step in the annotation of molluscan sequence information obtained from NGS platforms. PANM-DB is freely downloadable from the web-based interface (Malacological Society of Korea, http://malacol.or/kr/blast) as a compressed file and can run on any compatible operating system.