• Title/Summary/Keyword: NCBI

A Study on the distribution of sequences for fast search on NCBI NR-DB (NCBI-NR 데이터베이스의 빠른 검색을 위한 시퀀스 분배에 관한 연구)

  • Ji, Mingeun;Yi, Gangman
    • Proceedings of the Korea Information Processing Society Conference / 2016.04a / pp.646-648 / 2016
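  • To interpret the gene information of a genome faster and more accurately by splitting the genome information into sets with a fixed number of genes, this paper uses a bio database to verify that the gene information within the genome is correct, lets the user sort it arbitrarily, and generates files that hold the genome's gene information with variable gene lengths, so that gene data analysis can be carried out.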
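The splitting step described in this abstract can be illustrated with a minimal Python sketch. It assumes the sequences are available as FASTA text and that "a fixed number of genes" means a fixed number of records per output batch; the function names `read_fasta` and `split_records` and the demo data are hypothetical and not taken from the paper.

```python
from itertools import islice


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_records(records, per_file):
    """Group records into consecutive batches of at most per_file entries."""
    it = iter(records)
    while True:
        batch = list(islice(it, per_file))
        if not batch:
            return
        yield batch


# Made-up demo input: three records, two records per batch.
demo = [">g1", "ATGC", "GG", ">g2", "TTAA", ">g3", "CC"]
batches = list(split_records(read_fasta(demo), per_file=2))
```

Each batch could then be written to its own file and searched independently; sequence lengths vary freely because only the record count per batch is fixed.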

Selection of antigen epitope for Foot and Mouth Disease Virus (FMDV) rapid diagnosis based on bioinformatics (생명정보학 기반 구제역바이러스 특이 진단을 위한 항원 단백질 epitope 선정)

  • Seo, Seung Hwan;Jo, Si Hyang;Lee, Jihoo;Kim, Hak Yong
    • Proceedings of the Korea Contents Association Conference / 2015.05a / pp.223-224 / 2015
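  • Foot-and-mouth disease (FMD) is a highly contagious viral disease that infects cloven-hoofed livestock such as cattle and pigs. Infected animals develop blisters around the mouth, inside the oral cavity, on the nose, and between the hooves, together with high fever and loss of appetite, and they become severely ill or die. Despite its strong transmissibility there is no treatment, and the only measure taken is culling immediately after infection is confirmed in order to prevent spread. Rapid diagnosis of infection is therefore of the utmost importance. Current diagnostic methods are antibody testing, which checks the blood of infected livestock for antibodies against FMD antigen proteins, and virus isolation by cell culture from body fluids such as vesicular fluid, but neither can quickly diagnose FMDV, which has a short incubation period. In this study, to develop a faster FMD diagnostic kit, we used NCBI PubMed to identify six major proteins of FMDV and used NCBI BLAST to select peptidase C28, an antigen protein specific to FMDV, from among the six. Using the amino acid sequence of the selected protein, the epitope regions of peptidase C28 were predicted with the IEDB analysis resource. The amino acid sequences of the predicted regions were compared against normal animals with NCBI BLAST to make the final selection of FMDV-specific antigen protein epitope peptides. A diagnostic kit built on these peptides could block the spread of infection early through faster diagnosis and minimize economic loss and damage.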

A Study on Development of GenBank-based Prototype System for Linking Heterogeneous Content (GenBank를 활용한 이종의 콘텐트 연계 프로토타입 시스템 개발 연구)

  • Ahn, Bu-Young;Shin, Young-Ju;Kim, Dea-Hwan
    • Journal of Information Management / v.40 no.4 / pp.109-133 / 2009
  • Among biological information resources, GenBank, provided by the National Center for Biotechnology Information (NCBI) of the United States, is a representative database of genetic information and is the most widely used by researchers around the world. The Korea Institute of Science and Technology Information (KISTI) visits NCBI on a regular basis and downloads the latest version of GenBank to reorganize the information gathered there into a database. This database is provided to Korean science and technology researchers through the Bio-KRISTAL search engine, developed by KISTI. This study aims to design a service model that links information on papers, patents, biodiversity, and other contents of NDSL, an integrated scientific and technological information service run by KISTI, with GenBank's reference and organism fields, and to develop a prototype system. For this purpose, this paper explores the possibility of a linkage and convergence service between heterogeneous content by: (a) collecting GenBank data from NCBI's FTP site; (b) dividing GenBank text files into basic and reference genetic information and restructuring them into a database; (c) extracting article and patent information from the GenBank reference fields to generate new tables; and (d) leveraging data mapping technology to implement a prototype system where GenBank and NDSL data are interlinked and provided.
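Step (c), extracting article and patent information from the GenBank reference fields, can be sketched as follows. This is a minimal illustration and not the KISTI implementation: it assumes the standard GenBank flat-file layout (keywords within the first 12 columns, continuation lines indented), collects only the TITLE, JOURNAL, and PUBMED sub-fields, and treats a JOURNAL value starting with `Patent:` as a patent reference. The function name `parse_references` and the sample record are made up.

```python
def parse_references(record_text):
    """Collect TITLE/JOURNAL/PUBMED for each REFERENCE block of one record."""
    refs, current, field = [], None, None
    for line in record_text.splitlines():
        keyword = line[:12].strip()   # keywords sit in the first 12 columns
        value = line[12:].strip()
        if keyword == "REFERENCE":
            current = {"TITLE": "", "JOURNAL": "", "PUBMED": ""}
            refs.append(current)
            field = None
        elif keyword and not line.startswith(" "):
            # Any other top-level keyword (e.g. FEATURES) ends the block.
            current, field = None, None
        elif current is not None and keyword in current:
            current[keyword] = value
            field = keyword
        elif current is not None and keyword:
            field = None              # sub-field not collected, e.g. AUTHORS
        elif current is not None and field and value:
            current[field] += " " + value   # continuation line
    for ref in refs:
        is_patent = ref["JOURNAL"].startswith("Patent:")
        ref["kind"] = "patent" if is_patent else "article"
    return refs


# Made-up sample record, for illustration only.
sample = (
    "LOCUS       TEST0001     100 bp    DNA     linear   PAT 01-JAN-2000\n"
    "REFERENCE   1  (bases 1 to 100)\n"
    "  AUTHORS   Doe,J.\n"
    "  TITLE     An example\n"
    "            title\n"
    "  JOURNAL   Example J. 1 (1), 1-2 (2000)\n"
    "   PUBMED   12345\n"
    "REFERENCE   2  (bases 1 to 100)\n"
    "  AUTHORS   Roe,R.\n"
    "  TITLE     A patented sequence\n"
    "  JOURNAL   Patent: US 0000000-A 1 01-JAN-2000;\n"
    "FEATURES             Location/Qualifiers\n"
)
refs = parse_references(sample)
```

The resulting dictionaries could be loaded into separate article and patent tables and then matched against an external catalogue, for example by PubMed ID.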

Korean Reference Genome Construction (한국인 고유유전체 참조표준)

  • Ryu, Je-Un;Kim, Dae-Su;Park, Jong-Hwa
    • Proceedings of the Korean Society for Emotion and Sensibility Conference / 2009.05a / pp.23-26 / 2009
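  • The first Korean whole-genome sequence (KOREF; Korean individual genome sequence) can be used as a reference sequence for Koreans. In January 2009, the whole genome of a Korean male was sequenced with Solexa. It covers 99.83% of the genome produced by the Human Genome Project at NCBI, and enough sequence was determined to cover the NCBI genome sequence about 20-fold, making it a highly accurate Korean-specific genome. Analysis of the Korean genome sequence revealed 3 million Korean-specific SNPs that had not been identified before. The previously reported Chinese genome, although it comes from an ethnic group very close to Koreans, showed a specific difference of 38% (1,217,362 of 3,186,352 SNPs), and comparison of mitochondrial sequences also yielded SNP data showing specific diversity. Next-generation genome sequencing technology has made it possible to obtain human genome data with little effort and cost, and such personal genome data will be a cornerstone of personal genomic medicine.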

Patome: Database of Patented Bio-sequences

  • Kim, SeonKyu;Lee, ByungWook
    • Genomics & Informatics / v.3 no.3 / pp.94-97 / 2005
  • We have built a database server called Patome, which contains annotation information for patented bio-sequences from the Korean Intellectual Property Office (KIPO). The aims of Patome are to annotate Korean patent bio-sequences and to provide information on the patent relationships of public database entries. The patent sequences were annotated with Reference Sequence (RefSeq) or NCBI's nr database. The raw patent data and the annotated data were stored in the database. Annotation information can be used to determine whether a particular RefSeq ID or NCBI nr ID is related to a Korean patent. The Patome infrastructure consists of three components: the database itself, a sequence data loader, and an online database query interface. The database can be queried using submission number, organism, title, applicant name, or accession number. Patome can be accessed at http://www.patome.net. The information will be updated every two months.

Theoretical Peptide Mass Distribution in the Non-Redundant Protein Database of the NCBI

  • Lim Da-Jeong;Oh Hee-Seok;Kim Hee-Bal
    • Genomics & Informatics / v.4 no.2 / pp.65-70 / 2006
  • Peptide mass mapping is the matching of experimentally generated peptide masses with the predicted masses of digested proteins contained in a database. The peptide mass fingerprinting technique identifies proteins by matching their constituent fragment masses to the theoretical peptide masses generated from a protein database. It is therefore important to know the theoretical mass distribution of the database. However, few studies have reported the peptide mass distribution of a database. We analyzed the peptide mass distribution of the non-redundant protein sequence database of the NCBI after digestion with 15 different types of enzymes. To characterize the peptide mass distribution with different digestion enzymes, a power law distribution (Zipf's law) was applied to the distribution. After simulated digestion of the protein database, a rank-frequency plot of peptide fragments was used to generalize a Zipf's law curve for all enzymes. As a result, our data appear to fit Zipf's law with statistically significant parameter values.
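The simulated digestion and rank-frequency steps can be illustrated with a small Python sketch. The abstract does not name the 15 enzymes, so the cleavage rule below (cut after K or R unless the next residue is P, the usual trypsin rule) is an assumption chosen for illustration, as are the 1 Da mass bins and the function names. Masses are standard monoisotopic residue masses plus one water per peptide.

```python
from collections import Counter

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(protein):
    """Cleave after K or R unless the next residue is P (assumed rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of a peptide: residue masses plus one water."""
    return sum(MONO[aa] for aa in peptide) + WATER


def rank_frequency(masses, bin_width=1.0):
    """Return (rank, count) pairs for mass bins, most frequent bin first."""
    counts = Counter(int(m // bin_width) for m in masses)
    ordered = sorted(counts.values(), reverse=True)
    return list(enumerate(ordered, start=1))


peptides = tryptic_digest("AKRPGK")
masses = [peptide_mass(p) for p in peptides]
ranks = rank_frequency([100.2, 100.7, 250.1])
```

Under Zipf's law the count falls off roughly as a power of the rank, so plotting the pairs from `rank_frequency` on log-log axes should give an approximately straight line.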

Construction of PANM Database (Protostome DB) for rapid annotation of NGS data in Mollusks

  • Kang, Se Won;Park, So Young;Patnaik, Bharat Bhusan;Hwang, Hee Ju;Kim, Changmu;Kim, Soonok;Lee, Jun Sang;Han, Yeon Soo;Lee, Yong Seok
    • The Korean Journal of Malacology / v.31 no.3 / pp.243-247 / 2015
  • A stand-alone BLAST server is available that provides a convenient platform for the analysis of molluscan sequence information, especially the EST sequences generated by traditional sequencing methods. However, the server has limitations in annotating molluscan sequences generated on next-generation sequencing (NGS) platforms because of inconsistencies in the molluscan sequences available at NCBI. We constructed a web-based interface for a new stand-alone BLAST, called PANM-DB (Protostome DB), for the analysis of molluscan NGS data. PANM-DB includes the amino acid sequences of the protostome groups Arthropoda, Nematoda, and Mollusca, downloaded from GenBank with the NCBI Taxonomy Browser. The sequences were translated into multi-FASTA format and stored in the database by using the formatdb program from NCBI. PANM-DB contains 6% of the NCBI nr database sequences (as of 24-06-2015), and for an input of 10,000 RNA-seq sequences, processing was 15 times faster with PANM-DB than with the NCBI nr DB. PANM-DB also shows twice as many significant hits, with diverse annotation profiles, as the Mollusks DB. Hence, the construction of PANM-DB is a significant step in the annotation of molluscan sequence information obtained from NGS platforms. PANM-DB is freely downloadable from the web-based interface (Malacological Society of Korea, http://malacol.or/kr/blast) as a compressed file and can run on any compatible operating system.

Development of Local Animal BLAST Search System Using Bioinformatics Tools (생물정보시스템을 이용한 Local Animal BLAST Search System 구축)

  • Kim, Byeong-Woo;Lee, Geun-Woo;Kim, Hyo-Seon;No, Seung-Hui;Lee, Yun-Ho;Kim, Si-Dong;Jeon, Jin-Tae;Lee, Ji-Ung;Jo, Yong-Min;Jeong, Il-Jeong;Lee, Jeong-Gyu
    • Bioinformatics and Biosystems / v.1 no.2 / pp.99-102 / 2006
  • The Basic Local Alignment Search Tool (BLAST) is one of the most established software tools in bioinformatics research; it compares a query sequence against libraries of known sequences to investigate sequence similarity. Expressed Sequence Tags (ESTs) are single-pass sequence reads from mRNA (or cDNA); they represent the expression of a given cDNA library and give a snapshot of the genes expressed in a given tissue and/or at a given developmental stage. ESTs can therefore be very valuable for functional genomics and bioinformatics research. Although major bio database (DB) websites, including NCBI, provide BLAST services and EST data, local DBs and search systems are in demand for better performance and security. Here we present animal EST DBs and a local BLAST search system. The animal EST DBs in NCBI GenBank were divided by animal species using a Perl script we developed, and we also built a new extended DB search system for the new data (Local Animal BLAST Search System: http://bioinfo.kohost.net), which was constructed on a high-capacity PC cluster system for the best performance. The new local DB contains 650,046 sequences for Bos taurus (cattle), 368,120 sequences for Sus scrofa (pig), and 693,005 sequences for Gallus gallus (fowl).
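The per-species division of EST records might look like the following sketch. The authors used their own Perl script, which is not shown in the abstract; this Python version is only an illustration, and it assumes that each FASTA header contains the species binomial name somewhere in its text. The function name `group_by_species` and the demo headers are made up.

```python
from collections import defaultdict


def group_by_species(records, species_names):
    """Assign each (header, sequence) record to the first species named in its header."""
    groups = defaultdict(list)
    for header, seq in records:
        lowered = header.lower()
        # Assumption: the binomial name appears verbatim in the header.
        label = next(
            (name for name in species_names if name.lower() in lowered),
            "unassigned",
        )
        groups[label].append((header, seq))
    return dict(groups)


# Made-up demo records, for illustration only.
records = [
    ("est1 Bos taurus cDNA clone (made-up header)", "ATGC"),
    ("est2 Sus scrofa cDNA clone (made-up header)", "GGCC"),
    ("est3 unknown organism", "TTAA"),
]
groups = group_by_species(records, ["Bos taurus", "Sus scrofa", "Gallus gallus"])
```

Each group could then be written to its own FASTA file and indexed as a separate local BLAST database.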

Phylogenetic Relationships and Cultural Characteristics among Inonotus obliquus Strains Collected in Korea (국내 수집 차가버섯 균주의 배양특성과 유전적 유연관계 분석)

  • Park, Hyun;Park, Won-Chull;Yoon, Kab-Hee;Chang, Ji-Youn;Ryu, Sung-Ryul;Ka, Kang-Hyeon;Lee, Bong-Hun
    • The Korean Journal of Mycology / v.35 no.1 / pp.28-32 / 2007
  • Fruiting bodies of Inonotus obliquus were collected from the trunks of Betula ermani at 1,100 m on Mt. Odae. The diameter at breast height (DBH) of the trees ranged from 10 to 50 cm, and the sclerotia ranged in size from 8 × 5 cm to 20 × 16 cm. The examined strains were very closely related to the Inonotus obliquus strain registered in the National Center for Biotechnology Information (NCBI). All 10 strains, excluding the strains registered in NCBI, showed high homology in neighbour-joining analysis of ITS sequences. Mycelial growth differed greatly among strains: KFRI 744 grew fastest and KFRI 739 slowest. The difference in mycelial growth between KFRI 735 and 738 was slight, but the difference between KFRI 744 and 739 was almost twofold. The weight reduction rate also differed somewhat among strains: it was highest for KFRI 744 and lowest for KFRI 741. Vegetative incompatibility was observed in all mycelial pairings except the KFRI 740-741 and KFRI 742-743 combinations.