• Title/Summary/Keyword: Contig

64 search results (processing time: 0.021 s)

Computational Detection of Prokaryotic Core Promoters in Genomic Sequences

  • Kim Ki-Bong;Sim Jeong Seop
    • Journal of Microbiology / Vol. 43, No. 5 / pp. 411-416 / 2005
  • The high-throughput sequencing of microbial genomes has led to the rapid accumulation of an enormous amount of genomic sequence data. In this context, the computational detection of promoters in genomic DNA sequences has attracted considerable research attention in recent years. This paper describes the development of a predictive model, the dependence decomposition weight matrix model (DDWMM), designed to detect the core promoter region, including the -10 region and the transcription start sites (TSSs), in prokaryotic genomic DNA sequences, a task of some importance for genome annotation. The model captures the most significant dependencies between positions (nonadjacent as well as adjacent) via the maximal dependence decomposition (MDD) procedure, which iteratively decomposes data sets into subsets based on the significant dependence between positions in the promoter region to be modeled. Such dependencies may be closely related to biological and structural considerations, since promoter elements occur in a variety of combinations separated by various distances. In this respect, the DDWMM may prove appropriate for detecting core promoter regions and TSSs in long microbial genomic contigs. To demonstrate the effectiveness of the model, we performed 10-fold cross-validation experiments on 607 experimentally verified promoter sequences, which showed good performance in terms of sensitivity.
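
The decomposition step mentioned in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the commonly described form of MDD, in which a chi-square statistic measures the dependence between "position i matches its consensus base" and "the base observed at position j", and the data set is split at the position with the largest summed statistic. The function names `chi_square` and `mdd_split` are hypothetical, and the sketch performs a single split without the significance threshold or the recursion that a full procedure would need.

```python
from collections import Counter

ALPHABET = "ACGT"


def chi_square(seqs, i, j, consensus_i):
    """Chi-square statistic of the 2x4 table: (consensus at i?) vs (base at j)."""
    table = {True: Counter(), False: Counter()}
    for s in seqs:
        table[s[i] == consensus_i][s[j]] += 1
    n = len(seqs)
    row_tot = {r: sum(table[r].values()) for r in (True, False)}
    col_tot = {b: table[True][b] + table[False][b] for b in ALPHABET}
    stat = 0.0
    for r in (True, False):
        for b in ALPHABET:
            expected = row_tot[r] * col_tot[b] / n
            if expected > 0:  # skip empty rows/columns
                stat += (table[r][b] - expected) ** 2 / expected
    return stat


def mdd_split(seqs):
    """One MDD step (assumed form): split on the most dependent position."""
    length = len(seqs[0])
    consensus = [
        Counter(s[i] for s in seqs).most_common(1)[0][0] for i in range(length)
    ]
    # For each position i, sum its dependence with every other position j.
    scores = [
        sum(chi_square(seqs, i, j, consensus[i]) for j in range(length) if j != i)
        for i in range(length)
    ]
    best = max(range(length), key=scores.__getitem__)
    match = [s for s in seqs if s[best] == consensus[best]]
    rest = [s for s in seqs if s[best] != consensus[best]]
    return best, match, rest


if __name__ == "__main__":
    # Toy aligned sites: positions 0 and 2 vary together, position 1 is fixed.
    toy = ["AAA"] * 6 + ["CAC"] * 4
    position, with_consensus, without_consensus = mdd_split(toy)
    print(position, len(with_consensus), len(without_consensus))
```

In a complete model, each resulting subset would be split again while the dependence remains significant, and a separate weight matrix would be estimated for each final subset.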

Draft genome sequence of Lactobacillus salivarius KLW001 isolated from a weaning piglet

  • 진귀득;이준영;김은배
    • 미생물학회지 / Vol. 53, No. 2 / pp. 134-136 / 2017
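  • To develop probiotics for weaning piglets, we isolated Lactobacillus salivarius strain KLW001, a lactic acid bacterium, from a weaning piglet raised on a pig farm in the Republic of Korea. Because this strain showed stronger antimicrobial activity against K88 antigen-positive Escherichia coli and Salmonella enterica serovar Typhimurium than other strains, we analyzed its genome. The 166 contigs (${\geq}500$ bp) of the draft genome had a G+C content of 33.0% and yielded 2,326,706 bp of sequence. From the draft genome, we identified 4 antibiotic resistance genes, 36 phage-related genes, 3 bile metabolism genes, and 27 CRISPR spacers.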
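
The summary figures reported in draft-genome abstracts like this one (number of contigs above a length cutoff, total size, G+C content) can be derived from the contig sequences with a few lines of code. The sketch below is only an illustration under stated assumptions: contigs are given as plain strings, the 500 bp cutoff follows the abstract, and the function name `contig_summary` is hypothetical rather than part of any assembler or annotation tool.

```python
def contig_summary(contigs, min_len=500):
    """Summarise an assembly: contig count, total size and G+C percentage.

    `contigs` is an iterable of nucleotide strings; contigs shorter than
    `min_len` are ignored, mirroring the cutoff used in the abstract.
    """
    kept = [c.upper() for c in contigs if len(c) >= min_len]
    total_bp = sum(len(c) for c in kept)
    gc_bp = sum(c.count("G") + c.count("C") for c in kept)
    gc_percent = round(100 * gc_bp / total_bp, 1) if total_bp else 0.0
    return {"n_contigs": len(kept), "total_bp": total_bp, "gc_percent": gc_percent}


if __name__ == "__main__":
    # Two toy contigs pass the cutoff; the 4 bp fragment is dropped.
    print(contig_summary(["GC" * 300, "AT" * 300, "GGGG"]))
```

Ambiguous bases such as N are counted toward the total length but not toward G+C in this sketch; other tools may treat them differently.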

Complete genome sequence of Flavobacteriaceae strain KCTC 52651 isolated from seawater recirculating aquaculture system

  • 김영삼;전용재;김경호
    • 미생물학회지 / Vol. 55, No. 2 / pp. 174-176 / 2019
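  • A novel strain belonging to the family Flavobacteriaceae, RR4-38 (= KCTC 52651 = DSM 108068), was isolated from the biofilter of a seawater recirculating aquaculture system in Korea. A single complete genome contig of 3,182,272 bp with a G+C content of 41.9% was obtained using the PacBio RS II. The genome contains 2,829 protein-coding genes, 6 rRNA genes, 38 tRNA genes, 4 ncRNA genes, and 9 pseudogenes. These results will provide insight into microbial activity in seawater recirculating aquaculture systems.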

Draft genome sequence of oligosaccharide producing Leuconostoc lactis CCK940 isolated from kimchi in Korea

  • 이설희;박영서
    • 미생물학회지 / Vol. 54, No. 4 / pp. 445-447 / 2018
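  • Leuconostoc lactis CCK940, isolated from kimchi purchased at a traditional market in Korea, produced oligosaccharides with a degree of polymerization of 4 or higher from sucrose and maltose. The genome of L. lactis CCK940 was assembled into a chromosome of 1,741,511 bp consisting of 2 contigs, with a G+C content of 43.33%. In the chromosomal DNA, 1,698 coding genes, 12 rRNA genes, and 68 tRNA genes were identified. L. lactis CCK940 carried glucosyltransferase biosynthesis genes capable of producing oligosaccharides, such as sucrose phosphorylase, maltose phosphorylase, and ${\beta}$-galactosidase.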

The complete genome sequence of a white spot syndrome virus isolated from Litopenaeus vannamei

  • 이아름;공경희;김휘진;오명주;김도형;김종오;김위식
    • 한국어병학회지 / Vol. 35, No. 1 / pp. 129-133 / 2022
  • The full genome sequence of a Korean white spot syndrome virus (WSSV, isolate: WSSV-GoC18) is presented here. We obtained a total of 12,320,554 reads, which were assembled into 1 contig of 291,172 bases containing 170 genes and 170 coding DNA sequences. Phylogenetic analysis revealed that WSSV-GoC18 was closely related to a Chinese isolate (WSSV-PC) and distinctly different from a previously reported Korean isolate (WSSV K-LV1). The complete genome sequences of WSSV isolates will be of great help in molecular epidemiological studies, contributing to molecular diagnosis and disease prevention in shrimp aquaculture.

Draft Genome Sequence of Meropenem-Resistant Pseudomonas peli CJ30, Isolated from the Han River, South Korea

  • 김용석;차창준
    • 한국미생물·생명공학회지 / Vol. 51, No. 2 / pp. 214-216 / 2023
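  • Pseudomonas peli strain CJ30, which is resistant to meropenem, was isolated from the Han River in South Korea. The genome of strain CJ30 was assembled into nine contigs with a size of 4,919,106 bp and a G + C content of 60.0%. The genome sequence of strain CJ30 contained 5,411 protein-coding genes, 18 rRNA genes, and 70 tRNA genes. Strain CJ30 carries the blaSFC-3 and ampC β-lactamase genes.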

Comparative analysis of HiSeq3000 and BGISEQ-500 sequencing platform with shotgun metagenomic sequencing data

  • Animesh Kumar;Espen M. Robertsen;Nils P. Willassen;Juan Fu;Erik Hjerde
    • Genomics & Informatics / Vol. 21, No. 4 / pp. 49.1-49.11 / 2023
  • Recent advances in sequencing technologies have made it possible to generate metagenomic sequences on different sequencing platforms. In this study, we analyzed and compared shotgun metagenomic sequences generated by the HiSeq3000 and BGISEQ-500 platforms from 12 sediment samples collected across the Norwegian coast. Metagenomic DNA sequences were normalized to an equal number of bases for both platforms and further evaluated using different taxonomic classifiers, reference databases, and assemblers. Normalized BGISEQ-500 sequences retained more reads and base counts after preprocessing, while a slightly higher fraction of HiSeq3000 sequences were taxonomically classified. Kaiju classified a higher percentage of reads than Kraken2 for both platforms, and a comparison of reference databases for taxonomic classification showed that the MAR database outperformed RefSeq. Assembly using MEGAHIT produced longer assemblies and a higher total contig count than metaSPAdes in the majority of HiSeq3000 samples, but the assembly statistics notably improved with unprocessed or normalized reads. Our results indicate that both platforms perform comparably in terms of the percentage of taxonomically classified reads and assembled contig statistics for metagenomic samples. This study provides valuable insights for researchers selecting an appropriate sequencing platform and bioinformatics pipeline for their metagenomic studies.

Construction of a full-length cDNA library from Pinus koraiensis and analysis of EST dataset

  • 김준기;임수빈;최선희;이종석;노승문;임용표
    • 농업과학연구 / Vol. 38, No. 1 / pp. 11-16 / 2011
  • In this study, we report the generation and analysis of a total of 1,211 expressed sequence tags (ESTs) from Pinus koraiensis. A cDNA library was generated from young leaf tissue, and a total of 1,211 cDNAs were partially sequenced. EST and unigene sequence quality was determined by computational filtering, manual review, and BLAST analyses. In all, 857 ESTs were retained after removal of the vector sequence and filtering for a minimum length of 50 nucleotides. A total of 411 unigenes, consisting of 89 contigs and 322 singletons, were identified after assembly. We also identified 77 new microsatellite-containing sequences from the unigenes and classified their structure according to repeat unit. According to a homology search with BLASTX against the NCBI database, 63.1% of ESTs were homologous to sequences with known function and 22.2% of ESTs matched sequences with putative or unknown function. The remaining 14.6% of ESTs showed no significant similarity to any protein sequence in the public database. Gene ontology (GO) classification showed that the most abundant GO terms were transport, nucleotide binding, and plastid in terms of biological process, molecular function, and cellular component, respectively. The sequence data will be used to characterize the potential roles of new genes in Pinus and will be provided as a useful genetic resource.
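
Classifying microsatellites by repeat unit, as described in this abstract, can be sketched with a regular expression. The abstract does not state the search criteria, so the unit lengths (2 to 6 bases) and the minimum of 4 tandem copies used here are assumptions, and `find_microsatellites` is a hypothetical helper, not the tool used in the study. Only perfect (uninterrupted) repeats are reported.

```python
import re


def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, copies) for each perfect tandem repeat in `seq`.

    The thresholds are illustrative assumptions; the lazy quantifier makes
    the regex prefer the shortest repeat unit that satisfies them.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        # Skip units that are themselves repeats of a shorter motif
        # (e.g. "AA" in a homopolymer run), using the (s+s)[1:-1] test.
        if unit in (unit + unit)[1:-1]:
            continue
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits


if __name__ == "__main__":
    for start, unit, copies in find_microsatellites("GGCATATATATCCCAGCAGCAGCAG"):
        print(start, unit, copies, f"{len(unit)}-nucleotide repeat")
```

Grouping the hits by `len(unit)` gives the di-, tri-, and longer repeat classes; compound or imperfect repeats would need a more permissive search.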

Localization of 5,105 Hanwoo (Korean Cattle) BAC Clones on Bovine Chromosomes by the Analysis of BAC End Sequences (BESs) Involving 21,024 Clones

  • Choi, Jae Min;Chae, Sung-Hwa;Kang, Se Won;Choi, Dong-Sik;Lee, Yong Seok;Park, Hong-Seog;Yeo, Jung-Sou;Choi, Inho
    • Asian-Australasian Journal of Animal Sciences / Vol. 20, No. 11 / pp. 1636-1650 / 2007
  • As an initial step toward a better understanding of the genome structure of Korean cattle (Hanwoo breed) and toward a framework for genomic research in this breed, bacterial artificial chromosome (BAC) end sequencing of 21,024 clones was recently completed. Among these, the BAC End Sequences (BESs) of 20,158 clones with high-quality sequences (Phred score ${\geq}20$; average BES length 620 bp; 23,585,814 bp in total), obtained after removing vector sequences, were first compared with the known bovine chromosomal DNA sequence using BLASTN. BLAST analysis of the BESs against the NCBI Genome database for Bos taurus (Build 2.1) indicated that the BESs from 13,201 clones matched bovine contig sequences with significant BLAST hits ($E < e^{-40}$), including 7,075 single-end hits and 6,126 paired-end hits. Finally, a total of 5,105 Korean cattle BAC clones with paired-end hits, including 4,053 clones from the primary analysis and 1,052 clones from the secondary analysis, were mapped to the bovine chromosomes with very high accuracy.
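
The single-end and paired-end tallies in this abstract (7,075 + 6,126 = 13,201 clones) correspond to a simple classification of each clone by how many of its two end sequences have a significant hit. The sketch below assumes the best E-value per clone end has already been extracted from the BLAST output; the input layout, the function name `classify_clones`, and the example values are hypothetical. The cutoff is left as a required parameter because the abstract's threshold notation is not interpreted here.

```python
def classify_clones(hits, max_evalue):
    """Count clones with paired-end, single-end or no significant hits.

    `hits` maps a clone name to a (forward, reverse) pair holding the best
    E-value for each end sequence, or None when that end has no hit.
    """
    counts = {"paired": 0, "single": 0, "none": 0}
    for forward, reverse in hits.values():
        significant = sum(
            e is not None and e < max_evalue for e in (forward, reverse)
        )
        counts[("none", "single", "paired")[significant]] += 1
    return counts


if __name__ == "__main__":
    example = {
        "clone_a": (1e-50, 1e-60),  # both ends significant
        "clone_b": (1e-50, None),   # reverse end has no hit
        "clone_c": (None, None),    # no hits at all
    }
    print(classify_clones(example, max_evalue=1e-40))
```

Placing a clone on a chromosome would additionally require that both ends hit the same contig with a plausible orientation and spacing, which this sketch does not check.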

KUGI: A Database and Search System for Korean Unigene and Pathway Information

  • Yang, Jin-Ok;Hahn, Yoon-Soo;Kim, Nam-Soon;Yu, Ung-Sik;Woo, Hyun-Goo;Chu, In-Sun;Kim, Yong-Sung;Yoo, Hyang-Sook;Kim, Sang-Soo
    • 한국생물정보학회:학술대회논문집 / 한국생물정보시스템생물학회, BIOINFO 2005 (2005) / pp. 407-411 / 2005
  • The KUGI (Korean UniGene Information) database contains annotation information for cDNA sequences obtained from samples of diseases prevalent in Korea. A total of about 157,000 high-throughput 5'-EST sequences, collected from cDNA libraries of stomach, liver, and some cancer tissues or established cell lines from Korean patients, were clustered into about 35,000 contigs. From each cluster, a representative clone having the longest high-quality sequence or the start codon was selected. We stored the sequences of the representative clones and the clustered contigs in the KUGI database, together with the information obtained by running BLAST against the RefSeq, human mRNA, and UniGene databases from NCBI. We provide a web-based search engine for the KUGI database with two types of user interface: attribute-based search and sequence similarity search. For attribute-based search we use DBMS technology, while for similarity search we use BLAST, which supports various similarity search options. The search system allows not only multiple queries but also various query types. The results include: 1) information on clones and libraries, 2) accession keys, location on the genome, gene ontology, and pathways in public databases, 3) links to external programs, and 4) sequence information for contigs and the 5'-ends of clones. We believe that the KUGI database and search system may provide very useful information for studies elucidating the causes of diseases that are prevalent in Korea.
