• Title/Summary/Keyword: Whole genome

The Design and Implementation of Web-Based Integrated Genome Analysis Tools (웹 기반 통합 유전체 분석 시스템의 설계 및 구현)

  • 최범순;이경희;권해룡;조완섭;이충세;김영창
    • Journal of Korea Multimedia Society / v.7 no.3 / pp.408-417 / 2004
  • The genome analysis process requires several steps using various software analysis tools. We propose WGAT (Web-based Genome Analysis Tool), which combines several gene analysis tools and provides a graphical user interface. Software tools for gene analysis are mostly Linux- or Unix-based programs, which are difficult for biologists to install and use. Furthermore, files generated during gene analysis frequently require manual transformation before they can serve as input to the next step. Recently developed Web-based tools process only one sequence at a time, so analyzing a large data file requires many repetitive runs. WGAT was developed to support Web-based genome analysis that is both easy to use and fast. Whole-genome data analysis can be done by running WGAT on a Linux server and supplying sequence data files with various options, so many steps of the analysis are performed automatically by the system. Simulation shows that WGAT gives a 20-fold faster analysis when the input contains one thousand sequence segments.

Genomic Approaches for Understanding the Characteristics of Salmonella enterica subsp. enterica Serovar Typhimurium ST1120, Isolated from Swine Feces in Korea

  • Kim, Seongok;Kim, Eunsuk;Park, Soyeon;Hahn, Tae-Wook;Yoon, Hyunjin
    • Journal of Microbiology and Biotechnology / v.27 no.11 / pp.1983-1993 / 2017
  • Salmonella enterica subsp. enterica serovar Typhimurium, one of the most common foodborne pathogens, is transmitted mainly through contaminated food derived from infected animals. In this study, S. Typhimurium ST1120, an isolate from pig feces in Korea, was subjected to whole-genome analysis to understand its genomic features associated with virulence. The genome of ST1120 was found to have a circular chromosome of 4,855,001 bp (GC content 52.2%) and a plasmid of 6,863 bp (GC content 46.0%). This chromosome was predicted to have 4,558 open reading frames (ORFs), 17 pseudogenes, 22 rRNA genes, and 86 tRNA genes. Its plasmid was predicted to have three ORFs. Comparative genome analysis revealed that ST1120 was phylogenetically close to S. Typhimurium U288, a critical isolate in piggery farms and food chains in Europe. In silico functional analysis predicted that the ST1120 genome harbored multiple genes associated with virulence and stress resistance, including Salmonella pathogenicity islands (SPIs containing SPI-1 to SPI-5, SPI-13, and SPI-14), C63PI locus, ST104 prophage locus, and various antibiotic resistance genes. In accordance with these analysis results, ST1120 showed competence in invasion and survival abilities when it was added to host cells. It also exhibited robust resistance against antibiotics in comparison with other S. Typhimurium strains. This is the first report of the complete genome sequence of S. Typhimurium isolated from swine in Korea. Comparative genome analysis between ST1120 and other Salmonella strains would provide fruitful information toward understanding Salmonella host specificity and developing control measures against S. Typhimurium infection.

Complete Genome Sequencing of Bacillus velezensis WRN014, and Comparison with Genome Sequences of other Bacillus velezensis Strains

  • Wang, Junru;Xing, Juyuan;Lu, Jiangkun;Sun, Yingjiao;Zhao, Juanjuan;Miao, Shaohua;Xiong, Qin;Zhang, Yonggang;Zhang, Guishan
    • Journal of Microbiology and Biotechnology / v.29 no.5 / pp.794-808 / 2019
  • Bacillus velezensis strain WRN014 was isolated from banana fields in Hainan, China. Bacillus velezensis is an important member of the plant growth-promoting rhizobacteria (PGPR), which can enhance plant growth and control soil-borne disease. The complete genome of Bacillus velezensis WRN014 was sequenced by combining the Illumina HiSeq 2500 system and Pacific Biosciences SMRT high-throughput sequencing technologies. The genome of Bacillus velezensis WRN014 was then compared with 45 other complete genome sequences of Bacillus velezensis strains. The genome of Bacillus velezensis WRN014 was 4,063,541 bp in length and contained 4,062 coding sequences, 9 genomic islands, and 13 gene clusters. The results of the comparative genomic analysis provide evidence that (i) the 46 Bacillus velezensis strains formed 2 closely related clades in phylogenetic trees; (ii) the pangenome in this study is open and increases with the addition of newly sequenced genomes; (iii) analysis of single nucleotide polymorphisms (SNPs) revealed local diversification of the 46 Bacillus velezensis genomes, and, surprisingly, SNPs were not evenly distributed throughout the whole genome; and (iv) analysis of gene clusters revealed that rich gene clusters are spread across Bacillus velezensis strains and that some gene clusters are conserved in different strains. This study reveals that strain WRN014 and other Bacillus velezensis strains have the potential to be used as PGPR and biopesticides.

Assessment of Erythrobacter Species Diversity through Pan-Genome Analysis with Newly Isolated Erythrobacter sp. 3-20A1M

  • Cho, Sang-Hyeok;Jeong, Yujin;Lee, Eunju;Ko, So-Ra;Ahn, Chi-Yong;Oh, Hee-Mock;Cho, Byung-Kwan;Cho, Suhyung
    • Journal of Microbiology and Biotechnology / v.31 no.4 / pp.601-609 / 2021
  • Erythrobacter species are extensively studied marine bacteria that produce various carotenoids. Due to their photoheterotrophic ability, it has been suggested that they play a crucial role in marine ecosystems. It is essential to identify the genome sequence and the genes of the species to predict their role in the marine ecosystem. In this study, we report the complete genome sequence of the marine bacterium Erythrobacter sp. 3-20A1M. The genome size was 3.1 Mbp and its GC content was 64.8%. In total, 2998 genetic features were annotated, of which 2882 were annotated as functional coding genes. Using the genetic information of Erythrobacter sp. 3-20A1M, we performed pan-genome analysis with other Erythrobacter species. This revealed highly conserved secondary metabolite biosynthesis-related COG functions across Erythrobacter species. Through subsequent secondary metabolite biosynthetic gene cluster prediction and KEGG analysis, the carotenoid biosynthetic pathway was proven conserved in all Erythrobacter species, except for the spheroidene and spirilloxanthin pathways, which are only found in photosynthetic Erythrobacter species. The presence of virulence genes, especially the plant-algae cell wall degrading genes, revealed that Erythrobacter sp. 3-20A1M is a potential marine plant-algae scavenger.

Draft Genome Sequence of the Reference Strain of the Korean Medicinal Mushroom Wolfiporia cocos KMCC03342

  • Bogun Kim;Byoungnam Min;Jae-Gu Han;Hongjae Park;Seungwoo Baek;Subin Jeong;In-Geol Choi
    • Mycobiology / v.50 no.4 / pp.254-257 / 2022
  • Wolfiporia cocos is a wood-decay brown rot fungus belonging to the family Polyporaceae. While the fungus grows, the sclerotium body of the strain, dubbed Bokryeong in Korean, is formed around the roots of conifer trees. The dried sclerotium has been widely used as a key component of many medicinal recipes in East Asia. Wolfiporia cocos strain KMCC03342 is the reference strain registered and maintained by the Korea Seed and Variety Service for commercial uses. Here, we present the first draft genome sequence of W. cocos KMCC03342 using a hybrid assembly technique combining both short- and long-read sequences. The genome has a total length of 55.5 Mb comprised of 343 contigs with N50 of 332 kb and 95.8% BUSCO completeness. The GC ratio was 52.2%. We predicted 14,296 protein-coding gene models based on ab initio gene prediction and evidence-based annotation procedure using RNAseq data. The annotated genome was predicted to have 19 terpene biosynthesis gene clusters, which was the same number as the previously sequenced W. cocos strain MD-104 genome but higher than Chinese W. cocos strains. The genome sequence and the predicted gene clusters allow us to study biosynthetic pathways for the active ingredients of W. cocos.
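
The N50 statistic quoted in the abstract above can be illustrated with a minimal Python sketch. The function name `n50` and the toy contig lengths are assumptions made for illustration, not data from the W. cocos assembly. N50 is taken here as the length of the contig at which the running total, counted from the longest contig downward, first reaches half of the total assembly length.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (hypothetical helper)."""
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating their lengths.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Stop at the first contig that brings the sum to half the assembly.
        if 2 * running >= total:
            return length


# Toy example (made-up lengths): total is 20, and 6 + 5 = 11 covers half of it.
print(n50([2, 3, 4, 5, 6]))
```

A larger N50 at the same total length means the assembly is made of fewer, longer contigs.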

'Drawing' a Molecular Portrait of CIN and Cervical Cancer: a Review of Genome-Wide Molecular Profiling Data

  • Kurmyshkina, Olga V;Kovchur, Pavel I;Volkova, Tatyana O
    • Asian Pacific Journal of Cancer Prevention / v.16 no.11 / pp.4477-4487 / 2015
  • In this review we summarize the results of studies employing high-throughput methods of profiling of HPV-associated cervical intraepithelial neoplasia (CIN) and squamous cell cervical cancers at key intracellular regulatory levels to demonstrate the unique identity of the landscape of molecular changes underlying this oncopathology, and to show how these changes are related to the 'natural history' of cervical cancer progression and the formation of clinically significant properties of tumors. A step-wise character of cervical cancer progression is a morphologically well-described fact and, as evidenced by genome-wide screenings, it is indeed the consistent change of the molecular profiles of HPV-infected epithelial cells through which they progressively acquire the phenotypic hallmarks of cancerous cells. In this sense, CIN/cervical cancer is a unique model for studying the driving forces and mechanisms of carcinogenesis. Recent research has allowed definition of the whole-genome spectrum of both random and regular molecular alterations, as well as changes either common to processes of carcinogenesis or specific for cervical cancer. Despite the existence of questions that are still to be investigated, these findings are of great value for the future development of approaches for the diagnostics and treatment of cervical neoplasms.

Identify Major Gene-Gene Interaction Effects Using SNPHarvester (SNPHarvester를 활용한 주요 유전자 상호작용 효과 감명)

  • Lee, Jea-Young;Kim, Dong-Chul
    • Communications for Statistical Applications and Methods / v.16 no.6 / pp.915-923 / 2009
  • Genome-wide association (GWA) research searches among numerous genes for those related to human disease. However, most current statistical methods used to detect gene-gene interactions in disease association studies cannot easily be applied to whole-genome association studies (GWAS) because of the heavy computation involved. SNPHarvester was therefore developed to find the main gene groups among numerous genes. This research uses SNPHarvester to find, among sets of SNPs, the superior gene groups related to the economic traits of Korean beef cattle rather than to human traits, and also finds the superior genotypes among the SNP groups that can enhance various qualities of Korean beef.

The Complete Genome Sequence of Southern rice black-streaked dwarf virus Isolated from Vietnam

  • Dinh, Thi-Sau;Zhou, Cuiji;Cao, Xiuling;Han, Chenggui;Yu, Jialin;Li, Dawei;Zhang, Yongliang
    • The Plant Pathology Journal / v.28 no.4 / pp.428-432 / 2012
  • We determined the complete genome sequence of a Vietnamese isolate of Southern rice black-streaked dwarf virus (SRBSDV). Whole-genome comparisons and phylogenetic analysis showed that the genome of the Vietnamese isolate shared high nucleotide sequence identities of over 97.5% with those of the reported Chinese isolates, confirming their common origin. Moreover, the greatest divergence between different SRBSDV isolates was found in segments S1, S3, S4, and S6, which differs from the sequence alignment results between SRBSDV and Rice black-streaked dwarf virus (RBSDV), implying that SRBSDV evolved in a unique way independent of RBSDV. This is the first report of a complete nucleotide sequence of SRBSDV from Vietnam, and our data provide new clues for further understanding of the molecular variation and epidemiology of SRBSDV in Southeast Asia.

Genome-Wide Identification and Characterization of Novel Laccase Genes in the White-Rot Fungus Flammulina velutipes

  • Kim, Hong-Il;Kwon, O-Chul;Kong, Won-Sik;Lee, Chang-Soo;Park, Young-Jin
    • Mycobiology / v.42 no.4 / pp.322-330 / 2014
  • The aim of this study was to identify and characterize new Flammulina velutipes laccases from its whole-genome sequence. Of the 15 putative laccase genes detected in the F. velutipes genome, four new laccase genes (fvLac-1, fvLac-2, fvLac-3, and fvLac-4) were found to contain four complete copper-binding regions (ten histidine residues and one cysteine residue) and four cysteine residues involved in forming disulfide bridges. fvLac-1, fvLac-2, fvLac-3, and fvLac-4 encode proteins consisting of 516, 518, 515, and 533 amino acid residues, respectively. Potential N-glycosylation sites (Asn-Xaa-Ser/Thr) were identified in the cDNA sequence of fvLac-1 (Asn-454), fvLac-2 (Asn-437 and Asn-455), fvLac-3 (Asn-111 and Asn-237), and fvLac-4 (Asn-402 and Asn-457). In addition, the first 19-20 amino acid residues of these proteins were predicted to comprise signal peptides. Laccase activity assays and reverse transcription polymerase chain reaction analyses clearly reveal that CuSO4 affects the induction and the transcription level of these laccase genes.
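
The Asn-Xaa-Ser/Thr pattern named in the abstract above can be searched for in a protein sequence with a short regular expression. The following is a minimal Python sketch, not the authors' prediction method. The function name `find_sequons`, the toy peptide, and the option to exclude proline at the Xaa position (a common convention that the abstract does not state) are assumptions for illustration.

```python
import re


def find_sequons(protein, exclude_proline=True):
    """Return 1-based positions of Asn residues in N-X-S/T motifs.

    Hypothetical helper: `protein` is a one-letter amino acid string.
    """
    # Xaa is any residue; optionally disallow proline (assumed convention).
    middle = "[^P]" if exclude_proline else "."
    # A lookahead keeps overlapping motifs (e.g. "NNTS") from being skipped.
    pattern = re.compile(rf"N(?={middle}[ST])")
    return [m.start() + 1 for m in pattern.finditer(protein.upper())]


# Toy peptide (made up): Asn-2 and Asn-10 match; Asn-6 is followed by Pro.
print(find_sequons("MNATKNPSANGT"))
```

A match only marks a candidate site; whether it is actually glycosylated has to be confirmed experimentally.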

Short Reads Phasing to Construct Haplotypes in Genomic Regions That Are Associated with Body Mass Index in Korean Individuals

  • Lee, Kichan;Han, Seonggyun;Tark, Yeonjeong;Kim, Sangsoo
    • Genomics & Informatics / v.12 no.4 / pp.165-170 / 2014
  • Genome-wide association (GWA) studies have found many important genetic variants that affect various traits. Since these studies are useful to investigate untyped but causal variants using linkage disequilibrium (LD), it would be useful to explore the haplotypes of single-nucleotide polymorphisms (SNPs) within the same LD block of significant associations based on high-density variants from population references. Here, we tried to make a haplotype catalog affecting body mass index (BMI) through an integrative analysis of previously published whole-genome next-generation sequencing (NGS) data of 7 representative Korean individuals and previously known Korean GWA signals. We selected 435 SNPs that were significantly associated with BMI from the GWA analysis and searched 53 LD ranges nearby those SNPs. With the NGS data, the haplotypes were phased within the LDs. A total of 44 possible haplotype blocks for Korean BMI were cataloged. Although the current result constitutes little data, this study provides new insights that may help to identify important haplotypes for traits and low variants nearby significant SNPs. Furthermore, we can build a more comprehensive catalog as a larger dataset becomes available.