• Title/Abstract/Keywords: High-quality genome assembly

9 search results (processing time: 0.029 seconds)

Hybrid Fungal Genome Annotation Pipeline Combining ab initio, Evidence-, and Homology-based gene model evaluation

  • Min, Byoungnam;Choi, In-Geol
    • Korean Society of Mycology Newsletter: Conference Proceedings / The Korean Society of Mycology 2018 Spring Conference and Extraordinary General Meeting / p. 22 / 2018
  • Fungal genome sequencing and assembly have become routine. Genome analysis relies on high-quality gene prediction and annotation. An automatic fungal genome annotation pipeline is essential for handling genomic sequence data that accumulate exponentially. However, building an automatic annotation procedure for fungal genomes is not an easy task. FunGAP (Fungal Genome Annotation Pipeline) was developed for precise and accurate prediction of gene models from any fungal genome assembly. To produce high-quality gene models, the pipeline employs multiple gene prediction programs encompassing ab initio, evidence-based, and homology-based evaluation. FunGAP evaluates all predicted genes and filters the gene models. To guide the removal of false-positive genes, we used a scoring function that seeks a consensus by scoring each gene model based on homology to known proteins or domains. FunGAP is freely available for non-commercial users on GitHub (https://github.com/CompSynBioLab-KoreaUniv/FunGAP).


Toward Complete Bacterial Genome Sequencing Through the Combined Use of Multiple Next-Generation Sequencing Platforms

  • Jeong, Haeyoung;Lee, Dae-Hee;Ryu, Choong-Min;Park, Seung-Hwan
    • Journal of Microbiology and Biotechnology / Vol. 26, No. 1 / pp. 207-212 / 2016
  • PacBio's long-read sequencing technologies can be successfully used for a complete bacterial genome assembly using recently developed non-hybrid assemblers in the absence of second-generation, high-quality short reads. However, standardized procedures that take into account multiple pre-existing second-generation sequencing platforms are scarce. In addition to Illumina HiSeq and Ion Torrent PGM-based genome sequencing results derived from previous studies, we generated further sequencing data, including from the PacBio RS II platform, and applied various bioinformatics tools to obtain complete genome assemblies for five bacterial strains. Our approach revealed that the hierarchical genome assembly process (HGAP) non-hybrid assembler resulted in nearly complete assemblies at a moderate coverage of ~75x, but that different versions produced non-compatible results requiring post processing. The other two platforms further improved the PacBio assembly through scaffolding and a final error correction.

High quality genome sequence of Treponema phagedenis KS1 isolated from bovine digital dermatitis

  • Espiritu, Hector M.;Mamuad, Lovelia L.;Jin, Su-jeong;Kim, Seon-ho;Lee, Sang-suk;Cho, Yong-il
    • Journal of Animal Science and Technology / Vol. 62, No. 6 / pp. 948-951 / 2020
  • Treponema phagedenis KS1, a fastidious anaerobe, was isolated from a dairy cow with bovine digital dermatitis (BDD) in Chungnam, Korea. Initial data indicated that T. phagedenis KS1 exhibited putative virulent phenotypic characteristics. This study reports the whole genome assembly and annotation of T. phagedenis KS1 (KCTC14157BP) to assist in the identification of putative pathogenicity-related factors. The whole genome of T. phagedenis KS1 was sequenced using the PacBio RSII and Illumina HiSeq X Ten platforms. The assembled T. phagedenis KS1 genome comprises 16 contigs with a total size of 3,769,422 bp and an overall guanine-cytosine (GC) content of 40.03%. Annotation revealed 3,460 protein-coding genes, as well as 49 transfer RNA- and 6 ribosomal RNA-coding genes. The results of this study provide insight into the pathogenicity of T. phagedenis KS1.
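
As a hedged illustration of the GC-content figure quoted in the abstract above, the following minimal Python sketch shows one common way to compute an overall GC percentage across a set of contigs. The function names `gc_content` and `assembly_gc` and the toy sequences are assumptions made for illustration; they are not taken from the study, and the authors' actual tooling is not stated in the abstract.

```python
def gc_content(seq: str) -> float:
    """Return the GC percentage of a nucleotide sequence.

    Only unambiguous bases (A, C, G, T) are counted, so gaps or N
    characters do not dilute the result. This convention is an assumption.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * gc / total


def assembly_gc(contigs) -> float:
    """Return the overall (length-weighted) GC percentage of several contigs."""
    # Concatenating the contigs weights each one by its length.
    return gc_content("".join(contigs))


if __name__ == "__main__":
    toy_contigs = ["ATGCGC", "ATATNN", "GGCC"]  # hypothetical sequences
    print(f"Overall GC content: {assembly_gc(toy_contigs):.2f}%")
```

Concatenating contigs before counting gives a length-weighted value, which is what an "overall" assembly GC content usually means; averaging per-contig percentages would overweight short contigs.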

Status of Philippine Mango Genomics: Enriching Molecular Genomics Towards a Globally Competitive Philippine Mango Industry

  • Eureka Teresa M. Ocampo;Cris Q. Cortaga;Jhun Laurence S. Rasco;John Albert P. Lachica;Darlon V. Lantican
    • Korean Society of Crop Science: Conference Proceedings / Korean Society of Crop Science 2022 Autumn Conference / p. 28 / 2022
  • This paper presents the first genome assemblies of Philippine mangoes, which provide a valuable reference for varietal improvement and genomic studies on mango and related fruit crops. We sequenced the whole genomes of three species: Mangifera odorata (Huani), Mangifera altissima (Paho), and Mangifera indica 'Carabao' (Sweet Elena). 'Carabao' is the major export variety of the Philippines; Paho is identified as vulnerable by the IUCN Red List of Threatened Species; Huani has acrid fruit sap, which is its primary defense mechanism against insects and birds. We used Falcon, a diploid-aware de novo assembler, to assemble SMRT-generated long-read sequences. Falcon-unzip was employed to phase the output assembly, producing larger contig sets (primary contigs) and shorter contigs corresponding to haplotypes (haplotigs). Assembly statistics were generated by comparing each assembly to a reference genome, 'Tommy Atkins', using the Quality Assessment Tool (QUAST). Moreover, the extent of duplication and the completeness of gene content were measured using Benchmarking Universal Single-Copy Orthologs (BUSCO). Draft assemblies with high duplication were processed using Purge Haplotigs and Purge Dups to reduce duplication with minimal impact on genome completeness. De novo assemblies of Huani, Paho and 'Carabao' were then generated with primary contig sizes of 463.64 Mb, 508.95 Mb and 401.51 Mb, respectively. These draft assemblies of Huani, Paho and 'Carabao' showed 96.90%, 95.17% and 99.07% complete BUSCOs, respectively, which are comparable to the 'Tommy Atkins' genome (98.6%). Using two mango transcriptome datasets (pooled RNA-seq from different mango varieties and tissues), 91-96% of reads, or 24-30 million reads, were successfully mapped back to each generated assembly, indicating a high degree of completeness. The results demonstrate highly contiguous, phased, and near-complete genome assemblies of three Philippine mango species for structural and functional annotation of gene units, especially those of economic importance.


Whole Genome Sequencing of Two Musa Species Towards Disease Resistance and Fiber Quality Improvement

  • John Ivan Pasquil;Richellen Plaza;Roneil Christian Alonday;Damsel Bangcal;Julianne Villela;Antonio, Lalusin;Maria Genaleen Diaz;Antonio Laurena
    • Korean Society of Crop Science: Conference Proceedings / Korean Society of Crop Science 2022 Autumn Conference / p. 32 / 2022
  • Abaca (Musa textilis L. Nee) is a Musa species native to the Philippines and known for its natural fiber. Abaca fiber, also known as Manila hemp, is extracted from the pseudostems and is considered one of the strongest fibers in the world. It is used for commodities such as ropes, paper, and banknotes. Abaca is vulnerable to pests and diseases such as Abaca Bunchy Top Disease (ABTD), caused by Abaca Bunchy Top Virus (ABTV) and Banana Bunchy Top Virus (BBTV). Inosa, one of the abaca varieties utilized in the Philippines, is highly susceptible to ABTD. In contrast, Pacol (Musa balbisiana L.), a close relative of abaca, is highly resistant to the same disease. Here, we report the sequencing and de novo genome assembly of both abaca var. Inosa and banana var. Pacol. A total of ~16 Gb and ~21 Gb of raw reads for Inosa and Pacol, respectively, were generated using the PacBio HiFi sequencing method and assembled with hifiasm. High-quality de novo assemblies of both Musa species, with 99% recovered as per BUSCO analysis, were obtained. The assembled Inosa genome has a total length of ~654 Mb and an N50 of 7 Mb, while Pacol has a total length of 527 Mb and an N50 of 3 Mb; these lengths are close to the estimated genome sizes of ~638 Mb and ~503 Mb, respectively. The information that can be derived from the de novo assembled genomes would provide a solid foundation for further research on disease resistance and fiber quality improvement in abaca.


Draft Genome Assembly and Annotation for Cutaneotrichosporon dermatis NICC30027, an Oleaginous Yeast Capable of Simultaneous Glucose and Xylose Assimilation

  • Wang, Laiyou;Guo, Shuxian;Zeng, Bo;Wang, Shanshan;Chen, Yan;Cheng, Shuang;Liu, Bingbing;Wang, Chunyan;Wang, Yu;Meng, Qingshan
    • Mycobiology / Vol. 50, No. 1 / pp. 66-78 / 2022
  • The identification of oleaginous yeast species capable of simultaneously utilizing xylose and glucose as substrates to generate value-added biological products is an area of key economic interest. We have previously demonstrated that the Cutaneotrichosporon dermatis NICC30027 yeast strain is capable of simultaneously assimilating both xylose and glucose, resulting in considerable lipid accumulation. However, as no high-quality genome sequencing data or associated annotations for this strain are available at present, it remains challenging to study the metabolic mechanisms underlying this phenotype. Herein, we report a 39,305,439 bp draft genome assembly for C. dermatis NICC30027 comprised of 37 scaffolds, with 60.15% GC content. Within this genome, we identified 524 tRNAs, 142 sRNAs, 53 miRNAs, 28 snRNAs, and eight rRNA clusters. Moreover, repeat sequences totaling 1,032,129 bp in length were identified (2.63% of the genome), as were 14,238 unigenes that were 1,789.35 bp in length on average (64.82% of the genome). The NCBI non-redundant protein sequences (NR) database was employed to successfully annotate 11,795 of these unigenes, while 3,621 and 11,902 were annotated with the Swiss-Prot and TrEMBL databases, respectively. Unigenes were additionally subjected to pathway enrichment analyses using the Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), Cluster of Orthologous Groups of proteins (COG), Clusters of orthologous groups for eukaryotic complete genomes (KOG), and Non-supervised Orthologous Groups (eggNOG) databases. Together, these results provide a foundation for future studies aimed at clarifying the mechanistic basis for the ability of C. dermatis NICC30027 to simultaneously utilize glucose and xylose to synthesize lipids.
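
The genome-fraction percentages quoted in the abstract above can be cross-checked with simple arithmetic. The short Python sketch below is only an illustrative consistency check on the reported numbers; the helper name `percent_of_genome` is an assumption, and estimating total unigene length as count times mean length is an approximation rather than the authors' stated method.

```python
def percent_of_genome(part_bp: float, genome_bp: float) -> float:
    """Return part_bp as a percentage of genome_bp."""
    if genome_bp <= 0:
        raise ValueError("genome size must be positive")
    return 100.0 * part_bp / genome_bp


# Figures as reported in the abstract.
GENOME_BP = 39_305_439        # draft assembly size
REPEAT_BP = 1_032_129         # total repeat sequence length
UNIGENE_COUNT = 14_238        # number of unigenes
MEAN_UNIGENE_BP = 1_789.35    # mean unigene length

repeat_pct = percent_of_genome(REPEAT_BP, GENOME_BP)
# Approximation: total unigene length = count * mean length.
unigene_pct = percent_of_genome(UNIGENE_COUNT * MEAN_UNIGENE_BP, GENOME_BP)

if __name__ == "__main__":
    print(f"Repeats:  {repeat_pct:.2f}% of the genome")
    print(f"Unigenes: {unigene_pct:.2f}% of the genome")
```

Both values round to the percentages given in the abstract (2.63% and 64.82%), so the reported counts and fractions are mutually consistent.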

The Philippines Coconut Genomics Initiatives: Updates and Opportunities for Capacity Building and Genomics Research Collaboration

  • Hayde Flandez-Galvez;Darlon V. Lantican;Anand Noel C. Manohar;Maria Luz J. Sison;Roanne R. Gardoce;Barbara L. Caoili;Alma O. Canama-Salinas;Melvin P. Dancel;Romnick A. Latina;Cris Q. Cortaga;Don Serville R. Reynoso;Michelle S. Guerrero;Susan M. Rivera;Ernesto E. Emmanuel;Cristeta Cueto;Consorcia E. Reano;Ramon L. Rivera;Don Emanuel M. Cardona;Edward Cedrick J. Fernandez ;Robert Patrick M. Cabangbang;Maria Salve C. Vasquez;Jomari C. Domingo;Reina Esther S. Caro;Alissa Carol M. Ibarra;Frenzee Kroeizha L. Pammit;Jen Daine L. Nocum;Angelica Kate G. Gumpal;Jesmar Cagayan;Ronilo M. Bajaro;Joseph P. Lagman;Cynthia R. Gulay;Noe Fernandez-Pozo;Susan R. Strickler;Lukas A. Mueller
    • Korean Society of Crop Science: Conference Proceedings / Korean Society of Crop Science 2022 Autumn Conference / p. 30 / 2022
  • The Philippines is the world's second-largest supplier of coconut by-products. As its first major genomics project, the Philippine Genome Center Program for Agriculture (PGC-Agriculture) took on the challenge of sequencing and assembling the whole coconut genome. The project aims to provide advanced genetic tools for our collaborating coconut researchers while taking the opportunity to build local capacity. A combination of different NGS platforms was explored, and the Philippine 'Catigan Green Dwarf' (CATD) variety was selected together with the breeders to be the crop's reference genome. A high-quality genome assembly of CATD was generated and used to characterize important coconut genes towards the development of resilient and outstanding varieties, especially for added high-value traits. The talk will present the significant results of the project as published in various papers, including the first report of the whole genome sequence of a dwarf coconut variety. Updates will include the challenges overcome and specific applications such as gene mining for host insect resistance and screening for the least damaged coconuts (thus potentially insect-resistant varieties). Published genome-wide DNA markers and genes related to qualitative and quantitative coconut oil traits will also be presented, including initial molecular and biochemical studies that support nutritional and medicinal claims. A web-based genome database is currently being built for easy access to and wider use of these genomics tools. This is a major milestone for the coconut genomics research team, facilitated by all-out government support, strong collaboration among multidisciplinary experts, and partnership with advanced research institutes.


Screening of Genetic Variations in Korean Native Duck using Next-Generation Resequencing Data

  • Eunjin Cho;Minjun Kim;Hyo Jun Choo;Jun Heon Lee
    • Korean Journal of Poultry Science / Vol. 50, No. 3 / pp. 187-191 / 2023
  • Korean native ducks (KNDs) remain highly preferred by consumers because of their excellent meat quality and taste. However, because of low productivity and a fixed plumage color phenotype, they have not secured a large share of the domestic market compared to imported breeds. To improve the market share of KNDs, the genetic characteristics of the breed should be identified and used for improvement and selection. Therefore, this study was conducted to identify the genetic information of colored and white KNDs using next-generation resequencing data and to screen for differences between the two groups. The genetic variants that showed significant differences between the colored and white KND groups were mainly identified as mutations related to tyrosine activity. The variants were located in genes that affect melanin synthesis and regulation, such as EGFR, PDGFRA, and DDR2, which have been reported as candidate genes related to plumage pigmentation in poultry. Therefore, the results of this study are expected to be useful as a basis for understanding and utilizing the genetic characteristics of KNDs for genetic improvement and selection of white broiler KNDs.

Blood transcriptome resources of chinstrap (Pygoscelis antarcticus) and gentoo (Pygoscelis papua) penguins from the South Shetland Islands, Antarctica

  • Kim, Bo-Mi;Jeong, Jihye;Jo, Euna;Ahn, Do-Hwan;Kim, Jeong-Hoon;Rhee, Jae-Sung;Park, Hyun
    • Genomics & Informatics / Vol. 17, No. 1 / pp. 5.1-5.9 / 2019
  • The chinstrap (Pygoscelis antarcticus) and gentoo (P. papua) penguins are distributed throughout Antarctica and the sub-Antarctic islands. In this study, high-quality de novo assemblies of blood transcriptomes from these penguins were generated using the Illumina MiSeq platform. A total of 22.2 and 21.8 raw reads were obtained from chinstrap and gentoo penguins, respectively. These reads were assembled using the Oases assembly platform and resulted in 26,036 and 21,854 contigs with N50 values of 929 and 933 base pairs, respectively. Functional gene annotations through pathway analyses of the Gene Ontology, EuKaryotic Orthologous Groups, and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases were performed for each blood transcriptome, resulting in a similar compositional order between the two transcriptomes. Ortholog comparisons with previously published transcriptomes from the Adélie (P. adeliae) and emperor (Aptenodytes forsteri) penguins revealed that a high proportion of the four penguins' transcriptomes had significant sequence homology. Because blood and tissues of penguins have been used to monitor pollution in Antarctica, immune parameters in blood could be important indicators for understanding the health status of penguins and other Antarctic animals. In the blood transcriptomes, KEGG analyses detected many essential genes involved in the major innate immunity pathways, which are key metabolic pathways for maintaining homeostasis against exogenous infections or toxins. Blood transcriptome studies such as this may be useful for checking the immune and health status of penguins without sacrifice.
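
Several abstracts in this listing report N50 values as a contiguity measure. As a hedged illustration, the following minimal Python sketch computes N50 under its usual definition: the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the toy lengths are assumptions for illustration and do not reproduce any of the listed studies' data or tools.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest; N50 is the length of the
    contig at which the cumulative length first reaches half of the total.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    toy_lengths = [2, 3, 4, 5, 6]  # hypothetical contig lengths
    print("N50:", n50(toy_lengths))
```

Because N50 depends only on the length distribution, it says nothing about correctness; this is why the abstracts pair it with completeness measures such as BUSCO.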