• Title/Summary/Keyword: taxonomic databases

An assessment of the taxonomic reliability of DNA barcode sequences in publicly available databases

  • Jin, Soyeong;Kim, Kwang Young;Kim, Min-Seok;Park, Chungoo
    • ALGAE / v.35 no.3 / pp.293-301 / 2020
  • DNA barcoding has a wide range of applications, such as taxonomic studies that help elucidate cryptic species and phylogenetic relationships, and the analysis of environmental samples for biodiversity monitoring and conservation assessments of species. After DNA barcode sequences are obtained, sequence similarity-based homology analysis is commonly used; that is, the obtained barcode sequences are compared against DNA barcode reference databases. This bioinformatic analysis implies that the overall quantity and quality of the reference databases must be stringently monitored so that they do not adversely affect the accuracy of species identification. With the development of next-generation sequencing techniques, a large number of DNA barcode sequences have been produced and stored in online databases, but their validity, accuracy, and reliability have not been extensively investigated. In this study, we investigated the amount and types of erroneous barcode sequences deposited in publicly accessible databases. Over 4.1 million sequences were investigated in three large-scale DNA barcode databases (NCBI GenBank, Barcode of Life Data System [BOLD], and Protist Ribosomal Reference database [PR2]) for four major DNA barcodes (cytochrome c oxidase subunit 1 [COI], internal transcribed spacer [ITS], ribulose bisphosphate carboxylase large chain [rbcL], and 18S ribosomal RNA [18S rRNA]); approximately 2% of the barcode sequences were found to be erroneous, and their taxonomic distributions were uneven. Our findings provide compelling evidence of data quality problems, along with insufficient and unreliable annotation of taxonomic data, in DNA barcode databases. Therefore, we suggest that if ambiguous taxa are presented during barcoding analysis, further validation with other DNA barcode loci or morphological characters should be mandated.

Evaluation of 16S rRNA Databases for Taxonomic Assignments Using a Mock Community

  • Park, Sang-Cheol;Won, Sungho
    • Genomics & Informatics / v.16 no.4 / pp.24.1-24.4 / 2018
  • Taxonomic identification is fundamental to all microbiology studies. It is even more important in metagenomics, which identifies the composition of microorganisms using thousands of sequences. Identification is inevitably affected by the choice of database. This study was conducted to evaluate the accuracy of three widely used 16S databases (Greengenes, Silva, and EzBioCloud) and to suggest basic guidelines for selecting reference databases. Using public mock community data, each database was used to assign taxonomy and to test its accuracy. We show that EzBioCloud performs well compared with other existing databases.
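
The kind of evaluation this abstract describes can be sketched roughly as follows: because the composition of a mock community is known in advance, the assignments produced with each reference database can be scored against it. This is only an illustrative sketch, not the authors' method; the function name, the database labels `db_a` and `db_b`, and the genus lists are all hypothetical.

```python
def assignment_accuracy(assigned, expected):
    """Return the fraction of reads whose assigned genus is in the known mock community."""
    if not assigned:
        return 0.0
    hits = sum(1 for genus in assigned if genus in expected)
    return hits / len(assigned)


# Hypothetical mock community: the genera known to be present.
expected_genera = {"Escherichia", "Bacillus", "Staphylococcus"}

# Hypothetical per-read genus assignments obtained with two reference databases.
assignments_by_db = {
    "db_a": ["Escherichia", "Bacillus", "Shigella", "Staphylococcus"],
    "db_b": ["Escherichia", "Bacillus", "Staphylococcus", "Staphylococcus"],
}

# One accuracy score per database, comparable across databases.
scores = {
    name: assignment_accuracy(reads, expected_genera)
    for name, reads in assignments_by_db.items()
}
```

A real comparison would also need to account for taxonomic rank, unassigned reads, and relative abundance, which this sketch ignores.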

Data-processing pipeline and database design for integrated analysis of mycoviruses

  • Je, Mikyung;Son, Hyeon Seok;Kim, Hayeon
    • International journal of advanced smart convergence / v.8 no.3 / pp.115-122 / 2019
  • Recent and ongoing discoveries of mycoviruses with new properties demand the development of an appropriate research infrastructure to analyze their evolution and classification. In particular, the discovery of negative-sense single-stranded mycoviruses is worth noting in genome types in which double-stranded RNA virus and positive-sense single-stranded RNA virus were predominant. In addition, some genomic properties of mycoviruses are more interesting because they have been reported to have similarities with the pathogenic virus family that infects humans and animals. Genetic information on mycoviruses continues to accumulate in public repositories; however, these databases have some difficulty reflecting the latest taxonomic information and obtaining specialized data for mycoviruses. Therefore, in this study, we developed a bioinformatics-based pipeline to efficiently utilize this genetic information. We also designed a schema for data processing and database construction and an algorithm to keep taxonomic information of mycoviruses up to date. The pipeline and database (termed 'mycoVDB') presented in this study are expected to serve as useful foundations for improving the accuracy and efficiency of future research on mycoviruses.

Current methodologies in construction of plant-pollinator network with emphasize on the application of DNA metabarcoding approach

  • Namin, Saeed Mohamadzade;Son, Minwoong;Jung, Chuleui
    • Journal of Ecology and Environment / v.46 no.2 / pp.126-135 / 2022
  • Background: Pollinators are important ecological elements because of their role in maintaining ecosystem health, wild plant reproduction, crop production, and food security. The pollinator-plant interaction supports the preservation of plant and animal populations and also improves yield in pollination-dependent crops. Knowledge of plant-pollinator interactions is necessary for developing pesticide risk assessments for pollinators and for conserving endangered species. Results: Traditional methods for discovering the relationships between insects and plants are based on tracing visiting pollinators through field observations, as well as on palynology. These methods are time-consuming and need expert taxonomists to identify different groups of pollinators, such as insects, or to identify flowering plants through palynology. As technology has advanced, molecular methods have become popular for identifying and classifying organisms. DNA metabarcoding, the combination of DNA barcoding and high-throughput sequencing, can be applied as an alternative method for identifying environmental samples of mixed origin, such as pollen loads attached to the bodies of insects, and has been used in DNA-based discovery of plant-pollinator relationships. Conclusions: DNA metabarcoding is practical for plant-pollinator studies; however, the lack of reference sequences in online databases, limited taxonomic resolution, and the limited universality of primers are the most crucial limitations. Because of the limitations of the universal primers developed so far, using multiple molecular markers is preferable, as it improves the taxa richness and taxonomic resolution of the studied community.

Database of National Species List of Korea: the taxonomical systematics platform for managing scientific names of Korean native species

  • Park, Jongsun;An, Jung-Hyun;Kim, Yongsung;Kim, Donghyun;Yang, Byeong-Gug;Kim, Taeho
    • Journal of Species Research / v.9 no.3 / pp.233-246 / 2020
  • A scientific name is a changeable term in biology: it may change whenever additional research results on a specific taxon accumulate. The Database of the National Species List of Korea (DBNKo) was developed to manage taxonomic information on Korean species and was designed to describe changeable and complex taxonomic structure and information. A Korean Taxonomical Serial Number (KTSN) was assigned to each taxon to manage higher-rank taxa systematically, unlike commonly used systems in which the scientific name is treated as the primary key. Common names were also assigned KTSNs, reflecting that a common name is considered one type of taxon. Additional taxonomic information (e.g., synonyms, original names, and references) was also added to the database. A web interface with an intuitive dashboard presenting the taxonomic hierarchical structure is provided to experts and/or managers of the DBNKo. Currently, several biological databases are available in the National Institute of Biological Resources (NIBR) such as a specimen database, a digital library, a genetic information system, and the shared species data based on the DBNKo. The DBNKo has started sharing species information with other institutions such as the Nakdonggang National Institute of Biological Resources. It is an ideal centralized species database for managing standardized information on Korean species.
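
The design choice described in this abstract, a stable serial number rather than the scientific name as the primary key, can be illustrated with a minimal sketch. It is not the DBNKo schema: the `Taxon` fields, the `rename` helper, the numbering, and the placeholder names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Taxon:
    ktsn: int                          # stable surrogate key (hypothetical numbering)
    accepted_name: str                 # current name, free to change over time
    parent_ktsn: Optional[int] = None  # link to the higher-rank taxon, by key
    synonyms: List[str] = field(default_factory=list)


def rename(taxa, ktsn, new_name):
    """Change the accepted name; the old name is kept as a synonym and the key is untouched."""
    taxon = taxa[ktsn]
    taxon.synonyms.append(taxon.accepted_name)
    taxon.accepted_name = new_name


# Two placeholder taxa: a genus and one species that points to it by key.
taxa = {
    1: Taxon(1, "Genus_a"),
    2: Taxon(2, "Genus_a species_x", parent_ktsn=1),
}

# Revising the genus name does not require touching the species record's parent link.
rename(taxa, 1, "Genus_b")
```

Because child records refer to their parent by key, a name revision is a single-record update. That is the general advantage of a surrogate key over a name-based key.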

Reconsideration about Nomenclature of Herbs Listed in the Korean Pharmacopoeia (대한민국약전에 수재된 식물성 한약재의 학명에 대한 재고)

  • Doh, Eui-Jeong;Lee, Guem-San
    • The Korea Journal of Herbology / v.28 no.3 / pp.61-68 / 2013
  • Objectives: A precise and simple system of nomenclature is required to avoid error, ambiguity, or confusion. Although medicinal plants must be produced and distributed according to the origin described in a pharmacopoeia, including the scientific name, the Korean Pharmacopoeia tenth edition (KP 10) contained many names that violate the rules of nomenclature. Therefore, this study aimed to find the correct scientific names for 241 plants in KP 10. Methods: Authoritative databases (The Plant List, International Plant Name Index, YList, Tropicos, eFloras, World Checklist of Selected Plant Families, The Global Compositae Checklist, The International Legume Database and Information Service, and others), previous studies, and floras were cross-checked. Results: The list was arranged into four cases: errors including illegitimate names, nomenclatural synonyms, recommended names, and decision-reserved names. Review of the scientific names produced nine correct names for ten misspelled or illegitimate names, and thirty-six correct names for forty-one nomenclatural synonyms. These results should be reflected in the edition following KP 10. Separately, ten recommended names were suggested for taxonomic synonyms that had been used indiscriminately owing to diverse taxonomic opinions. In addition, decision-reserved names were suggested for thirteen species whose status remained uncertain. Authorship also needs further study, because KP 10 did not follow the recommendations for author citations. Conclusions: Correcting the scientific names of medicinal plants that violate the International Code of Nomenclature would help improve the accuracy of the Pharmacopoeia as a reference standard.

A Revision of the Phylogeny of Helicotylenchus Steiner, 1945 (Tylenchida: Hoplolaimidae) as Inferred from Ribosomal and Mitochondrial DNA

  • Abraham Okki, Mwamula;Oh-Gyeong Kwon;Chanki Kwon;Yi Seul Kim;Young Ho Kim;Dong Woon Lee
    • The Plant Pathology Journal / v.40 no.2 / pp.171-191 / 2024
  • Identification of Helicotylenchus species is very challenging due to phenotypic plasticity and existence of cryptic species complexes. Recently, the use of rDNA barcodes has proven to be useful for identification of Helicotylenchus. Molecular markers are a quick diagnostic tool and are crucial for discriminating related species and resolving cryptic species complexes within this speciose genus. However, DNA barcoding is not an error-free approach. The public databases appear to be marred by incorrect sequences, arising from sequencing errors, mislabeling, and misidentifications. Herein, we provide a comprehensive analysis of the newly obtained, and published DNA sequences of Helicotylenchus, revealing the potential faults in the available DNA barcodes. A total of 97 sequences (25 nearly full-length 18S-rRNA, 12 partial 28S-rRNA, 16 partial internal transcribed spacer [ITS]-rRNA, and 44 partial cytochrome c oxidase subunit I [COI] gene sequences) were newly obtained in the present study. Phylogenetic relationships between species are given as inferred from the analyses of 103 sequences of 18S-rRNA, 469 sequences of 28S-rRNA, 183 sequences of ITS-rRNA, and 63 sequences of COI. Remarks on suggested corrections of published accessions in GenBank database are given. Additionally, COI gene sequences of H. dihystera, H. asiaticus and the contentious H. microlobus are provided herein for the first time. Similar to rDNA gene analyses, the COI sequences support the genetic distinctness and validity of H. microlobus. DNA barcodes from type material are needed for resolving the taxonomic status of the unresolved taxonomic groups within the genus.

Application of Molecular Biology to Rumen Microbes -Review-

  • Kobayashi, Y.;Onodera, R.
    • Asian-Australasian Journal of Animal Sciences / v.12 no.1 / pp.77-83 / 1999
  • Recently developed molecular biological techniques have made new approaches possible in the field of rumen microbiology. These are 1) cloning of genes from rumen microorganisms, mainly in E. coli, 2) transformation of rumen bacteria, and 3) ecological analysis with nonculturing methods. Most of the cloned genes encode polysaccharidase enzymes such as endoglucanase, xylanase, amylase, chitinase, and others, and cloning has enabled structural analysis of the genes by sequencing as well as characterization of the translated products through easier purification. Electrotransformation of Butyrivibrio fibrisolvens and Prevotella ruminicola has been carried out with the aim of obtaining more fibrolytic, acid-tolerant, detoxifying, or essential amino acid-producing rumen bacteria. This work first required stable and efficient gene transfer systems. Some vectors, constructed from native plasmids of rumen bacteria, are now available for successful gene introduction and expression in those rumen bacterial species. Probing and PCR-based methodologies have also been developed for detecting specific bacterial species and even strains. These developments owe much to the accumulation of rRNA gene sequences of rumen microbes in databases. Although optimized analytical conditions are essential for reliable and reproducible estimation of the targeted microbes, the methods permit long-term storage of frozen samples, making analytical work easier than with traditional culture-based methods. Moreover, the methods seem promising for obtaining taxonomic and evolutionary information on all rumen microbes, whether they are culturable or not.

Computational Approaches for Structural and Functional Genomics

  • Brenner, Steven-E.
    • Proceedings of the Korean Society for Bioinformatics Conference / 2000.11a / pp.17-20 / 2000
  • Structural genomics aims to provide a good experimental structure or computational model of every tractable protein in a complete genome. Underlying this goal is the immense value of protein structure, especially in permitting recognition of distant evolutionary relationships for proteins whose sequence analysis has failed to find any significant homolog. A considerable fraction of the genes in all sequenced genomes have no known function, and structure determination provides a direct means of revealing homology that may be used to infer their putative molecular function. The solved structures will be similarly useful for elucidating the biochemical or biophysical role of proteins that have been previously ascribed only phenotypic functions. More generally, knowledge of an increasingly complete repertoire of protein structures will aid structure prediction methods, improve understanding of protein structure, and ultimately lend insight into molecular interactions and pathways. We use computational methods to select families whose structures cannot be predicted and which are likely to be amenable to experimental characterization. Methods to be employed included modern sequence analysis and clustering algorithms. A critical component is consultation of the presage database for structural genomics, which records the community's experimental work underway and computational predictions. The protein families are ranked according to several criteria including taxonomic diversity and known functional information. Individual proteins, often homologs from hyperthermophiles, are selected from these families as targets for structure determination. The solved structures are examined for structural similarity to other proteins of known structure. Homologous proteins in sequence databases are computationally modeled, to provide a resource of protein structure models complementing the experimentally solved protein structures.

Korea Barcode of Life Database System (KBOL)

  • Kim, Sung-Min;Kim, Chang-Bae;Min, Gi-Sik;Suh, Young-Bae;Bhak, Jong;Woo, Tae-Ha;Koo, Hye-Young;Choi, Jun-Kil;Shin, Mann-Kyoon;Jung, Jong-Woo;Song, Kyo-Hong;Ree, Han-Il;Hwang, Ui-Wook;Park, Yung-Chul;Eo, Hae-Seok;Kim, Joo-Pil;Yoon, Seong-Myeong;Rho, Hyun-Soo;Kim, Sa-Heung;Lee, Hang;Min, Mi-Sook
    • Animal cells and systems / v.16 no.1 / pp.11-19 / 2012
  • A major concern regarding the collection and storage of biodiversity information is the inefficiency of conventional taxonomic approaches in dealing with a large number of species. This inefficiency has increased the demand for automated, rapid, and reliable molecular identification systems and large-scale biological databases. DNA-based taxonomic approaches are now arguably a necessity in biodiversity studies. In particular, DNA barcoding using short DNA sequences provides an effective molecular tool for species identification. We constructed a large-scale database system that holds a collection of 5531 barcode sequences from 2429 Korean species. The Korea Barcode of Life database (KBOL, http://koreabarcode.org) is a web-based database system that is used for compiling a high volume of DNA barcode data and identifying unknown biological specimens. With the KBOL system, users can not only link DNA barcodes and biological information but can also undertake conservation activities, including environmental management, monitoring, and detecting significant organisms.