• Title/Summary/Keyword: genome annotation

An Analysis of Ortholog Clusters Detected from Multiple Genomes (다종의 유전체로부터 탐지된 Ortholog 군집에 대한 분석)

  • Kim, Sun-Shin;Oh, Jeong-Su;Lee, Bum-Ju;Kim, Tae-Kyung;Jung, Kwang-Su;Rhee, Chung-Sei;Kim, Young-Chang;Cho, Wan-Sup;Ryu, Keun-Ho
    • Journal of KIISE:Databases / v.35 no.2 / pp.125-131 / 2008
  • Predicting orthologs is very useful for annotating new genomes and for research on genome evolution. We show that previous work can be extended to construct ortholog clusters (OCs) automatically from multiple complete genomes. The proposed method retains the output quality of InParanoid, which produces orthologs from only two genomes. To predict the function of a newly sequenced gene more precisely, however, it is important to prevent the unwanted inclusion of paralogs in the OCs. We therefore investigated how well functionally purer OCs can be constructed using score cut-offs. Our OCs were generated from the datasets of 20 prokaryotes. Their similarity to both COG (Clusters of Orthologous Groups) and KO (KEGG Orthology) is about 90% and tends to increase as the score cut-off rises.

Next-generation gene targeting in the mouse for functional genomics

  • Gondo, Yoichi;Fukumura, Ryutaro;Murata, Takuya;Makino, Shigeru
    • BMB Reports / v.42 no.6 / pp.315-323 / 2009
  • To elucidate the ultimate biological function of the genome, model animal systems carrying mutations are indispensable. Recently, large-scale mutagenesis projects have been launched in various species. The mouse in particular is considered an ideal model for humans because it is a mammalian species with well-established genetic and embryonic technologies. In the 1990s, large-scale mouse mutagenesis projects were first initiated with a potent chemical mutagen, N-ethyl-N-nitrosourea (ENU), using the phenotype-driven approach, or forward genetics. Knockout mouse mutagenesis projects with trapping/conditional mutagenesis followed as Phase II from 2006, using the gene-driven approach, or reverse genetics. Recently, the next-generation gene targeting system has also become available to the research community; as another form of reverse genetics, it allows us to establish and analyze mutant mice carrying an allelic series of base substitutions in target genes. This article reviews overall trends in large-scale mouse mutagenesis, focusing particularly on the new advancement of the next-generation gene targeting system. The drastic expansion of mutant mouse resources will enhance the systematic understanding of life. The construction of mutant mouse resources by forward and reverse genetic mutagenesis is just the beginning of the annotation of the mammalian genome. These resources provide basic infrastructure for understanding the molecular mechanisms of genes and genomes and will contribute not only to basic research but also to applied sciences such as human disease modelling, genomic medicine and personalized medicine.

Genome-wide Response of Normal WI-38 Human Fibroblast Cells to 1,763 MHz Radiofrequency Radiation

  • Im, Chang-Nim;Kim, Eun-Hye;Park, Ae-Kyung;Park, Woong-Yang
    • Genomics & Informatics / v.8 no.1 / pp.28-33 / 2010
  • Increased human exposure to RF fields has raised concerns about potential adverse effects on health. To address the biological effects of RF radiation, we used genome-wide gene expression as the indicator. We exposed normal WI-38 human fibroblast cells to 1,763 MHz mobile phone RF radiation at a specific absorption rate (SAR) of 60 W/kg with an operating cooling system for 24 h. There were no alterations in cell number or morphology after RF exposure. Through microarray analysis, we identified no differentially expressed genes (DEGs) at the 0.05 significance level after controlling for multiple testing errors with the Benjamini-Hochberg false discovery rate (BH FDR) method. Meanwhile, 82 genes were differentially expressed between RF-exposed cells and controls when the significance level was set at 0.01 without correction for multiple comparisons. We found that 24 genes (0.08% of the total genes examined) changed by more than 1.5-fold on RF exposure. However, the functional annotation analysis showed no significant enrichment of any gene set or pathway. From these results, we found no evidence that non-thermal RF radiation at a 60-W/kg SAR significantly affects cell proliferation or gene expression in WI-38 cells.
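
The contrast in this abstract between uncorrected and FDR-corrected results can be illustrated with a minimal sketch of the Benjamini-Hochberg step-up procedure. This is not the authors' analysis code; the function name `benjamini_hochberg` and the p-values below are hypothetical and chosen only to show how genes that pass an uncorrected threshold can fail after correction.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where a hypothesis is rejected at FDR level alpha."""
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for ten genes (illustration only, not data from the study).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

rejected = benjamini_hochberg(p_values, alpha=0.05)
uncorrected = [p < 0.05 for p in p_values]

print("significant without correction:", sum(uncorrected))
print("significant after BH FDR:", sum(rejected))
```

With these made-up values, fewer genes remain significant after correction than before, which is the same qualitative pattern the abstract reports at genome scale.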

In silico approaches to identify the functional and structural effects of non-synonymous SNPs in selective sweeps of the Berkshire pig genome

  • Shin, Donghyun;Oh, Jae-Don;Won, Kyeong-Hye;Song, Ki-Duk
    • Asian-Australasian Journal of Animal Sciences / v.31 no.8 / pp.1150-1159 / 2018
  • Objective: Non-synonymous single nucleotide polymorphisms (nsSNPs) were identified in Berkshire selective sweep regions and then investigated to discover genetic mechanisms potentially associated with Berkshire domestication and meat quality. We further used bioinformatics tools to predict damaging amino-acid substitutions in Berkshire-related nsSNPs. Methods: nsSNPs were examined in whole-genome resequencing data of 110 pigs, including 14 Berkshire pigs, generated using the Illumina HiSeq 2000 platform, to identify variations that might affect meat quality in Berkshire pigs. Results: A total of 65,550 nsSNPs were identified in the mapped regions; among these, 319 were found in Berkshire selective-sweep regions reported in a previous study. Genes containing these nsSNPs were involved in lipid metabolism, intramuscular fatty-acid deposition, and muscle development. The effects of the amino acid changes caused by the nsSNPs on protein function were predicted using Sorting Intolerant From Tolerant (SIFT) and Polymorphism Phenotyping v2 (PolyPhen-2) to reveal their potential roles in biological processes that may correlate with the unique Berkshire meat-quality traits. Conclusion: Our nsSNP findings confirmed the history of Berkshire pigs and illustrated the effects of domestication on genetic-variation patterns. Our novel findings, which are generally consistent with those of previous studies, facilitate a better understanding of Berkshire domestication. In summary, we extensively investigated the relationship between genomic composition and phenotypic traits by scanning for nsSNPs in large-scale whole-genome sequencing data.

Genome analysis of Yucatan miniature pigs to assess their potential as biomedical model animals

  • Kwon, Dae-Jin;Lee, Yeong-Sup;Shin, Donghyun;Won, Kyeong-Hye;Song, Ki-Duk
    • Asian-Australasian Journal of Animal Sciences / v.32 no.2 / pp.290-296 / 2019
  • Objective: Pigs share many physiological, anatomical and genomic similarities with humans, which makes them suitable models for biomedical research. Understanding the genetic status of Yucatan miniature pigs (YMPs) and its association with human diseases will help to assess their potential as biomedical model animals. This study was performed to identify non-synonymous single nucleotide polymorphisms (nsSNPs) in selective sweep regions of the YMP genome and to present the nsSNP distributions that are potentially associated with disease occurrence in humans. Methods: nsSNPs in whole-genome resequencing data from 12 YMPs were identified and annotated to predict their possible effects on protein function. Sorting Intolerant From Tolerant (SIFT) and Polymorphism Phenotyping v2 analyses were used, and gene ontology (GO) network and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were performed. Results: A total of 72,067 nsSNPs were identified in 8,462 genes, and 118 nsSNPs in 46 genes were predicted to be deleterious. GO network analysis classified 13 genes into 5 GO terms (p<0.05) that were associated with kidney development and metabolic processes. Seven genes containing nsSNPs were classified into the term associated with Alzheimer's disease by referencing the Genetic Association Database. The KEGG pathway analysis identified only one significantly enriched pathway (p<0.05), hsa04080: Neuroactive ligand-receptor interaction, among the transcripts. Conclusion: Deleterious nsSNPs in YMPs were identified, and the genes containing these variants were treated as putative human disease-related genes. The results revealed that many genes containing nsSNPs in YMPs are related to human genes that are potentially associated with kidney development and metabolic processes as well as with human disease occurrence.

Characterization of a Potential Probiotic Lactiplantibacillus plantarum LRCC5310 by Comparative Genomic Analysis and its Vitamin B6 Production Ability

  • Yunjeong Lee;Nattira Jaikwang;Seong keun Kim;Jiseon Jeong;Ampaitip Sukhoom;Jong-Hwa Kim;Wonyong Kim
    • Journal of Microbiology and Biotechnology / v.33 no.5 / pp.644-655 / 2023
  • Safety assessment and functional analysis of probiotic candidates are important for their industrial applications. Lactiplantibacillus plantarum is one of the most widely recognized probiotic species. In this study, we aimed to determine the functional genes of L. plantarum LRCC5310, isolated from kimchi, using next-generation whole-genome sequencing analysis. Genes were annotated using the Rapid Annotations using Subsystems Technology (RAST) server and the National Center for Biotechnology Information (NCBI) pipelines to establish the strain's probiotic potential. Phylogenetic analysis of L. plantarum LRCC5310 and related strains showed that LRCC5310 belongs to L. plantarum. However, comparative analysis revealed genetic differences between L. plantarum strains. Carbon metabolic pathway analysis based on the Kyoto Encyclopedia of Genes and Genomes database showed that L. plantarum LRCC5310 is a homofermentative bacterium. Furthermore, gene annotation results indicated that the L. plantarum LRCC5310 genome encodes an almost complete vitamin B6 biosynthetic pathway. Among five L. plantarum strains, including L. plantarum ATCC 14917T, L. plantarum LRCC5310 showed the highest concentration of pyridoxal 5'-phosphate, 88.08 ± 0.67 nM, in MRS broth. These results indicate that L. plantarum LRCC5310 could be used as a functional probiotic for vitamin B6 supplementation.

The Brassica rapa Tissue-specific EST Database (배추의 조직 특이적 발현유전자 데이터베이스)

  • Yu, Hee-Ju;Park, Sin-Gi;Oh, Mi-Jin;Hwang, Hyun-Ju;Kim, Nam-Shin;Chung, Hee;Sohn, Seong-Han;Park, Beom-Seok;Mun, Jeong-Hwan
    • Horticultural Science & Technology / v.29 no.6 / pp.633-640 / 2011
  • Brassica rapa is an A genome model species for Brassica crop genetics, genomics, and breeding. With the sequencing of the B. rapa genome complete, functional analysis of the genome is the next challenge. Expressed sequence tags (ESTs) are fundamental resources supporting annotation and functional analysis of the genome, including the identification of tissue-specific genes and promoters. As of July 2011, 147,217 ESTs from 39 cDNA libraries of B. rapa had been reported in the public database. However, little information can be retrieved from the sequences because organized databases are lacking. To leverage the sequence information and to maximize the use of publicly available EST collections, the Brassica rapa tissue-specific EST database (BrTED) was developed. BrTED includes sequence information for 23,962 unigenes assembled by the StackPack program. The unigene set is used as the query unit for various analyses such as BLAST against the TAIR gene models, functional annotation using MIPS and UniProt, gene ontology analysis, and prediction of tissue-specific unigene sets based on statistical tests. The database is composed of two main units: an EST sequence processing and information retrieval unit, and a tissue-specific expression profile analysis unit. Information and data in both units are tightly interconnected through a web-based browsing system. In an RT-PCR evaluation of 29 selected unigene sets, amplicons were successfully amplified from the target tissues of B. rapa. BrTED allows the user to identify and analyze the expression of genes of interest and aids efforts to interpret the B. rapa genome through functional genomics. In addition, it can be used as a public resource providing reference information for studying the genus Brassica and other closely related crucifer crops.

The Complete Chloroplast Genome Sequence and Intra-Species Diversity of Rhus chinensis

  • Kim, Inseo;Park, Jee Young;Lee, Yun Sun;Joh, Ho Jun;Kang, Shin Jae;Murukarthick, Jayakodi;Lee, Hyun Oh;Hur, Young-Jin;Kim, Yong;Kim, Kyung Hoon;Lee, Sang-Choon;Yang, Tae-Jin
    • Plant Breeding and Biotechnology / v.5 no.3 / pp.243-251 / 2017
  • Rhus chinensis is a shrub widely distributed in Asia. It has been used for traditional medicine and ecological restoration. Here, we report the complete chloroplast genome sequences of two R. chinensis genotypes collected from China and Korea. The assembled chloroplast genome of Chinese R. chinensis is 149,094 bp long, consisting of a large single copy (97,246 bp), a small single copy (18,644 bp) and a pair of inverted repeats (16,602 bp each). Gene annotation revealed 77 protein-coding genes, 30 tRNA genes, and 4 rRNA genes. A phylogenomic analysis with 11 known complete chloroplast genomes clarified the relationship of R. chinensis to other plant species in the order Sapindales. A comparative chloroplast genome analysis identified 170 SNPs and 85 InDels at the intra-species level between the Chinese and Korean collections of R. chinensis. Based on the sequence diversity between Korean and Chinese R. chinensis plants, we developed three DNA markers useful for assessing genetic diversity and for authentication. The chloroplast genome information obtained in this study will contribute to enriching genetic resources and to the conservation of endemic Rhus species.
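
The region lengths quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 16,602 bp figure is the length of each of the two inverted-repeat copies, which is the reading that reproduces the reported total; the variable names are illustrative.

```python
# Region lengths in base pairs, as quoted in the abstract.
large_single_copy = 97_246
small_single_copy = 18_644
inverted_repeat = 16_602  # assumed to be the length of ONE of the two repeat copies

# A quadripartite chloroplast genome is LSC + SSC + two inverted-repeat copies.
total_length = large_single_copy + small_single_copy + 2 * inverted_repeat
print(total_length)

# Share of the genome occupied by each region, as a percentage.
for name, length in [
    ("LSC", large_single_copy),
    ("SSC", small_single_copy),
    ("IR (both copies)", 2 * inverted_repeat),
]:
    print(f"{name}: {100 * length / total_length:.1f}%")
```

Under that assumption the regions sum to exactly the 149,094 bp reported for the Chinese genotype.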

Korea Brassica Genome Project: Current Status and Prospective (배추 유전체연구의 현황과 전망)

  • Choi, Su-Ryun;Park, Jee-Yong;Park, Beom-Seok;Kim, Ho-Il;Lim, Yong-Pyo
    • Journal of Plant Biotechnology / v.33 no.3 / pp.153-160 / 2006
  • Brassica rapa is an important species used worldwide as a vegetable, oil, and fodder crop. It is phylogenetically related to Arabidopsis thaliana, a model plant that has already been fully sequenced. The Multinational Brassica Genome Project (MBGP) was launched in 2003 by the international Brassica community with the aim of sequencing the whole genome of B. rapa, on account of its value and the fact that it has the smallest genome among the diploid Brassica. The genome study was carried out not only to learn the structure of the genome but also to understand the function and evolution of the genes comprehensively. Suitable genomic materials for examining the genome of B. rapa ssp. pekinensis (Chinese cabbage) include two mapping populations, over 1,000 molecular markers and a genetic map, two BAC libraries, a physical map, and 22 cDNA libraries. As the first step of whole-genome analysis, 220,000 BAC-end sequences from the KBrH and KBrB BAC libraries were obtained through the cooperation of six countries. Analysis of the BAC-end sequences will provide clues to the structure of the Brassica rapa genome through the study of gene sequences, annotation and abundant repetitive DNA. The second stage involves sequencing the genetically mapped seed BACs and identifying the overlapping BACs for complete genome sequencing. Currently, the second stage comprises genetic anchoring using communal populations and maps to identify more than 1,000 seed BACs based on a BAC-to-BAC strategy. For the initial sequencing, 629 seed BACs corresponding to the minimum tiling path on the Arabidopsis genome were selected and fully sequenced. These BACs are now being anchored to the genetic map through the development of SSR markers. This information will be useful for identifying BAC clones adjacent to the seed BACs on a genome map. The BAC sequences reveal that the Brassica rapa genome has extensive triplication of DNA segments coupled with variable gene losses and rearrangements within the segments. This article introduces the current status and prospects of the Korea Brassica Genome Project and the bioinformatics tools held by each national team. In the near future, the genome data will contribute to improving Brassica crops for economic use as well as to understanding their evolutionary process.

Improving accessibility and distinction between negative results in biomedical relation extraction

  • Sousa, Diana;Lamurias, Andre;Couto, Francisco M.
    • Genomics & Informatics / v.18 no.2 / pp.20.1-20.4 / 2020
  • Accessible negative results are relevant for researchers and clinicians, not only to limit their search space but also to prevent the costly re-exploration of research hypotheses. However, most biomedical relation extraction datasets do not seek to distinguish between a false and a negative relation between two biomedical entities. Furthermore, datasets created using distant supervision techniques also contain some false negative relations that constitute undocumented/unknown relations (missing from a knowledge base). We propose to improve the distinction between these concepts by revising a subset of the relations marked as false in the phenotype-gene relations corpus and by taking the first steps toward automatically distinguishing between false (F), negative (N), and unknown (U) results. Our work resulted in a sample of 127 manually annotated FNU relations and a weighted-F1 of 0.5609 for their automatic distinction. This work was developed during the 6th Biomedical Linked Annotation Hackathon (BLAH6).
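
For readers unfamiliar with the weighted-F1 metric quoted above, the following is a minimal sketch of one common definition: the per-class F1 scores averaged with weights proportional to each class's number of true instances. The helper `weighted_f1` and the tiny F/N/U label lists are hypothetical illustrations, not the authors' evaluation code or data.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label in sorted(support):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Weight each class's F1 by its share of the true labels.
        score += support[label] / total * f1
    return score


# Hypothetical gold and predicted labels for six relations (illustration only).
gold = ["F", "F", "F", "N", "N", "U"]
pred = ["F", "F", "N", "N", "U", "U"]

score = weighted_f1(gold, pred)
print(f"weighted F1 = {score:.4f}")
```

Because the weights follow class frequency, a frequent class such as F influences the score more than a rare class such as U, which matters when the three FNU classes are imbalanced.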