• Title/Abstract/Keywords: genome annotation

Search results: 182 items (processing time: 0.021 seconds)

A genomic and bioinformatic-based approach to identify genetic variants for liver cancer across multiple continents

  • Muhammad Ma'ruf;Lalu Muhammad Irham;Wirawan Adikusuma;Made Ary Sarasmita;Sabiah Khairi;Barkah Djaka Purwanto;Rockie Chong;Maulida Mazaya;Lalu Muhammad Harmain Siswanto
    • Genomics & Informatics
    • /
    • Vol. 21, No. 4
    • /
    • pp.48.1-48.8
    • /
    • 2023
  • Liver cancer is the fourth leading cause of death worldwide. Well-known risk factors include hepatitis B virus and hepatitis C virus, along with exposure to aflatoxins, excessive alcohol consumption, obesity, and type 2 diabetes. Genomic variants play a crucial role in mediating the associations between these risk factors and liver cancer. However, the specific variants involved in this process remain under-explored. This study utilized a bioinformatics approach to identify genetic variants associated with liver cancer from various continents. Single-nucleotide polymorphisms associated with liver cancer were retrieved from the genome-wide association studies catalog. Prioritization was then performed using functional annotation with HaploReg v4.1 and the Ensembl database. The prevalence and allele frequencies of each variant were evaluated using Pearson correlation coefficients. Two variants, rs2294915 and rs2896019, encoded by the PNPLA3 gene, were found to be highly expressed in the liver tissue, as well as in the skin, cell-cultured fibroblasts, and adipose-subcutaneous tissue, all of which contribute to the risk of liver cancer. We further found that these two SNPs (rs2294915 and rs2896019) were positively correlated with the prevalence rate. Positive associations with the prevalence rate were more frequent in East Asian and African populations. We highlight the utility of this population-specific PNPLA3 genetic variant for genetic association studies and for the early prognosis and treatment of liver cancer. This study highlights the potential of integrating genomic databases with bioinformatic analysis to identify genetic variations involved in the pathogenesis of liver cancer. The genetic variants investigated in this study are likely to predispose to liver cancer and could affect its progression and aggressiveness. We recommend future research prioritizing the validation of these variations in clinical settings.
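The correlation step mentioned in the abstract above (relating allele frequency to prevalence) can be illustrated with a minimal sketch of the Pearson coefficient. The function name `pearson_r` and all numbers below are hypothetical placeholders chosen for illustration; they are not data or code from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances (unnormalized; the 1/n factors cancel out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values only (assumption): one entry per population.
allele_freq = [0.10, 0.20, 0.30, 0.40]
prevalence = [1.0, 2.0, 3.0, 4.0]

# The toy data are perfectly linear, so r equals 1.0 up to rounding.
r = pearson_r(allele_freq, prevalence)
print(f"r = {r:.3f}")
```

With real data, a coefficient computed from only a handful of populations should be read cautiously, since a few points can produce a large r by chance.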

Identification of genomic regions and genes associated with subclinical ketosis in periparturient dairy cows

  • Jihwan Lee;KwangHyeon Cho;Kent A. Weigel;Heather M. White;ChangHee Do;Inchul Choi
    • Journal of Animal Science and Technology
    • /
    • Vol. 66, No. 3
    • /
    • pp.567-576
    • /
    • 2024
  • Subclinical ketosis (SCK) is a prevalent metabolic disorder that occurs during the transition to lactation period. It is defined as a high blood concentration of ketone bodies (beta-hydroxybutyric acid ≥ 1.2 mmol/L) within the first few weeks of lactation, and often presents without clinical signs. SCK is mainly caused by negative energy balance (NEB). The objective of this study is to identify single nucleotide polymorphisms (SNPs) associated with SCK using genome-wide association studies (GWAS), and to predict the biological functions of proximal genes using gene-set enrichment analysis (GSEA). Blood samples were collected from 112 Holstein cows between 5 and 18 days postpartum to determine the incidence of SCK. Genomic DNA extracted from both SCK and healthy cows was examined using the Illumina Bovine SNP50K BeadChip for genotyping. GWAS revealed 194 putative SNPs and 163 genes associated with those SNPs. Additionally, GSEA showed that the genes retrieved by Database for Annotation, Visualization, and Integrated Discovery (DAVID) belonged to calcium signaling, starch and sucrose, immune network, and metabolic pathways. Furthermore, the proximal genes were found to be related to germ cell and early embryo development. In summary, this study proposes several feasible SNPs and genes associated with SCK through GWAS and GSEA. These candidates can be utilized in selective breeding programs to reduce the genetic risk for SCK and subfertility in high-performance dairy cows.

Choosing preferable labels for the Japanese translation of the Human Phenotype Ontology

  • Ninomiya, Kota;Takatsuki, Terue;Kushida, Tatsuya;Yamamoto, Yasunori;Ogishima, Soichi
    • Genomics & Informatics
    • /
    • Vol. 18, No. 2
    • /
    • pp.23.1-23.6
    • /
    • 2020
  • The Human Phenotype Ontology (HPO) is the de facto standard ontology to describe human phenotypes in detail, and it is actively used, particularly in the field of rare disease diagnoses. For clinicians who are not fluent in English, the HPO has been translated into many languages, and there have been four initiatives to develop Japanese translations. At the Biomedical Linked Annotation Hackathon 6 (BLAH6), a rule-based approach was attempted to determine the preferable Japanese translation for each HPO term among the candidates developed by the four approaches. The relationship between the HPO and Mammalian Phenotype translations was also investigated, with the eventual goal of harmonizing the two translations to facilitate phenotype-based comparisons of species in Japanese through cross-species phenotype matching. In order to deal with the increase in the number of HPO terms and the need for manual curation, it would be useful to have a dictionary containing word-by-word correspondences and fixed translation phrases for English word order. These considerations seem applicable to HPO localization into other languages.

A proof-of-concept study of extracting patient histories for rare/intractable diseases from social media

  • Yamaguchi, Atsuko;Queralt-Rosinach, Nuria
    • Genomics & Informatics
    • /
    • Vol. 18, No. 2
    • /
    • pp.17.1-17.4
    • /
    • 2020
  • The amount of content on social media platforms such as Twitter is expanding rapidly. Simultaneously, the lack of patient information seriously hinders the diagnosis and treatment of rare/intractable diseases. However, these patient communities are especially active on social media. Data from social media could serve as a source of patient-centric knowledge for these diseases complementary to the information collected in clinical settings and patient registries, and may also have potential for research use. To explore this question, we attempted to extract patient-centric knowledge from social media as a task for the 3-day Biomedical Linked Annotation Hackathon 6 (BLAH6). We selected amyotrophic lateral sclerosis and multiple sclerosis as use cases of rare and intractable diseases, respectively, and we extracted patient histories related to these health conditions from Twitter. Four diagnosed patients for each disease were selected. From the user timelines of these eight patients, we extracted tweets that might be related to health conditions. Based on our experiment, we show that our approach has considerable potential, although we identified problems that should be addressed in future attempts to mine information about rare/intractable diseases from Twitter.

An Orthologous Group Clustering Technique based on the Grid Computing

  • Oh, J.S.;Kim, T.K.;Kim, S.S.;Kwon, H.R.;Kim, Y.C.;Yoo, J.S.;Cho, W.S.
    • Korean Society for Bioinformatics: Conference Proceedings
    • /
    • Korean Society for Bioinformatics and Systems Biology, 2005, BIOINFO 2005
    • /
    • pp.72-77
    • /
    • 2005
  • Orthologs are genes in different species that diverged from a single gene in the last common ancestor of those species and typically retain the same function. Orthologous groups are useful for genome annotation, studies of gene evolution, and comparative genomics. However, constructing orthologous groups is difficult to automate and time-consuming, and it is hard to guarantee the accuracy of the resulting groups. We propose a system that constructs orthologous groups across many genomes automatically and rapidly. We use grid computing to reduce the sequence alignment time, and we use a clustering algorithm in a database application to automate the whole process. Thanks to grid computing, we generated orthologous groups for 20 complete prokaryotic genomes in just one day. Furthermore, new genomes can be accommodated easily by the clustering algorithm and grid computing. We compared the generated orthologous groups with COGs (Clusters of Orthologous Groups of proteins) and KO (KEGG Orthology). The comparison shows about 85 percent similarity with these well-known orthologous databases.

The Analysis and Application of a Recombinant Monooxygenase Library as a Biocatalyst for the Baeyer-Villiger Reaction

  • Park, Ji-Yeoun;Kim, Dong-Hyun;Kim, Su-Jin;Kim, Jin-Hee;Bae, Ki-Hwan;Lee, Choong-Hwan
    • Journal of Microbiology and Biotechnology
    • /
    • Vol. 17, No. 7
    • /
    • pp.1083-1089
    • /
    • 2007
  • Because of their selectivity and catalytic efficiency, Baeyer-Villiger monooxygenases (BVMOs) are highly valuable biocatalysts for the chemoenzymatic synthesis of a broad range of useful compounds. In this study, we investigated the microbial Baeyer-Villiger oxidation and sulfoxidation of thioanisole and bicyclo[3.2.0]hept-2-en-6-one using whole Escherichia coli cells, each expressing one of three recombinant BVMOs: one originating from Pseudomonas aeruginosa PAO1 and two from Streptomyces coelicolor A3(2). The three BVMOs were identified in the microbial genome database by a recently described protein sequence motif, i.e., the BVMO motif (FXGXXXHXXXW). The reaction products were identified as (R)-/(S)-sulfoxide and 2-oxabicyclo/3-oxabicyclo[3.3.0]oct-6-en-2-one by GC-MS analysis. Consequently, this study demonstrated that the three enzymes can indeed catalyze the Baeyer-Villiger reaction as biocatalysts, and that effective annotation tools can be efficiently exploited as a source of novel BVMOs.
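A motif-based search of the kind described in the abstract above can be sketched with a regular expression. This is only an illustration, assuming that X in FXGXXXHXXXW stands for any single residue; the function name `find_bvmo_motif` and the toy sequence are hypothetical and are not taken from the paper or from any database.

```python
import re

# F-x-G-x(3)-H-x(3)-W, with x read as "any single residue" (assumption).
# The lookahead lets finditer report overlapping hits as well.
BVMO_MOTIF = re.compile(r"(?=(F[A-Z]G[A-Z]{3}H[A-Z]{3}W))")


def find_bvmo_motif(protein_seq):
    """Return a list of (0-based start, matched text) for every motif hit."""
    seq = protein_seq.upper()
    return [(m.start(), m.group(1)) for m in BVMO_MOTIF.finditer(seq)]


# Toy sequence made up for illustration; not a real protein.
toy_seq = "MKTFAGAAAHAAAWLLP"
for start, hit in find_bvmo_motif(toy_seq):
    print(start, hit)
```

In practice a hit only marks a candidate; the abstract itself confirms activity experimentally rather than relying on the motif alone.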

Applied Computational Tools for Crop Genome Research

  • Love Christopher G;Batley Jacqueline;Edwards David
    • Journal of Plant Biotechnology
    • /
    • Vol. 5, No. 4
    • /
    • pp.193-195
    • /
    • 2003
  • A major goal of agricultural biotechnology is the discovery of genes or genetic loci which are associated with characteristics beneficial to crop production. This knowledge of genetic loci may then be applied to improve crop breeding. Agriculturally important genes may also benefit crop production through transgenic technologies. Recent years have seen an application of high throughput technologies to agricultural biotechnology leading to the production of large amounts of genomic data. The challenge today is the effective structuring of this data to permit researchers to search, filter and, importantly, make robust associations within a wide variety of datasets. At the Plant Biotechnology Centre, Primary Industries Research Victoria in Melbourne, Australia, we have developed a series of tools and computational pipelines to assist in the processing and structuring of genomic data to aid its application to agricultural biotechnology research. These tools include a sequence database, ASTRA, for the processing and annotation of expressed sequence tag data. Tools have also been developed for the discovery of simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) molecular markers from large sequence datasets. Application of these tools to Brassica research has assisted in the production of genetic and comparative physical maps as well as candidate gene discovery for a range of agronomically important traits.

Development of KHapmap Browser using DAS for Korean HapMap Research

  • Jin, Hoon;Kim, Seung-Ho;Kim, Young-Uk;Park, Young-Kyu;Ji, Mi-Hyun;Kim, Young-Joo
    • Genomics & Informatics
    • /
    • Vol. 6, No. 2
    • /
    • pp.57-63
    • /
    • 2008
  • The Korean HapMap Project has been under way for five years since it started in June 2003. The project generated data for a total of 1,764,000 Korean SNPs and formally registered the data in the dbSNP of NCBI (The dbSNP website. 2008). We have developed a series of software programs for association studies as well as for the comparison and analysis of Korean HapMap data with four other populations (CEPH, Yoruba, Han Chinese, and Japanese populations). The KHapmap Browser was developed and integrated to provide haplotype retrieval and comparative study tools of human ethnicities for comprehensive disease association studies (http://www.khapmap.org). On that basis, GBrowse was adopted in the KHapmap Browser for inherent Korean genetic data, and a provision of extended services was pledged with the distributed sequence annotation system (DAS). The dynamic linking service of the KHapmap Browser to other tools in our intranetwork environment provides many enhanced functions over GBrowse without DAS. The KHapmap Browser is expected to be an invaluable tool for the study of Korean and international HapMap data.

Composite Dependency-reflecting Model for Core Promoter Recognition in Vertebrate Genomic DNA Sequences

  • Kim, Ki-Bong;Park, Seon-Hee
    • BMB Reports
    • /
    • Vol. 37, No. 6
    • /
    • pp.648-656
    • /
    • 2004
  • This paper deals with the development of a predictive probabilistic model, a composite dependency-reflecting model (CDRM), which was designed to detect core promoter regions and transcription start sites (TSS) in vertebrate genomic DNA sequences, an issue of some importance for genome annotation. The model represents a combination of first-, second-, third- and much higher order or long-range dependencies obtained using the expanded maximal dependency decomposition (EMDD) procedure, which iteratively decomposes data sets into subsets on the basis of dependency degree and patterns inherent in the target promoter region to be modeled. In addition, decomposed subsets are modeled by using a first-order Markov model, allowing the predictive model to reflect dependency between adjacent positions explicitly. In this way, the CDRM allows for potentially complex dependencies between positions in the core promoter region. Such complex dependencies may be closely related to the biological and structural contexts since promoter elements are present in various combinations separated by various distances in the sequence. Thus, CDRM may be appropriate for recognizing core promoter regions and TSSs in vertebrate genomic contigs. To demonstrate the effectiveness of our algorithm, we tested it using standardized data and real core promoters, and compared it with some current representative promoter-finding algorithms. The developed algorithm showed better accuracy in terms of specificity and sensitivity than the promoter-finding algorithms used in the performance comparison.
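The first-order Markov component mentioned in the abstract above can be illustrated with a minimal sketch. It is a simplified, homogeneous first-order Markov chain over DNA, not the CDRM or the EMDD procedure itself; the function names, the pseudocount, the uniform initial-base assumption, and the toy training sequences are all assumptions made for illustration.

```python
import math

ALPHABET = "ACGT"


def train_first_order(seqs, pseudocount=1.0):
    """Estimate P(next base | current base) from training sequences.

    Sequences must contain only A, C, G, T (other symbols raise KeyError).
    """
    counts = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
    for seq in seqs:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    probs = {}
    for a in ALPHABET:
        total = sum(counts[a].values())
        probs[a] = {b: counts[a][b] / total for b in ALPHABET}
    return probs


def log_likelihood(seq, probs):
    """Log-probability of seq; the first base is assumed uniform."""
    score = math.log(1.0 / len(ALPHABET))
    for prev, nxt in zip(seq, seq[1:]):
        score += math.log(probs[prev][nxt])
    return score


# Toy training set made up for illustration; not real promoter data.
model = train_first_order(["TATATA", "TATAAA"])
print(log_likelihood("TATA", model) > log_likelihood("GCGC", model))
```

A sequence resembling the training set scores higher than an unrelated one because each transition probability depends on the preceding base, which is the adjacent-position dependency the abstract refers to.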

The gene repertoire of Pythium porphyrae (Oomycota) suggests an adapted plant pathogen tackling red algae

  • Badis, Yacine;Han, Jong Won;Klochkova, Tatyana A.;Gachon, Claire M.M.;Kim, Gwang Hoon
    • ALGAE
    • /
    • Vol. 35, No. 2
    • /
    • pp.133-144
    • /
    • 2020
  • Pythium porphyrae is responsible for devastating outbreaks in seaweed farms of Pyropia, the most valuable cultivated seaweed worldwide. While the genus Pythium contains many well-studied pathogens, the genome of P. porphyrae has yet to be sequenced. Here we report the first available gene repertoire of P. porphyrae and a preliminary analysis of pathogenicity-related genes. Using ab initio detection strategies together with similarity-based and manual annotation, we found that the P. porphyrae gene repertoire is similar to that of classical phytopathogenic Pythium species. This includes the absence of an expanded RxLR effector family and the detection of classical pathogenicity-related genes like crinklers, glycoside hydrolases, cellulose-binding elicitor lectin-like proteins and elicitins. We additionally compared this dataset to the proteomes of 8 selected Pythium species. While 34% of the predicted proteome appeared specific to P. porphyrae, we could not attribute specific enzymes to the degradation of red algal biomass. Conversely, we detected several cellulases and a cutinase conserved with plant-pathogenic Pythium species. Together with the recent report of P. porphyrae triggering disease symptoms on several plant species in lab-controlled conditions, our findings add weight to the hypothesis that P. porphyrae is a reformed plant pathogen.