• Title/Summary/Keyword: Bioinformatics data


Knowledge-guided artificial intelligence technologies for decoding complex multiomics interactions in cells

  • Lee, Dohoon;Kim, Sun
    • Clinical and Experimental Pediatrics / v.65 no.5 / pp.239-249 / 2022
  • Cells survive and proliferate through complex interactions among diverse molecules across multiomics layers. Conventional experimental approaches for identifying these interactions have built a firm foundation for molecular biology, but their scalability is gradually becoming inadequate compared to the rapid accumulation of multiomics data measured by high-throughput technologies. Therefore, the need for data-driven computational modeling of interactions within cells has been highlighted in recent years. The complexity of multiomics interactions is primarily due to their nonlinearity. That is, their accurate modeling requires intricate conditional dependencies, synergies, or antagonisms between considered genes or proteins, which retard experimental validations. Artificial intelligence (AI) technologies, including deep learning models, are optimal choices for handling complex nonlinear relationships between features that are scalable and produce large amounts of data. Thus, they have great potential for modeling multiomics interactions. Although there exist many AI-driven models for computational biology applications, relatively few explicitly incorporate the prior knowledge within model architectures or training procedures. Such guidance of models by domain knowledge will greatly reduce the amount of data needed to train models and constrain their vast expressive powers to focus on the biologically relevant space. Therefore, it can enhance a model's interpretability, reduce spurious interactions, and prove its validity and utility. Thus, to facilitate further development of knowledge-guided AI technologies for the modeling of multiomics interactions, here we review representative bioinformatics applications of deep learning models for multiomics interactions developed to date by categorizing them by guidance mode.

Identifying Post-translational Modification Crosstalks for Breast Cancer

  • Tung, Chi-Hua;Shueng, Pei-Wei;Chu, Yen-Wei;Chen, Chian-Ying
    • Journal of Computing Science and Engineering / v.11 no.4 / pp.111-120 / 2017
  • Post-translational modifications (PTMs) of proteins play substantial roles in the gene regulation of cell physiological functions and in the generation of major diseases. However, the majority of existing studies only explored a certain PTM of proteins, while very few have investigated the PTMs of two or more domains and the effects of their interactions. In this study, after collecting data regarding a large number of breast cancer-related and validated PTMs, a sequence and domain analysis of breast cancer proteins was carried out using bioinformatics methods. Then, protein-protein interaction network-related tools were applied in order to determine the crosstalks between the PTMs of the proteins. Finally, statistical and functional analyses were conducted to identify more modification sites of domains and proteins that may interact with at least two or more PTMs. In addition to exploring the associations between the interactive effects of PTMs, the present study also provides important information that would allow biologists to further explore the regulatory pathways of biological functions and related diseases.

Determination of Substrate Specificities Against β-Glucosidase A (BglA) from Thermotoga maritima: A Molecular Docking Approach

  • Rajoka, Muhammad Ibrahim;Idrees, Sobia;Ashfaq, Usman Ali;Ehsan, Beenish;Haq, Asma
    • Journal of Microbiology and Biotechnology / v.25 no.1 / pp.44-49 / 2015
  • Thermostable enzymes derived from Thermotoga maritima have attracted worldwide interest for their potential industrial applications. Structural analysis and docking studies were performed on the T. maritima β-glucosidase enzyme with cellobiose and pNP-linked substrates. The 3D structure of the thermostable β-glucosidase was downloaded from the Protein Data Bank database. Substrates were downloaded from the PubChem database and were minimized using MOE software. Docking of BglA and substrates was carried out using MOE software. After analyzing docked enzyme/substrate complexes, it was found that Glu residues were mainly involved in the reaction, and other important residues such as Asn, Ser, Tyr, Trp, and His were involved in hydrogen bonding with pNP-linked substrates. By determining the substrate recognition pattern, a more suitable β-glucosidase enzyme could be developed, enhancing its industrial potential.

DNA Fingerprinting Analysis of the Genus Phytophthora in Korea

  • Park, Dong-Suk;Kang, Hee-Wan;Lee, Mi-Hee;Park, Young-Jin;Lee, Byoung-Moo;Hahn, Jang-Ho;Go, Seung-Joo
    • Mycobiology / v.31 no.4 / pp.235-247 / 2003
  • To investigate biodiversity and establish an identification system for Phytophthora spp. in Korea, a variety of band patterns were produced using the URP (universal rice primer). The fingerprint patterns of Phytophthora spp. showed many common and variable fragments among isolates of distinct genotypes. In particular, P. drechsleri was classified into four distinct types (I to IV). P. drechsleri (KACC 40498 and KACC 40499) and P. cryptogea (KACC 40413) showed almost identical band patterns despite being different species. Ninety isolates of Phytophthora spp. were clustered into 13 groups based on UPGMA (unweighted pair group method with arithmetic means) analysis. These DNA fingerprinting data should be helpful for inter- and intra-species identification of Phytophthora species.
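
The UPGMA clustering named in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not state which similarity coefficient was used, so the Jaccard distance on 0/1 band scores is an assumption, and the isolate names and band profiles below are made up for illustration.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Distance between two 0/1 band profiles: 1 - shared / present-in-either."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - shared / either


def upgma(profiles):
    """Cluster isolates by UPGMA; return merges as (cluster_a, cluster_b, distance)."""
    names = list(profiles)
    # Pairwise distances between individual isolates.
    dist = {
        frozenset((a, b)): jaccard_distance(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }

    def average_distance(c1, c2):
        # UPGMA: unweighted mean over all cross-cluster isolate pairs.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters with the smallest average distance.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: average_distance(*pair))
        merges.append((c1, c2, average_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical band scores (1 = band present) for three made-up isolates.
example_profiles = {
    "isolate_A": [1, 1, 0, 0],
    "isolate_B": [1, 1, 0, 1],
    "isolate_C": [0, 0, 1, 1],
}
example_merges = upgma(example_profiles)
```

Each entry of `example_merges` records which two clusters were joined and at what average distance, which is the information a dendrogram displays. For real data sets, a library routine such as average-linkage hierarchical clustering would normally replace this hand-written loop.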

Bioinformatics: Latest Application and Interdisciplinary Field of Computer Science

  • Kim, Ki-Bong
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.3 / pp.971-977 / 2010
  • A flood of biological data has caused many challenges in computing. Bioinformatics, the application of computational techniques to analyze the information associated with biomolecules on a large scale, has now firmly established itself as an interdisciplinary subject in molecular biology, and encompasses a wide range of subject areas, from structural biology, genomics, proteomics, systems biology, and biostatistics to computer science. In this review, I provide an introduction to and overview of the current state of bioinformatics. After looking at the types of biological information and databases that are commonly used, I also discuss some bioinformatics application domains that are closely related to areas of computer science.

Comparative analysis of HiSeq3000 and BGISEQ-500 sequencing platform with shotgun metagenomic sequencing data

  • Animesh Kumar;Espen M. Robertsen;Nils P. Willassen;Juan Fu;Erik Hjerde
    • Genomics & Informatics / v.21 no.4 / pp.49.1-49.11 / 2023
  • Recent advances in sequencing technologies have made it possible to generate metagenomic sequences on different sequencing platforms. In this study, we analyzed and compared shotgun metagenomic sequences generated by the HiSeq3000 and BGISEQ-500 platforms from 12 sediment samples collected across the Norwegian coast. Metagenomic DNA sequences were normalized to an equal number of bases for both platforms and further evaluated using different taxonomic classifiers, reference databases, and assemblers. Normalized BGISEQ-500 sequences retained more reads and base counts after preprocessing, while a slightly higher fraction of HiSeq3000 sequences were taxonomically classified. Kaiju classified a higher percentage of reads than Kraken2 for both platforms, and a comparison of reference databases for taxonomic classification showed that the MAR database outperformed RefSeq. Assembly using MEGAHIT produced longer assemblies and higher total contig counts than metaSPAdes in the majority of HiSeq3000 samples, but the assembly statistics notably improved with unprocessed or normalized reads. Our results indicate that both platforms perform comparably in terms of the percentage of taxonomically classified reads and assembled contig statistics for metagenomic samples. This study provides valuable insights for researchers selecting an appropriate sequencing platform and bioinformatics pipeline for their metagenomics studies.
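
The step of normalizing reads "to an equal number of bases for both platforms" can be sketched as random subsampling down to a shared base budget. The abstract does not say how the normalization was done, so this procedure, the function names, and the toy reads are all assumptions made for illustration.

```python
import random


def subsample_to_bases(reads, target_bases, seed=0):
    """Randomly keep reads until adding another would exceed target_bases."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    order = list(reads)
    rng.shuffle(order)
    kept, total = [], 0
    for read in order:
        if total + len(read) > target_bases:
            continue  # skip reads that would overshoot the budget
        kept.append(read)
        total += len(read)
    return kept


def normalize_pair(reads_a, reads_b, seed=0):
    """Subsample both read sets to the smaller of their two total base counts."""
    target = min(sum(map(len, reads_a)), sum(map(len, reads_b)))
    return (
        subsample_to_bases(reads_a, target, seed),
        subsample_to_bases(reads_b, target, seed),
    )


# Hypothetical toy data: ten 100-base reads versus five 150-base reads.
platform_a = ["ACGT" * 25] * 10
platform_b = ["ACGT" * 37 + "AC"] * 5
norm_a, norm_b = normalize_pair(platform_a, platform_b)
```

With reads of different lengths the two totals may not match exactly; the sketch only guarantees that neither set exceeds the shared budget.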

Transcript profiling of expressed sequence tags from intramuscular fat, longissimus dorsi muscle and liver in Korean cattle (Hanwoo)

  • Lim, Da-Jeong;Lee, Seung-Hwan;Cho, Yong-Min;Yoon, Du-Hak;Shin, Youn-Hee;Kim, Kyu-Won;Park, Hye-Sun;Kim, Hee-Bal
    • BMB Reports / v.43 no.2 / pp.115-121 / 2010
  • A large data set of Hanwoo (Korean cattle) ESTs was analyzed to obtain differential gene expression results for the following three libraries: intramuscular fat, longissimus dorsi muscle and liver. To better understand the gene expression profiles, we identified differentially expressed genes (DEGs) via digital gene expression analysis. Hierarchical clustering of genes was performed according to their relative abundance within the six separate groups (Hanwoo fat versus non-Hanwoo fat, Hanwoo muscle versus non-Hanwoo muscle and Hanwoo liver versus non-Hanwoo liver), producing detailed patterns of gene expression. We determined the quantitative traits associated with the highly expressed genes. We also provide the first list of putative regulatory elements associated with differential tissue expression in Hanwoo cattle. In addition, we conducted evolutionary analysis that suggests a subset of genes accelerated in the bovine lineage are strongly correlated with their expression in Hanwoo muscle.

Identification of Recently Selected Mutations Driven by Artificial Selection in Hanwoo (Korean Cattle)

  • Lim, Dajeong;Gondro, Cedric;Park, Hye Sun;Cho, Yong Min;Chai, Han Ha;Seong, Hwan Hoo;Yang, Bo Suk;Hong, Seong Koo;Chang, Won Kyung;Lee, Seung Hwan
    • Asian-Australasian Journal of Animal Sciences / v.26 no.5 / pp.603-608 / 2013
  • Hanwoo have been subjected over the last seventy years to intensive artificial selection with the aim of improving meat production traits such as marbling and carcass weight. In this study, we performed a signature of selection analysis using whole-genome SNP data to identify regions under recent positive selection driven by the long-term artificial selection of a breeding program. To investigate homozygous regions across the genome, we estimated iES (integrated Extended Haplotype Homozygosity SNP) for each SNP. As a result, we identified two highly homozygous regions that appear to be under strong and/or recent positive selection. Five genes (DPH5, OLFM3, S1PR1, LRRN1 and CRBN) were included in these regions. To go further in the interpretation of the observed signatures of selection, we subsequently concentrated on the annotation of differentiated genes defined according to the iES value of SNPs located close to or within them. We also describe the detection of adaptive evolution at the molecular level for the genes of interest. This analysis also led to the identification of OLFM3 as having a strong signal of selection in the bovine lineage. The results of this study indicate that artificial selection, which might have targeted most of these genes, was mainly oriented towards improvement of meat production.

Mechanistic insight into the progressive retinal atrophy disease in dogs via pathway-based genome-wide association analysis

  • Sheet, Sunirmal;Krishnamoorthy, Srikanth;Park, Woncheoul;Lim, Dajeong;Park, Jong-Eun;Ko, Minjeong;Choi, Bong-Hwan
    • Journal of Animal Science and Technology / v.62 no.6 / pp.765-776 / 2020
  • The retinal degenerative disease progressive retinal atrophy (PRA) is a major cause of vision impairment in the canine population. Canine PRA comprises an inherently heterogeneous category of retinal dystrophies that closely resembles human retinitis pigmentosa. Even though much is known about the biology of PRA, the intricate connections among the genetic loci, genes, and pathways associated with this disease in dogs remain unknown. Therefore, we performed a genome-wide association study (GWAS) to identify susceptibility single nucleotide polymorphisms (SNPs) for PRA. The GWAS was performed using a case-control association analysis method on a PRA dataset of 129 dogs and 135,553 markers. Gene-set and pathway analyses were also conducted. A total of 1,114 markers associated with the PRA trait at p < 0.01 were extracted and mapped to 640 unique genes, and 35 significantly enriched (p < 0.05) gene ontology (GO) terms and 5 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways containing these genes were then selected. In particular, the apoptosis process, homophilic cell adhesion, calcium ion binding, and endoplasmic reticulum GO terms, as well as pathways related to focal adhesion, cyclic guanosine monophosphate (cGMP)-protein kinase G signaling, and axon guidance, were more likely to be associated with PRA in dogs. These data could provide new insight for further research on the identification of potential genes and causative pathways for PRA in dogs.
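
A case-control association test for a single marker can be sketched as a chi-square test on a 2x2 table of allele counts. The abstract does not name the exact statistic the authors used, so the allelic chi-square test here is an assumption, and the allele counts are invented for illustration.

```python
import math


def allelic_chi2(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi2, p_value). Counts are alleles, not individuals.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row = [sum(r) for r in table]
    col = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / total
            if expected == 0:
                return 0.0, 1.0  # marker is monomorphic or a group is empty
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts for one marker (not taken from the study).
chi2_stat, p_val = allelic_chi2(case_alt=30, case_ref=10, ctrl_alt=20, ctrl_ref=20)
passes_threshold = p_val < 0.01  # the abstract's marker-level cutoff
```

In a genome-wide scan this test would be repeated for every marker, and the markers passing the chosen threshold would then be mapped to genes for the gene-set and pathway analysis.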

HorseDB; an Integrated Horse Resource and Web Service (Construction of a Horse Database)

  • Kim Dae-Soo;Jo Un-Jong;Huh Jae-Won;Choe Eun-Sang;Cho Byung-Wook;Kim Heui-Soo
    • Journal of Life Science / v.16 no.3 s.76 / pp.472-476 / 2006
  • We have built a database server called HorseDB, which contains genome annotation information and biological information for the horse drawn from public database entries. The aim of HorseDB is to integrate biological information and horse genome data on a genome scale using bioinformatic methods. To facilitate the extraction of useful information from the collected horse genome and biological data, we developed a user-friendly interface system, HorseDB; an Integrated Horse Resource and web Service. The database is organized into general horse information data, sequence annotation data, and a world-wide web analysis program interface. The database also gives users easy access to useful information within horse genomes and to supporting analysis results, such as sequence alignments and gene annotations. HorseDB can be accessed at http://www.primate.or.kr./horse.