• Title/Summary/Keyword: bioinformatic analysis


CaGe: A Web-Based Cancer Gene Annotation System for Cancer Genomics

  • Park, Young-Kyu;Kang, Tae-Wook;Baek, Su-Jin;Kim, Kwon-Il;Kim, Seon-Young;Lee, Do-Heon;Kim, Yong-Sung
    • Genomics & Informatics / v.10 no.1 / pp.33-39 / 2012
  • High-throughput genomic technologies (HGTs), including next-generation DNA sequencing (NGS), microarray, and serial analysis of gene expression (SAGE), have become effective experimental tools for cancer genomics to identify cancer-associated somatic genomic alterations and genes. The main hurdle in cancer genomics is to identify the real causative mutations or genes out of many candidates from an HGT-based cancer genomic analysis. One useful approach is to refer to known cancer genes and associated information. The list of known cancer genes can be used to determine candidates of cancer driver mutations, while cancer gene-related information, including gene expression, protein-protein interaction, and pathways, can be useful for scoring novel candidates. Some cancer gene or mutation databases exist for this purpose, but few specialized tools exist for an automated analysis of a long gene list from an HGT-based cancer genomic analysis. This report presents a new web-accessible bioinformatic tool, called CaGe, a cancer genome annotation system for the assessment of candidates of cancer genes from HGT-based cancer genomics. The tool provides users with information on cancer-related genes, mutations, pathways, and associated annotations through annotation and browsing functions. With this tool, researchers can classify their candidate genes from cancer genome studies into either previously reported or novel categories of cancer genes and gain insight into underlying carcinogenic mechanisms through a pathway analysis. We show the usefulness of CaGe by assessing its performance in annotating somatic mutations from a published small cell lung cancer study.

PGC-Enriched miRNAs Control Germ Cell Development

  • Bhin, Jinhyuk;Jeong, Hoe-Su;Kim, Jong Soo;Shin, Jeong Oh;Hong, Ki Sung;Jung, Han-Sung;Kim, Changhoon;Hwang, Daehee;Kim, Kye-Seong
    • Molecules and Cells / v.38 no.10 / pp.895-903 / 2015
  • Non-coding microRNAs (miRNAs) regulate the translation of target messenger RNAs (mRNAs) involved in the growth and development of a variety of cells, including primordial germ cells (PGCs), which play an essential role in germ cell development. However, the target mRNAs and the regulatory networks influenced by miRNAs in PGCs remain unclear. Here, we demonstrate that novel miRNAs control PGC development through targeting mRNAs involved in various cellular pathways. We reveal the PGC-enriched expression patterns of nine miRNAs, including miR-10b, -18a, -93, -106b, -126-3p, -127, -181a, -181b, and -301, using miRNA expression analysis along with mRNA microarray analysis in PGCs, embryonic gonads, and postnatal testes. These miRNAs are highly expressed in PGCs, as demonstrated by Northern blotting, miRNA in situ hybridization assay, and miRNA qPCR analysis. This integrative study utilizing mRNA microarray analysis and miRNA target prediction demonstrates the regulatory networks through which these miRNAs regulate their potential target genes during PGC development. The elucidated networks of miRNAs disclose a coordinated molecular mechanism by which these miRNAs regulate distinct cellular pathways in PGCs that determine germ cell development.

Analysis of allele-specific expression using RNA-seq of the Korean native pig and Landrace reciprocal cross

  • Ahn, Byeongyong;Choi, Min-Kyeung;Yum, Joori;Cho, In-Cheol;Kim, Jin-Hoi;Park, Chankyu
    • Asian-Australasian Journal of Animal Sciences / v.32 no.12 / pp.1816-1825 / 2019
  • Objective: We analyzed allele-specific expression in the pig neocortex using bioinformatic analysis of high-throughput sequencing results from the parental genomes and offspring transcriptomes from reciprocal crosses between Korean Native and Landrace pigs. Methods: We carried out sequencing of parental genomes and offspring transcriptomes using next-generation sequencing. We subsequently carried out genome-scale identification of single nucleotide polymorphisms (SNPs) in two different ways, using either individual genome mapping or joint genome mapping of the same-breed parents that were used for the reciprocal crosses. Using parent-specific SNPs, allele-specifically expressed genes were analyzed. Results: Because of the low genome coverage (approximately 4×) of the sequencing results, most SNPs were non-informative for parental lineage determination of the expressed alleles in the offspring and were thus excluded from our analysis. Consequently, 436 SNPs covering 336 genes were applicable to measure the imbalanced expression of paternal alleles in the offspring. By calculating the read ratios of parental alleles in the offspring, we identified seven genes showing allele-biased expression (p<0.05), including three previously reported genes and four newly identified in this study. Conclusion: The newly identified allele-specifically expressed genes in the neocortex of pigs should contribute to improving our knowledge of genomic imprinting in pigs. To our knowledge, this is the first study of allelic imbalance using high-throughput analysis of both parental genomes and offspring transcriptomes of the reciprocal cross in outbred animals. Our study also showed the effect of the number of informative animals on the genome-level investigation of allele-specific expression using RNA-seq analysis in livestock species.
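
    The read-ratio comparison mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not name the statistical test used, so the exact two-sided binomial test below, the null ratio of 0.5, and the function names `paternal_ratio` and `binomial_two_sided_p` are all assumptions made for illustration, not the authors' method.

    ```python
    from math import comb


    def paternal_ratio(paternal_reads: int, maternal_reads: int) -> float:
        """Fraction of allele-informative reads carrying the paternal allele."""
        total = paternal_reads + maternal_reads
        if total == 0:
            raise ValueError("no informative reads at this SNP")
        return paternal_reads / total


    def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
        """Exact two-sided binomial p-value.

        Sums the probabilities of all outcomes that are no more likely than
        the observed count k under the null proportion p (assumed 0.5 here,
        i.e. balanced expression of the two parental alleles).
        """
        if n <= 0 or not 0 <= k <= n:
            raise ValueError("require n > 0 and 0 <= k <= n")
        pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
        # Small tolerance so outcomes tied with the observed one are included.
        threshold = pmf[k] * (1 + 1e-9)
        return min(1.0, sum(q for q in pmf if q <= threshold))


    # Hypothetical read counts at one parent-specific SNP (not from the study).
    ratio = paternal_ratio(30, 10)
    p_value = binomial_two_sided_p(30, 40)
    ```

    With only a few hundred informative SNPs, a real analysis would also need to correct for multiple testing; that step is omitted here.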

Bacterial communities in the feces of insectivorous bats in South Korea

  • Injung An;Byeori Kim;Sungbae Joo;Kihyun Kim;Taek-Woo Lee
    • Journal of Ecology and Environment / v.48 no.2 / pp.120-127 / 2024
  • Bats serve as vectors and natural reservoir hosts for various infectious viruses, bacteria, and fungi. These pathogens have also been detected in bat feces and can cause severe illnesses in hosts, other animals, and humans. Because pathogens can easily spread into the environment through bat feces, determining the bacterial communities in bat guano is crucial to mitigate potential disease transmission and outbreaks. This study primarily aimed to examine bacterial communities in the feces of insectivorous bats living in South Korea. Fecal samples were collected after capturing 84 individuals of four different bat species in two regions of South Korea, and the bacterial microbiota was assessed through next generation sequencing of the 16S rRNA gene. The results revealed that, with respect to the relative abundance at the phylum level, Myotis bombinus was dominated by Firmicutes (47.24%) and Proteobacteria (42.66%) whereas Miniopterus fuliginosus (82.78%), Rhinolophus ferrumequinum (63.46%), and Myotis macrodactylus (78.04%) were dominated by Proteobacteria. Alpha diversity analysis showed no difference in abundance between species and a significant difference (p < 0.05) between M. bombinus and M. fuliginosus. Beta-diversity analysis revealed that Clostridium, Asaia, and Enterobacteriaceae_g were clustered as major factors at the genus level using principal component analysis. Additionally, linear discriminant analysis effect size was conducted based on relative expression information to select bacterial markers for each bat species. Clostridium was relatively abundant in M. bombinus, whereas Mycoplasma_g10 was relatively abundant in R. ferrumequinum. Our results provide an overview of bat guano microbiota diversity and the significance of pathogenic taxa for humans and the environment, highlighting a better understanding of preventing emerging diseases. We anticipate that this research will yield bioinformatic data to advance our knowledge of overall microbial genetic diversity and clustering characteristics in insectivorous bat feces in South Korea.

Bioinformatic Analysis of Envelope Protein Domains of Zika Virus and Dengue Virus

  • Choi, Jae-Won;Kim, Hak Yong
    • The Journal of the Korea Contents Association / v.19 no.11 / pp.632-643 / 2019
  • In recent years, large-scale damage from mosquito-borne arbovirus infections has been reported worldwide due to factors such as global climate change, increased overseas travel, and increased movement of goods between countries. Zika virus and dengue virus, both belonging to the genus Flavivirus, are representative examples. In this study, we performed in-depth analyses, based on bioinformatics databases, of the envelope (E) protein that performs essential functions for host infection by Zika virus and dengue virus. Domain analysis of the E protein was performed to determine the type, location, and function of each domain, followed by homology analysis for each domain. From these results, EDIII, which showed low homology, was identified. The homology and immunogenicity of each peptide constituting EDIII were analyzed, and three-dimensional structures were modeled. Furthermore, we discussed their biological meaning and how they could be used.

Genomic Analysis of 13 Putative Active Prophages Located in the Genomes of Walnut Blight Pathogen Xanthomonas arboricola pv. juglandis

  • Cao, Zheng;Cuiying, Du;Benzhong, Fu
    • Microbiology and Biotechnology Letters / v.50 no.4 / pp.563-573 / 2022
  • Xanthomonas arboricola pv. juglandis (Xaj) is a globally important bacterial pathogen of walnut trees that causes substantial economic losses in commercial walnut production. Although prophages are common in bacterial plant pathogens and play important roles in bacterial diversity and pathogenicity, there has been limited investigation into the distribution and function of prophages in Xaj. In this study, we identified and characterized 13 predicted prophages from the genomes of 12 Xaj isolates from around the globe. These prophages ranged in length from 11.8 kb to 51.9 kb, with 11 to 75 genes and a GC content of 57.82 to 64.15%. The closest relatives of these prophages belong to the Myoviridae and Siphoviridae families of the Caudovirales order. The phylogenetic analysis allowed the classification of the prophages into five groups. The gene constitution of these predicted prophages was revealed via Roary analysis. Amongst 126 total protein groups, the most prevalent group was present in only nine prophages, and 22 protein groups were present in only one prophage (singletons). Also, bioinformatic analysis of the 13 identified prophages revealed the presence of 431 genes with an average length of 389.7 bp. Prokka annotation of these prophages identified 466 hypothetical proteins, 24 proteins with known function, and six tRNA genes. The proteins with known function mainly comprised prophage integrase IntA, replicative DNA helicase, tyrosine recombinase XerC, and IS3 family transposase. There was no detectable insertion site specificity for these prophages in the Xaj genomes. The identified Xaj prophage genes, particularly those of unknown function, merit future investigation.
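
    The GC content figures quoted in this abstract are percentages of G and C bases in each prophage sequence. A minimal sketch of that calculation follows; the function name `gc_content` and the choice to ignore ambiguous bases such as N are assumptions for illustration and are not taken from the paper.

    ```python
    def gc_content(sequence: str) -> float:
        """Return the GC content of a DNA sequence as a percentage.

        Only unambiguous bases (A, C, G, T) are counted; anything else,
        such as N, is skipped (an assumption of this sketch).
        """
        counted = [base for base in sequence.upper() if base in "ACGT"]
        if not counted:
            raise ValueError("sequence contains no A, C, G, or T bases")
        gc = sum(1 for base in counted if base in "GC")
        return 100.0 * gc / len(counted)


    # Hypothetical short sequence, not taken from any Xaj prophage.
    example = gc_content("ATGCGCGGTA")
    ```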

Nitrate enhances the secondary growth of storage roots in Panax ginseng

  • Kyoung Rok Geem;Jaewook Kim;Wonsil Bae;Moo-Geun Jee;Jin Yu;Inbae Jang;Dong-Yun Lee;Chang Pyo Hong;Donghwan Shim;Hojin Ryu
    • Journal of Ginseng Research / v.47 no.3 / pp.469-478 / 2023
  • Background: Nitrogen (N) is an essential macronutrient for plant growth and development. To support agricultural production and enhance crop yield, two major N sources, nitrate and ammonium, are applied as fertilizers to the soil. Although many studies have been conducted on N uptake and signal transduction, the molecular genetic mechanisms of N-mediated physiological roles, such as the secondary growth of storage roots, remain largely unknown. Methods: One-year-old P. ginseng seedlings treated with KNO3 were analyzed for the secondary growth of storage roots. The histological paraffin sections were subjected to bright and polarized light microscopic analysis. Genome-wide RNA-seq and network analysis were carried out to dissect the molecular mechanism of nitrate-mediated promotion of ginseng storage root thickening. Results: Here, we report the positive effects of nitrate on storage root secondary growth in Panax ginseng. Exogenous nitrate supply to ginseng seedlings significantly increased the root secondary growth. Histological analysis indicated that the enhancement of root secondary growth could be attributed to the increase in cambium stem cell activity and the subsequent differentiation of cambium-derived storage parenchymal cells. RNA-seq and gene set enrichment analysis (GSEA) revealed that the formation of a transcriptional network comprising auxin-, brassinosteroid (BR)-, ethylene-, and jasmonic acid (JA)-related genes mainly contributed to the secondary growth of ginseng storage roots. In addition, increased proliferation of cambium stem cells by an N-rich source inhibited the accumulation of starch granules in storage parenchymal cells. Conclusion: Thus, through the integration of bioinformatic and histological tissue analyses, we demonstrate that nitrate assimilation and signaling pathways are integrated into key biological processes that promote the secondary growth of P. ginseng storage roots.

An assessment of the taxonomic reliability of DNA barcode sequences in publicly available databases

  • Jin, Soyeong;Kim, Kwang Young;Kim, Min-Seok;Park, Chungoo
    • ALGAE / v.35 no.3 / pp.293-301 / 2020
  • DNA barcoding has a wide range of applications, such as taxonomic studies that help elucidate cryptic species and phylogenetic relationships, and the analysis of environmental samples for biodiversity monitoring and conservation assessments of species. After obtaining the DNA barcode sequences, sequence similarity-based homology analysis is commonly used. This means that the obtained barcode sequences are compared to the DNA barcode reference databases. This bioinformatic analysis necessarily implies that the overall quantity and quality of the reference databases must be stringently monitored so as not to have an adverse impact on the accuracy of species identification. With the development of next-generation sequencing techniques, a noticeably large number of DNA barcode sequences have been produced and are stored in online databases, but their degree of validity, accuracy, and reliability has not been extensively investigated. In this study, we investigated the amount and types of erroneous barcode sequences deposited in publicly accessible databases. Over 4.1 million sequences were investigated in three large-scale DNA barcode databases (NCBI GenBank, Barcode of Life Data System [BOLD], and Protist Ribosomal Reference database [PR2]) for four major DNA barcodes (cytochrome c oxidase subunit 1 [COI], internal transcribed spacer [ITS], ribulose bisphosphate carboxylase large chain [rbcL], and 18S ribosomal RNA [18S rRNA]); approximately 2% of the barcode sequences were found to be erroneous, and their taxonomic distributions were uneven. Consequently, our present findings provide compelling evidence of data quality problems along with insufficient and unreliable annotation of taxonomic data in DNA barcode databases. Therefore, we suggest that if ambiguous taxa are presented during barcoding analysis, further validation with other DNA barcode loci or morphological characters should be mandated.

Quantitative Proteomics Towards Understanding Life and Environment

  • Choi, Jong-Soon;Chung, Keun-Yook;Woo, Sun-Hee
    • Korean Journal of Environmental Agriculture / v.25 no.4 / pp.371-381 / 2006
  • New proteomic techniques have been pioneered extensively in recent years, enabling the high-throughput and systematic analyses of cellular proteins in combination with bioinformatic tools. Furthermore, the development of such novel proteomic techniques facilitates the elucidation of the functions of proteins under stress or disease conditions, resulting in the discovery of biomarkers for responses to environmental stimuli. The ultimate objective of proteomics is targeted toward the entire proteome of life, subcellular localization, biochemical activities, and the regulation thereof. Comprehensive analysis strategies of proteomics can be classified into three categories: (i) protein separation via 2-dimensional gel electrophoresis (2-DE) or liquid chromatography (LC), (ii) protein identification via either Edman sequencing or mass spectrometry (MS), and (iii) proteome quantitation. Currently, MS-based proteomics techniques have shifted from qualitative proteome analysis via 2-DE or 2D-LC coupled with off-line matrix-assisted laser desorption ionization (MALDI) and on-line electrospray ionization (ESI) MS, respectively, toward quantitative proteome analysis. In vitro quantitative proteomic techniques include differential gel electrophoresis with fluorescence dyes, protein-labeling tagging with isotope-coded affinity tags, and peptide-labeling tagging with isobaric tags for relative and absolute quantitation. In addition, stable isotope-labeled amino acids can be in vivo labeled into live culture cells via metabolic incorporation. MS-based proteomics techniques extend to the detection of the phosphopeptide mapping of biologically crucial proteins, which are associated with post-translational modification. These complementary proteomic techniques contribute to our current understanding of the manner in which life responds to differing environments.

Discovering Novel Genes of poultry in Genomic Era

  • S.K. Kang;Lee, B.C.;J.M. Lim;J.Y. Han;W.S. Hwang
    • Korean Journal of Poultry Science / v.28 no.2 / pp.143-153 / 2001
  • Using bioinformatic tools to search the massive genome databases, it is possible to identify new genes in a few minutes for initial discoveries based on evolutionary conservation, domain homology, and tissue expression patterns, followed by further verification and characterization at the bench. The development of high-density two-dimensional arrays has allowed the analysis of the expression of thousands of genes simultaneously in humans, mice, rats, yeast, and bacteria to elucidate the genes and pathways involved in physiological processes. In addition, rapid and automated protein identification is being achieved by searching protein and nucleotide sequence databases directly with data generated from mass spectrometry. Recently, analysis at the biochemical level, such as biochemical screening and metabolic profiling (biochemical genomics), has been introduced as an additional approach for categorical assignment of gene function. To take advantage of recent achievements in computational approaches for facilitated gene discovery in the avian model, chicken expressed sequence tags (ESTs) have been reported and deposited in the international databases. By searching EST databases, a chicken heparanase gene was identified and functionally confirmed by subsequent experiments. Using a combination of subtractive hybridization assay and GenBank database searches, a chicken heme-binding protein family (cSOUL/HBP) was isolated in the retina and pineal gland of domestic chicken and verified by Northern blot analysis. Microarrays have identified several host genes whose expression levels are elevated following infection of chicken embryo fibroblasts (CEF) with Marek's disease virus (MDV). The ongoing chicken genome projects and new discoveries and breakthroughs in genomics and proteomics will no doubt reveal new and exciting information and advances in avian research.
