• Title/Summary/Keyword: Omics Data

Set Covering-based Feature Selection of Large-scale Omics Data (Set Covering 기반의 대용량 오믹스데이터 특징변수 추출기법)

  • Ma, Zhengyu;Yan, Kedong;Kim, Kwangsoo;Ryoo, Hong Seo
    • Journal of the Korean Operations Research and Management Science Society
    • /
    • v.39 no.4
    • /
    • pp.75-84
    • /
    • 2014
  • This paper addresses the feature selection problem for large-scale, high-dimensional biological data such as omics data. Most previous approaches use a simple score function to reduce the number of original variables and then select features from the small number of remaining variables. Methods that do not rely on filtering techniques either ignore the interactions between variables or generate approximate solutions to a simplified problem. In contrast, by combining set covering and clustering techniques, we developed a new method that can handle the full set of variables and consider the combinatorial effects of variables when selecting good features. To demonstrate the efficacy and effectiveness of the method, we downloaded gene expression datasets from TCGA (The Cancer Genome Atlas) and compared our method with other algorithms, including the feature selection algorithms embedded in WEKA. The experimental results show that our method selects high-quality features that yield more accurate classifiers than those of other feature selection algorithms.
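
The abstract above does not spell out the authors' formulation, so the following is only a minimal sketch of how feature selection can be posed as a set covering problem. It assumes (as an illustration, not as the paper's method) that a feature "covers" a pair of samples from different classes when their values on that feature differ by at least `margin`, and it uses the standard greedy set-cover heuristic. The function name `greedy_set_cover_features` and the `margin` parameter are hypothetical.

```python
from itertools import product


def greedy_set_cover_features(X, y, margin=1.0):
    """Greedily pick features until every (positive, negative) sample pair
    is separated by at least one chosen feature, or no feature can help.

    X is a list of samples (each a list of feature values); y holds 0/1 labels.
    Returns (selected feature indices, pairs left uncovered).
    """
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    n_features = len(X[0])

    # covers[j]: sample pairs whose values on feature j differ by >= margin
    # (assumed coverage rule, chosen only for illustration).
    covers = [
        {(p, n) for p, n in product(pos, neg) if abs(X[p][j] - X[n][j]) >= margin}
        for j in range(n_features)
    ]

    uncovered = set(product(pos, neg))
    selected = []
    while uncovered:
        # Pick the feature that separates the most still-uncovered pairs.
        best = max(range(n_features), key=lambda j: len(covers[j] & uncovered))
        gain = covers[best] & uncovered
        if not gain:
            break  # the remaining pairs cannot be separated by any feature
        selected.append(best)
        uncovered -= gain
    return selected, uncovered


# Toy data: two positive samples, two negative samples, three features.
X = [[5.0, 0.0, 1.0], [5.0, 3.0, 1.0], [0.0, 0.0, 1.0], [5.0, 1.5, 1.0]]
y = [1, 1, 0, 0]
print(greedy_set_cover_features(X, y))
```

The greedy heuristic only approximates a minimum cover; an exact formulation would solve the corresponding integer program, which this sketch does not attempt.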

Classification of Colon Cancer Patients Based on the Methylation Patterns of Promoters

  • Choi, Wonyoung;Lee, Jungwoo;Lee, Jin-Young;Lee, Sun-Min;Kim, Da-Won;Kim, Young-Joon
    • Genomics & Informatics
    • /
    • v.14 no.2
    • /
    • pp.46-52
    • /
    • 2016
  • Diverse somatic mutations have been reported to serve as cancer drivers. Recently, it has also been reported that epigenetic regulation is closely related to cancer development. However, the effect of epigenetic changes on cancer is still elusive. In this study, we analyzed DNA methylation data on colon cancer taken from The Cancer Genome Atlas. We found that several promoters were significantly hypermethylated in colon cancer patients. Through clustering analysis of differentially methylated DNA regions, we were able to define subgroups of patients and observed clinical features associated with each subgroup. In addition, we analyzed the functional ontology of aberrantly methylated genes and identified the G-protein-coupled receptor signaling pathway as one of the major pathways affected epigenetically. In conclusion, our analysis shows the possibility of characterizing the clinical features of colon cancer subgroups based on DNA methylation patterns and provides lists of important genes and pathways possibly involved in colon cancer development.

Advances in Systems Biology Approaches for Autoimmune Diseases

  • Kim, Ho-Youn;Kim, Hae-Rim;Lee, Sang-Heon
    • IMMUNE NETWORK
    • /
    • v.14 no.2
    • /
    • pp.73-80
    • /
    • 2014
  • Because autoimmune diseases (AIDs) result from a complex combination of genetic and epigenetic factors, as well as an altered immune response to endogenous or exogenous antigens, systems biology approaches have been widely applied. The use of multi-omics approaches, including blood transcriptomics, genomics, epigenetics, proteomics, and metabolomics, not only allows for the discovery of a number of biomarkers but will also provide new directions for further translational AID applications. Systems biology approaches rely on high-throughput techniques with data analysis platforms that leverage the assessment of genes, proteins, and metabolites, and network analysis of complex biological pathways implicated in specific AID conditions. To facilitate the discovery of validated and qualified biomarkers, better-coordinated multi-omics approaches and standardized translational research, in combination with the skills of biologists, clinicians, engineers, and bioinformaticians, are required.

BaSDAS: a web-based pooled CRISPR-Cas9 knockout screening data analysis system

  • Park, Young-Kyu;Yoon, Byoung-Ha;Park, Seung-Jin;Kim, Byung Kwon;Kim, Seon-Young
    • Genomics & Informatics
    • /
    • v.18 no.4
    • /
    • pp.46.1-46.4
    • /
    • 2020
  • We developed BaSDAS (Barcode-Seq Data Analysis System), a GUI-based pooled knockout screening data analysis system, so that researchers with limited bioinformatics skills can analyze pooled knockout screen data easily and effectively. BaSDAS supports the analysis of various pooled screening libraries, including yeast, human, and mouse libraries, and provides many useful statistical and visualization functions through a user-friendly web interface. We expect that BaSDAS will be a useful tool for the analysis of genome-wide screening data and will support the development of novel drugs based on functional genomics information.

Prospect and Roles of Molecular Ecogenetic Techniques in the Ecophysiological Study of Cyanobacteria (남조류의 생리·생태 연구에서 분자생태유전학적 기법의 역할 및 전망)

  • Ahn, Chi-Yong
    • Korean Journal of Ecology and Environment
    • /
    • v.51 no.1
    • /
    • pp.16-28
    • /
    • 2018
  • Although the physiological and ecological characteristics of cyanobacteria have been studied extensively for decades, more remains unknown than known. Recently, the development of omics techniques based on molecular biology has made it possible to view the ecosystem from a new and holistic perspective. The molecular mechanism of toxin production is being widely investigated through comparative genomics and transcriptomic studies. Biological interactions between bacteria and cyanobacteria are also being explored: how these interactions and genetic biodiversity change with seasons and environmental factors, and how they ultimately affect each component of the ecosystem. Bioinformatics techniques have been combined with ecoinformatics and omics data, enabling us to understand the complex mechanisms underlying ecosystems. In particular, omics has started to provide a whole picture of biological responses across all layers of hierarchical processes, from DNA to metabolites. Expectations are growing that algal blooms could be controlled more effectively in the near future, and an important insight for successful bloom control would come from a novel blueprint drawn by omics studies.

ZNF204P is a stemness-associated oncogenic long non-coding RNA in hepatocellular carcinoma

  • Hwang, Ji-Hyun;Lee, Jungwoo;Choi, Won-Young;Kim, Min-Jung;Lee, Jiyeon;Chu, Khanh Hoang Bao;Kim, Lark Kyun;Kim, Young-Joon
    • BMB Reports
    • /
    • v.55 no.6
    • /
    • pp.281-286
    • /
    • 2022
  • Hepatocellular carcinoma is a major health burden, and although extensive research has made various treatments available, difficulties in early diagnosis and resistance to chemotherapy-based treatments render several of them ineffective. The cancer stem cell (CSC) model has been used to explain the formation of heterogeneous cell populations within the tumor mass, which is one of the underlying causes of the high recurrence rate and acquired chemoresistance, highlighting the importance of identifying CSCs and understanding the molecular mechanisms of CSC drivers. Extracellular CSC markers such as CD133, CD90 and EpCAM have been used successfully in CSC isolation, but studies have indicated that increasingly complex combinations are required for accurate identification. Pseudogene-derived long non-coding RNAs are useful candidates as intracellular CSC markers - factors that regulate pluripotency and self-renewal - given their cancer-specific expression and versatile regulation across several levels. Here, we present the use of microarray data to identify stemness-associated factors in liver cancer, and the selection of the sole pseudogene-derived lncRNA, ZNF204P, for experimental validation. ZNF204P knockdown impairs cell proliferation and migration/invasion. As the cytosolic ZNF204P shares miRNA binding sites with OCT4 and SOX2, well-known drivers of pluripotency and self-renewal, we propose that ZNF204P promotes tumorigenesis through the miRNA-145-5p/OCT4, SOX2 axis.

Perspectives of Integrative Cancer Genomics in Next Generation Sequencing Era

  • Kwon, So-Mee;Cho, Hyun-Woo;Choi, Ji-Hye;Jee, Byul-A;Jo, Yun-A;Woo, Hyun-Goo
    • Genomics & Informatics
    • /
    • v.10 no.2
    • /
    • pp.69-73
    • /
    • 2012
  • The explosive development of genomics technologies, including microarrays and next generation sequencing (NGS), has provided comprehensive maps of cancer genomes, including the expression of mRNAs and microRNAs, DNA copy numbers, sequence variations, and epigenetic changes. These genome-wide profiles of genetic aberrations could reveal candidates for diagnostic and/or prognostic biomarkers as well as mechanistic insights into tumor development and progression. Recent efforts to establish a huge cancer genome compendium and integrative omics analyses, so-called "integromics", have extended our understanding of the cancer genome, showing its daunting complexity and heterogeneity. However, the challenges of structured integration, sharing, and interpretation of big omics data remain to be resolved. Here, we review several issues raised in cancer omics data analysis, including NGS, focusing particularly on study design and analysis strategies. This review might be helpful for understanding the current trends and strategies of rapidly evolving cancer genomics research.

Estimation of high-dimensional sparse cross correlation matrix

  • Yin, Cao;Kwangok, Seo;Soohyun, Ahn;Johan, Lim
    • Communications for Statistical Applications and Methods
    • /
    • v.29 no.6
    • /
    • pp.655-664
    • /
    • 2022
  • Motivated by an integrative study of multi-omics data, we are interested in estimating the structure of the sparse cross correlation matrix of two high-dimensional random vectors. We rewrite the problem as a multiple testing problem and propose a new method to estimate the sparse structure of the cross correlation matrix. To do so, we test the correlation coefficients simultaneously and threshold them by controlling the false discovery rate (FDR) at a predetermined level α. Further, we apply the proposed method and an alternative adaptive thresholding procedure by Cai and Liu (2016) to the integrative analysis of the protein expression data (X) and the mRNA expression data (Y) in the TCGA breast cancer cohort. By varying the FDR level α, we show that the new procedure is consistently more efficient in estimating the sparse structure of the cross correlation matrix than the alternative one.
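
The abstract does not give the details of the proposed test, so the sketch below only illustrates the general idea of thresholding a cross correlation matrix by multiple testing at FDR level α. It substitutes two textbook ingredients that are assumptions here, not the paper's procedure: a Fisher z-transform p-value for each correlation and the Benjamini-Hochberg step-up rule. All function names are hypothetical, and the inputs are lists of columns (one list of n observations per variable).

```python
import math


def pearson(u, v):
    """Sample Pearson correlation of two equal-length, non-constant sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def fisher_p_value(r, n):
    """Two-sided p-value for H0: rho = 0 (Fisher z, normal approximation, n > 3)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def bh_reject(p_values, alpha):
    """Benjamini-Hochberg step-up rule: indices of rejected hypotheses."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            k = rank  # largest rank whose p-value passes its threshold
    return set(order[:k])


def sparse_cross_correlation(X_cols, Y_cols, alpha=0.05):
    """Keep only the cross correlations declared nonzero at FDR level alpha."""
    n = len(X_cols[0])
    pairs = [(i, j) for i in range(len(X_cols)) for j in range(len(Y_cols))]
    r = [pearson(X_cols[i], Y_cols[j]) for i, j in pairs]
    p = [fisher_p_value(rv, n) for rv in r]
    est = [[0.0] * len(Y_cols) for _ in X_cols]
    for idx in bh_reject(p, alpha):
        i, j = pairs[idx]
        est[i][j] = r[idx]
    return est
```

Entries that are not rejected are set to zero, so the nonzero pattern of the returned matrix is the estimated sparse structure. The Benjamini-Hochberg rule formally assumes independent or positively dependent tests, which correlated omics variables may not satisfy.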

Network Analysis in Systems Epidemiology

  • Park, JooYong;Choi, Jaesung;Choi, Ji-Yeob
    • Journal of Preventive Medicine and Public Health
    • /
    • v.54 no.4
    • /
    • pp.259-264
    • /
    • 2021
  • Traditional epidemiological studies have identified a number of risk factors for various diseases using regression-based methods that examine the association between an exposure and an outcome (i.e., one-to-one correspondences). One of the major limitations of this approach is the "black-box" aspect of the analysis, in the sense that this approach cannot fully explain complex relationships such as biological pathways. With high-throughput data in current epidemiology, comprehensive analyses are needed. The network approach can help to integrate multi-omics data, visualize their interactions or relationships, and make inferences in the context of biological mechanisms. This review aims to introduce network analysis for systems epidemiology, its procedures, and how to interpret its findings.

Mutation of the lbp-5 gene alters metabolic output in Caenorhabditis elegans

  • Xu, Mo;Choi, Eun-Young;Paik, Young-Ki
    • BMB Reports
    • /
    • v.47 no.1
    • /
    • pp.15-20
    • /
    • 2014
  • Intracellular lipid-binding proteins (LBPs) impact fatty acid homeostasis in various ways, including fatty acid transport into mitochondria. However, the physiological consequences caused by mutations in genes encoding LBPs remain largely uncharacterized. Here, we explore the metabolic consequences of lbp-5 gene deficiency in terms of energy homeostasis in Caenorhabditis elegans. In addition to increased fat storage, which has previously been reported, deletion of lbp-5 attenuated mitochondrial membrane potential and increased reactive oxygen species levels. Biochemical measurement coupled to proteomic analysis of the lbp-5(tm1618) mutant revealed highly increased rates of glycolysis in this mutant. These differential expression profile data support a novel metabolic adaptation of C. elegans, in which glycolysis is activated to compensate for the energy shortage due to the insufficient mitochondrial β-oxidation of fatty acids in lbp-5 mutant worms. This report marks the first demonstration of a unique metabolic adaptation that is a consequence of LBP-5 deficiency in C. elegans.