• Title/Abstract/Keywords: Omics data analysis

46 search results (processing time: 0.02 seconds)

XPERANTO-TOX: an Integrated Toxicogenomics Knowledgebase

  • Woo Jung-Hoon;Kim Hyeoun-Eui;Kong Gu;Kim Ju-Han
    • Genomics & Informatics / Vol. 4, No. 1 / pp. 40-44 / 2006
  • Toxicogenomics combines transcriptome, proteome, and metabolome profiling with conventional toxicology to investigate the interactions between biological molecules and toxicants or environmental stress in disease causation. Toxicogenomics faces the problem of comparing and integrating data from different sources. Because of the unusual characteristics of toxicogenomic data, researchers need support for data analysis and annotation to obtain meaningful information. Existing repositories claim to serve as toxicogenomics databases, but they offer only limited capabilities for toxicogenomic research. To support toxicologists facing a flood of toxicogenomic data, we propose a novel toxicogenomics knowledgebase system, XPERANTO-TOX. XPERANTO-TOX is an integrated system for toxicogenomic data management and analysis. It is composed of three distinct but closely connected parts. First, the Data Storage System is a repository for many kinds of '-omics' data and conventional toxicology data. Second, the Data Analysis System consists of analytical modules for integrated toxicogenomics data. Finally, the Data Annotation System gives researchers extensive insight into the data.

Review of statistical methods for survival analysis using genomic data

  • Lee, Seungyeoun;Lim, Heeju
    • Genomics & Informatics / Vol. 17, No. 4 / pp. 41.1-41.12 / 2019
  • Survival analysis mainly deals with the time to an event, such as death, onset of disease, or bankruptcy. The common characteristic of survival data is that they contain "censored" observations, in which the time to event cannot be completely observed and the recorded time instead represents a lower bound on the time to event. For each subject, only the time to event or the censoring time is observed. Many traditional statistical methods have been used effectively for analyzing survival data with censored observations. However, with the development of high-throughput technologies for producing "omics" data, more advanced statistical methods, such as regularization, are required to construct predictive survival models from high-dimensional genomic data. Furthermore, machine learning approaches have been adapted for survival analysis to fit nonlinear and complex interaction effects between predictors and to achieve more accurate prediction of individual survival probability. Since most clinicians and medical researchers can now easily access statistical programs for analyzing survival data, a review article is helpful for understanding the statistical methods used in survival analysis. We review traditional survival methods and regularization methods with various penalty functions for the analysis of high-dimensional genomic data, and describe machine learning techniques that have been adapted to survival analysis.
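
As a rough illustration of how censored observations enter a traditional survival method, the following is a minimal sketch of the Kaplan-Meier estimator in plain Python. The function name `kaplan_meier` and the toy follow-up data are assumptions made for this example; they do not come from the reviewed article, which also covers regularized and machine learning methods not shown here.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event occurred.

    times  : observed follow-up time for each subject
    events : 1 if the event was observed, 0 if the time is censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # subjects leaving the risk set at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # Conditional probability of surviving past t, given at risk at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after time t.
        at_risk -= leaving[t]
    return curve


# Toy data (hypothetical): five subjects, two of them censored.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.3f}")
```

Censored subjects never reduce the survival estimate directly; they only shrink the risk set for later event times, which is how the lower-bound information is used.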

Plant Biotechnology and Bioinformatics

  • 김정은;백효정;김영철;허철구
    • Journal of Plant Biotechnology / Vol. 33, No. 3 / pp. 209-222 / 2006
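  • Whole-genome sequencing of Arabidopsis and rice has been completed, and large amounts of EST data have become available for many plants. In addition, vast amounts of diverse biological data have been generated by '-omics' technologies such as transcriptomics, proteomics, and metabolomics. Bioinformatics plays an essential role in extracting useful information from these large volumes of biological data. In this review, we introduce experimental methods that generate large-scale data, their applications to plant research areas such as plant disease resistance and molecular breeding, and the bioinformatics techniques and Internet information sites that are useful for research and development in plant biotechnology. We expect that new experimental methods and bioinformatic analysis techniques will contribute significantly to the advancement of plant biotechnology, and we believe that bioinformatics will be a decisive factor in plant biotechnology research and development.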

QCanvas: An Advanced Tool for Data Clustering and Visualization of Genomics Data

  • Kim, Nayoung;Park, Herin;He, Ningning;Lee, Hyeon Young;Yoon, Sukjoon
    • Genomics & Informatics / Vol. 10, No. 4 / pp. 263-265 / 2012
  • We developed a user-friendly, interactive program to simultaneously cluster and visualize omics data, such as DNA and protein array profiles. This program provides diverse algorithms for the hierarchical clustering of two-dimensional data. The clustering results can be interactively visualized and optimized on a heatmap. The present tool does not require any prior knowledge of scripting languages to carry out the data clustering and visualization. Furthermore, the heatmaps allow the selective display of data points satisfying user-defined criteria. For example, a clustered heatmap of experimental values can be differentially visualized based on statistical values, such as p-values. Including diverse menu-based display options, QCanvas provides a convenient graphical user interface for pattern analysis and visualization with high-quality graphics.
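
To make the idea of hierarchical clustering of two-dimensional data concrete, here is a minimal sketch of average-linkage agglomerative clustering in plain Python. It is not QCanvas code and makes no claim about which algorithms QCanvas implements; the function names and the toy expression profiles are assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(rows):
    """Agglomerative clustering with average linkage.

    Returns the merge history as a list of (members_a, members_b, distance),
    where members are row indices. The merge order is what a dendrogram
    next to a heatmap would display.
    """
    clusters = [[i] for i in range(len(rows))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Mean pairwise distance between members of the two clusters.
                total = sum(
                    euclidean(rows[p], rows[q])
                    for p in clusters[i]
                    for q in clusters[j]
                )
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Toy profiles (hypothetical): rows 0-1 are low, rows 2-3 are high.
profiles = [
    [1.0, 1.1, 0.9],
    [1.2, 1.0, 1.0],
    [5.0, 5.2, 4.9],
    [5.3, 5.0, 5.0],
]
merges = average_linkage(profiles)
for a, b, d in merges:
    print(a, b, round(d, 3))
```

This brute-force version recomputes distances at every step, which is fine for a handful of rows but not for array-scale data, where a dedicated clustering library would be the practical choice.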

Bioinformatics services for analyzing massive genomic datasets

  • Ko, Gunhwan;Kim, Pan-Gyu;Cho, Youngbum;Jeong, Seongmun;Kim, Jae-Yoon;Kim, Kyoung Hyoun;Lee, Ho-Yeon;Han, Jiyeon;Yu, Namhee;Ham, Seokjin;Jang, Insoon;Kang, Byunghee;Shin, Sunguk;Kim, Lian;Lee, Seung-Won;Nam, Dougu;Kim, Jihyun F.;Kim, Namshin;Kim, Seon-Young;Lee, Sanghyuk;Roh, Tae-Young;Lee, Byungwook
    • Genomics & Informatics / Vol. 18, No. 1 / pp. 8.1-8.10 / 2020
  • The explosive growth of next-generation sequencing data has resulted in ultra-large-scale datasets and ensuing computational problems. In Korea, the amount of genomic data has been increasing rapidly in recent years. Leveraging these big data requires researchers to use large-scale computational resources and analysis pipelines. A promising solution to this computational challenge is cloud computing, where CPUs, memory, storage, and programs are accessible in the form of virtual machines. Here, we present a cloud computing-based system, Bio-Express, that provides user-friendly, cost-effective analysis of massive genomic datasets. Bio-Express is loaded with predefined multi-omics data analysis pipelines, which are divided into genome, transcriptome, epigenome, and metagenome pipelines. Users can employ predefined pipelines or create a new pipeline for analyzing their own omics data. We also developed several web-based services for facilitating downstream analysis of genome data. The Bio-Express web service is freely available at https://www.bioexpress.re.kr/.

Global Transcriptome-Wide Association Studies (TWAS) Reveal a Gene Regulation Network of Eating and Cooking Quality Traits in Rice

  • Weiguo Zhao;Qiang He;Kyu-Won Kim;Feifei Xu;Thant Zin Maung;Aueangporn Somsri;Min-Young Yoon;Sang-Beom Lee;Seung-Hyun Kim;Joohyun Lee;Soon-Wook Kwon;Gang-Seob Lee;Bhagwat Nawade;Sang-Ho Chu;Wondo Lee;Yoo-Hyun Cho;Chang-Yong Lee;Ill-Min Chung;Jong-Seong Jeon;Yong-Jin Park
    • Korean Society of Crop Science: Conference Proceedings / Korean Society of Crop Science 2022 Fall Conference / pp. 207-207 / 2022
  • Eating and cooking quality (ECQ) is one of the most complex quantitative traits in rice. Understanding of the genetic regulation of transcript expression levels that contributes to phenotypic variation in ECQ traits is limited. We integrated whole-genome resequencing, transcriptome, and phenotypic variation data from 84 Japonica accessions to build a regulatory network based on a transcriptome-wide association study (TWAS). All ECQ traits showed large phenotypic variation and significant phenotypic correlations with one another. TWAS identified a total of 285 transcripts significantly associated with six ECQ traits. Genome-wide mapping of ECQ-associated transcripts revealed 66,905 expression quantitative trait loci (eQTLs), including 21,747 local eQTLs and 45,158 trans-eQTLs, regulating the expression of 43 genes. The starch synthesis-related genes (SSRGs) starch synthase IV-1 (SSIV-1), starch branching enzyme 1 (SBE1), granule-bound starch synthase 2 (GBSS2), and ADP-glucose pyrophosphorylase small subunit 2a (OsAGPS2a) were found to have eQTLs regulating the expression of ECQ-associated transcripts. Further, in co-expression analysis, 130 genes produced at least one network with 22 master regulators. In addition, to validate the co-expression analysis, we developed CRISPR/Cas9-edited glbl mutant lines that confirmed the role of alpha-globulin (glbl) in starch synthesis. This study provides novel insights into the genetic regulation of ECQ traits and identifies transcripts associated with these traits that could be used in further rice breeding.

Applying a modified AUC to gene ranking

  • Yu, Wenbao;Chang, Yuan-Chin Ivan;Park, Eunsik
    • Communications for Statistical Applications and Methods / Vol. 25, No. 3 / pp. 307-319 / 2018
  • High-throughput technologies enable the simultaneous evaluation of thousands of genes that could discriminate between different subclasses of complex diseases. Ranking genes according to differential expression is an important screening step for follow-up analysis, and many statistical measures have been proposed for this purpose. A good ranked list should provide a stable rank (at least for the top-ranked genes), and the top-ranked genes should have high power in differentiating between disease statuses. However, the literature places little emphasis on ranking genes by these two criteria simultaneously. To satisfy both criteria at once, we propose applying a previously reported metric, the modified area under the receiver operating characteristic curve, to gene ranking. The proposed ranking method is found to be promising, leading to a stable ranking list and good prediction performance of the top-ranked genes. The findings are illustrated through studies on both synthesized data and real microarray gene expression data. The proposed method is recommended for ranking genes or other biomarkers in high-dimensional omics studies.
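
For orientation, the sketch below ranks genes by the standard AUC, computed as the fraction of case-control pairs in which the case value is larger. It is deliberately not the modified AUC proposed in the paper, whose definition is not given in the abstract; the function names, gene names, and toy values are assumptions made for this example.

```python
def auc(case_values, control_values):
    """Standard AUC: P(case > control), counting ties as one half."""
    wins = 0.0
    for x in case_values:
        for y in control_values:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


def rank_genes(expression, labels):
    """Rank genes by discriminative power, ignoring direction of change.

    expression : dict mapping gene name -> list of values, one per sample
    labels     : list of 1 (case) or 0 (control), one per sample
    """
    scores = {}
    for gene, values in expression.items():
        cases = [v for v, lab in zip(values, labels) if lab == 1]
        controls = [v for v, lab in zip(values, labels) if lab == 0]
        a = auc(cases, controls)
        # A gene that is lower in cases (AUC near 0) is as useful as one
        # that is higher in cases (AUC near 1).
        scores[gene] = max(a, 1.0 - a)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (hypothetical): three cases followed by three controls.
labels = [1, 1, 1, 0, 0, 0]
expression = {
    "geneA": [5.1, 4.8, 5.5, 2.0, 2.3, 1.9],  # higher in cases
    "geneB": [3.0, 2.9, 3.2, 3.1, 3.0, 2.8],  # barely informative
    "geneC": [1.0, 1.2, 0.9, 4.0, 3.8, 4.1],  # lower in cases
}
ranked = rank_genes(expression, labels)
for gene, score in ranked:
    print(gene, round(score, 3))
```

With only a few samples, many genes tie at the same AUC, which is one reason the stability of the ranked list matters for this kind of screening.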

Perspectives on Clinical Informatics: Integrating Large-Scale Clinical, Genomic, and Health Information for Clinical Care

  • Choi, In Young;Kim, Tae-Min;Kim, Myung Shin;Mun, Seong K.;Chung, Yeun-Jun
    • Genomics & Informatics / Vol. 11, No. 4 / pp. 186-190 / 2013
  • The advances in electronic medical records (EMRs) and bioinformatics (BI) represent two significant trends in healthcare. The widespread adoption of EMR systems and the completion of the Human Genome Project developed the technologies for data acquisition, analysis, and visualization in two different domains. The massive amount of data from both clinical and biology domains is expected to provide personalized, preventive, and predictive healthcare services in the near future. The integrated use of EMR and BI data needs to consider four key informatics areas: data modeling, analytics, standardization, and privacy. Bioclinical data warehouses integrating heterogeneous patient-related clinical or omics data should be considered. The representative standardization effort by the Clinical Bioinformatics Ontology (CBO) aims to provide uniquely identified concepts to include molecular pathology terminologies. Since individual genome data are easily used to predict current and future health status, different safeguards to ensure confidentiality should be considered. In this paper, we focused on the informatics aspects of integrating the EMR community and BI community by identifying opportunities, challenges, and approaches to provide the best possible care service for our patients and the population.

Applications of Metabolic Modeling to Drive Bioprocess Development for the Production of Value-added Chemicals

  • Mahadevan, Radhakrishnan;Burgard, Anthony P.;Famili, Iman;Dien, Steve Van;Schilling, Christophe H.
    • Biotechnology and Bioprocess Engineering:BBE / Vol. 10, No. 5 / pp. 408-417 / 2005
  • Increasing numbers of value-added chemicals are being produced using microbial fermentation strategies. Computational modeling and simulation of microbial metabolism is rapidly becoming an enabling technology that is driving a new paradigm to accelerate the bioprocess development cycle. In particular, constraint-based modeling and the development of genome-scale models of industrial microbes are finding increasing utility across many phases of the bioprocess development workflow. Herein, we review and discuss the requirements and trends in the industrial application of this technology as we build toward integrated computational/experimental platforms for bioprocess engineering. Specifically, we cover the following topics: (1) genome-scale models as genetically and biochemically consistent representations of metabolic networks; (2) the ability of these models to predict, assess, and interpret metabolic physiology and flux states of metabolism; (3) the model-guided integrative analysis of high-throughput 'omics' data; (4) the reconciliation and analysis of on- and off-line fermentation data as well as flux tracing data; (5) model-aided strain design strategies and the integration of calculated biotransformation routes; and (6) control and optimization of the fermentation processes. Collectively, constraint-based modeling strategies are impacting the iterative characterization of metabolic flux states throughout the bioprocess development cycle, while also driving metabolic engineering strategies and fermentation optimization.

NEUROD1 Intrinsically Initiates Differentiation of Induced Pluripotent Stem Cells into Neural Progenitor Cells

  • Choi, Won-Young;Hwang, Ji-Hyun;Cho, Ann-Na;Lee, Andrew J.;Jung, Inkyung;Cho, Seung-Woo;Kim, Lark Kyun;Kim, Young-Joon
    • Molecules and Cells / Vol. 43, No. 12 / pp. 1011-1022 / 2020
  • Cell type specification is a delicate biological event in which every step is under tight regulation. From a molecular point of view, cell fate commitment begins with chromatin alteration, which kickstarts lineage-determining factors to initiate a series of genes required for cell specification. Several important neuronal differentiation factors have been identified from ectopic over-expression studies. However, there is scarce information on which DNA regions are modified during induced pluripotent stem cell (iPSC) to neuronal progenitor cell (NPC) differentiation, the cis regulatory factors that attach to these accessible regions, or the genes that are initially expressed. In this study, we identified the DNA accessible regions of iPSCs and NPCs via the Assay for Transposase-Accessible Chromatin sequencing (ATAC-seq). We identified which chromatin regions were modified after neuronal differentiation and found that the enhancer regions had more active histone modification changes than the promoters. Through motif enrichment analysis, we found that NEUROD1 controls iPSC differentiation to NPC by binding to the accessible regions of enhancers in cooperation with other factors such as the Hox proteins. Finally, by using Hi-C data, we categorized the genes that directly interacted with the enhancers under the control of NEUROD1 during iPSC to NPC differentiation.