• Title/Abstract/Keywords: gene set analysis

291 search results (processing time: 0.028 s)

Gene Set and Pathway Analysis of Microarray Data

  • 김선영
    • 유전체소식지
    • /
    • Vol. 6, No. 1
    • /
    • pp.29-33
    • /
    • 2006
  • Gene set analysis is a new concept and method for analyzing and interpreting microarray gene expression data. It tries to extract biological meaning from gene expression data at the gene set level rather than at the gene level. Compared with methods that select a few tens or hundreds of genes before gene ontology and pathway analysis, gene set analysis identifies important gene ontology terms and pathways more consistently and performs well even on gene expression data sets with minimal or moderate expression changes. Moreover, gene set analysis is useful for comparing multiple gene expression data sets that address similar biological questions. This review briefly summarizes the rationale behind gene set analysis and introduces several algorithms and tools now available for it.

Discovery of Cellular RhoA Functions by the Integrated Application of Gene Set Enrichment Analysis

  • Chun, Kwang-Hoon
    • Biomolecules & Therapeutics
    • /
    • Vol. 30, No. 1
    • /
    • pp.98-116
    • /
    • 2022
  • The small GTPase RhoA has been studied extensively for its role in actin dynamics. In this study, multiple bioinformatics tools were applied in combination to the microarray dataset GSE64714 to explore previously unidentified functions of RhoA. Comparative gene expression analysis revealed 545 differentially expressed genes in RhoA-null cells versus controls. Gene set enrichment analysis (GSEA) was conducted with three gene set collections: (1) the hallmark, (2) the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway, and (3) the Gene Ontology Biological Process. GSEA results showed that RhoA is strongly related to diverse pathways: cell cycle/growth, DNA repair, metabolism, keratinization, response to fungus, and vesicular transport. These functions were verified by heatmap analysis, KEGG pathway diagramming, and directed acyclic graphing. Using multiple gene set collections limited the loss of extracted information. However, gene sets from individual collections are heterogeneous in gene composition, number, and the contextual meaning conveyed by their names. Indeed, there was a limit to deriving functions with high accuracy and reliability from gene set names alone. The comparison of multiple gene set collections showed that even gene sets with similar names differed greatly in their member genes. Thus, the type of collection chosen and the analytical context influence the interpretation of GSEA results. Nonetheless, analyzing multiple collections made it possible to derive robust and consistent function identifications. This study confirmed several well-described roles of RhoA and revealed less explored functions, suggesting future research directions.

GSnet: An Integrated Tool for Gene Set Analysis and Visualization

  • Choi, Yoon-Jeong;Woo, Hyun-Goo;Yu, Ung-Sik
    • Genomics & Informatics
    • /
    • Vol. 5, No. 3
    • /
    • pp.133-136
    • /
    • 2007
  • The Gene Set network viewer (GSnet) visualizes the functional enrichment of a given gene set with a protein interaction network and is implemented as a plug-in for the Cytoscape platform. The functional enrichment of a given gene set is calculated using a hypergeometric test based on the Gene Ontology annotation. The protein interaction network is estimated using public data. Set operations allow a complex protein interaction network to be decomposed into a functionally-enriched module of interest. GSnet provides a new framework for gene set analysis by integrating a priori knowledge of a biological network with functional enrichment analysis.
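
The hypergeometric enrichment test mentioned in this abstract can be sketched as follows. This is a minimal illustration and not GSnet's own code; the function name and the toy numbers are hypothetical, and the one-sided (over-representation) tail is assumed.

```python
from math import comb


def hypergeom_enrichment_pvalue(population: int, annotated: int,
                                sample: int, overlap: int) -> float:
    """One-sided p-value P(X >= overlap) for over-representation.

    population: number of genes in the background
    annotated:  background genes carrying the GO term
    sample:     size of the gene set being tested
    overlap:    genes in the set that carry the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the hypergeometric probabilities over the upper tail.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers (assumed): 10 background genes, 3 annotated, a set of 3 genes.
p_all_three = hypergeom_enrichment_pvalue(10, 3, 3, 3)
```

A small p-value suggests that the GO term is over-represented in the gene set relative to the background. In practice, p-values for many terms would also need multiple-testing correction.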

Identifying Statistically Significant Gene-Sets by Gene Set Enrichment Analysis Using Fisher Criterion

  • 김재영;신미영
    • 전자공학회논문지CI
    • /
    • Vol. 45, No. 4
    • /
    • pp.19-26
    • /
    • 2008
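  • Gene set enrichment analysis (GSEA) is a method for analyzing two-class microarray data. From a variety of gene sets constructed on the basis of biological characteristics, it extracts the gene sets whose expression values differ with statistical significance between the two classes. Specifically, it uses gene annotation databases that carry diverse biological information about genes (cytogenetic band, KEGG pathway, Gene Ontology, etc.) to group the genes with a particular function, among all genes used in the microarray experiment, into gene sets. It then determines the significant genes within each gene set from the difference in expression values between the two classes, and on that basis detects the statistically significant gene sets. In this paper, instead of the signal-to-noise ratio-based gene ranking that is currently the usual choice in GSEA, we apply gene ranking based on the Fisher criterion and propose a method for extracting new, biologically meaningful significant gene sets that the existing GSEA method does not extract. To examine the performance of the proposed method, we applied it to publicly available leukemia microarray data and sought to verify its usefulness by comparing the results with previously known findings.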
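
The two gene ranking metrics contrasted in this abstract can be sketched as follows, using their commonly cited textbook forms: signal-to-noise ratio (mean difference over the sum of standard deviations) and Fisher criterion (squared mean difference over the sum of variances). The exact definitions used in the paper are not given in the abstract, so these formulas, the function names, and the toy data are assumptions.

```python
from statistics import mean, stdev


def signal_to_noise(a, b):
    """Signed SNR: (mean_a - mean_b) / (sd_a + sd_b)."""
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b))


def fisher_criterion(a, b):
    """Unsigned Fisher criterion: (mean_a - mean_b)^2 / (var_a + var_b)."""
    return (mean(a) - mean(b)) ** 2 / (stdev(a) ** 2 + stdev(b) ** 2)


def rank_genes(expr, metric):
    """Return gene names sorted by descending metric value.

    expr maps gene -> (class A expression values, class B expression values).
    """
    scores = {gene: metric(a, b) for gene, (a, b) in expr.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: g1 is up in class A, g2 is down, g3 is unchanged.
toy = {
    "g1": ([5, 6, 7], [1, 2, 3]),
    "g2": ([1, 2, 3], [5, 6, 7]),
    "g3": ([2, 3, 4], [2, 3, 4]),
}
snr_ranking = rank_genes(toy, signal_to_noise)
fc_ranking = rank_genes(toy, fisher_criterion)
```

Because the Fisher criterion is unsigned, strongly down-regulated genes such as g2 rank near the top together with up-regulated ones, whereas the signed SNR places them at the bottom of the list.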

Gene Set Analyses of Genome-Wide Association Studies on 49 Quantitative Traits Measured in a Single Genetic Epidemiology Dataset

  • Kim, Jihye;Kwon, Ji-Sun;Kim, Sangsoo
    • Genomics & Informatics
    • /
    • Vol. 11, No. 3
    • /
    • pp.135-141
    • /
    • 2013
  • Gene set analysis is a powerful tool for interpreting genome-wide association study results and is gaining popularity. Comparing the gene sets obtained for a variety of traits measured in a single genetic epidemiology dataset may give insights into the biological mechanisms underlying these traits. Based on the previously published single nucleotide polymorphism (SNP) genotype data on 8,842 individuals enrolled in the Korea Association Resource project, we performed a series of systematic genome-wide association analyses for 49 quantitative traits covering basic epidemiological, anthropometric, and blood chemistry parameters. Each analysis result was then subjected to gene set analysis based on Gene Ontology (GO) terms using the gene set analysis software GSA-SNP, identifying a set of GO terms significantly associated with each trait ($p_{corr}$ < 0.05). Pairwise comparison of the traits in terms of the semantic similarity of their GO sets revealed surprising cases where phenotypically uncorrelated traits showed high similarity in terms of biological pathways. For example, the pH level was related to 7 other traits that showed low phenotypic correlations with it. A literature survey implies that these traits may be regulated partly by common pathways that involve neuronal or nerve systems.

NGSEA: Network-Based Gene Set Enrichment Analysis for Interpreting Gene Expression Phenotypes with Functional Gene Sets

  • Han, Heonjong;Lee, Sangyoung;Lee, Insuk
    • Molecules and Cells
    • /
    • Vol. 42, No. 8
    • /
    • pp.579-588
    • /
    • 2019
  • Gene set enrichment analysis (GSEA) is a popular tool to identify underlying biological processes in clinical samples using their gene expression phenotypes. GSEA measures the enrichment of annotated gene sets that represent biological processes for differentially expressed genes (DEGs) in clinical samples. However, GSEA may be suboptimal for functional gene sets, because DEGs from the expression dataset may not be functional genes per se but dysregulated genes perturbed by bona fide functional genes. To overcome this shortcoming, we developed network-based GSEA (NGSEA), which measures the enrichment score of functional gene sets using the expression difference of not only individual genes but also their neighbors in the functional network. We found that NGSEA outperformed GSEA in identifying pathway gene sets for matched gene expression phenotypes. We also observed that NGSEA substantially improved the ability to retrieve known anti-cancer drugs from patient-derived gene expression data using drug-target gene sets compared with another method, Connectivity Map. We also repurposed FDA-approved drugs using NGSEA and experimentally validated budesonide as a chemical with anti-cancer effects for colorectal cancer. We therefore expect that NGSEA will facilitate both pathway interpretation of gene expression phenotypes and anti-cancer drug repositioning. NGSEA is freely available at www.inetbio.org/ngsea.
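
The idea of scoring a gene by its own expression difference together with that of its network neighbors can be illustrated with a small sketch. The abstract does not give NGSEA's actual formula, so the combination rule below (absolute score plus the mean absolute score of the neighbors) is an assumption, and all names and data are hypothetical.

```python
def network_smoothed_scores(scores, neighbors):
    """Combine each gene's score with its network neighbors' scores.

    scores:    gene -> expression difference (e.g. a log ratio)
    neighbors: gene -> iterable of neighboring genes in the network
    Assumed rule: |own score| + mean(|neighbor scores|).
    """
    smoothed = {}
    for gene, own in scores.items():
        # Ignore neighbors that have no measured score.
        nbr = [abs(scores[n]) for n in neighbors.get(gene, ()) if n in scores]
        nbr_mean = sum(nbr) / len(nbr) if nbr else 0.0
        smoothed[gene] = abs(own) + nbr_mean
    return smoothed


# Hypothetical example: gene A barely changes, but its neighbors B and C do.
toy_scores = {"A": 0.1, "B": 2.0, "C": -3.0}
toy_network = {"A": ["B", "C"]}
smoothed = network_smoothed_scores(toy_scores, toy_network)
```

Under this rule, gene A receives a high score despite its small own change, which matches the stated motivation that a functional gene may not itself be differentially expressed while the genes around it are.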

A Comparative Study of Parametric Methods for Significant Gene Set Identification Depending on Various Expression Metrics

  • 김재영;신미영
    • 한국정보과학회논문지:소프트웨어및응용
    • /
    • Vol. 37, No. 1
    • /
    • pp.1-8
    • /
    • 2010
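  • Gene set analysis, which uses microarray data to detect biological functional groups that show significant expression differences between two sample groups, has recently attracted much attention. Unlike conventional studies that detect individual significant genes, gene set analysis has the advantage of detecting significant gene sets together with their functional characteristics. For this reason, various statistical gene set analysis methods such as PAGE and GSEA have recently been introduced. PAGE in particular takes a parametric approach, assuming that the scores representing the gene expression difference between the two sample groups follow a normal distribution. Compared with nonparametric methods such as GSEA, this approach requires less computation and performs relatively well. However, AD (average difference), the metric PAGE uses to quantify gene expression differences, considers only the absolute difference in mean expression between the two groups, so it cannot reflect a gene's relative importance as determined by the magnitude of its expression values or the size of its variance. To address this, in this paper we apply WAD (weighted average difference), FC (Fisher's criterion), and Abs_SNR (absolute value of signal-to-noise ratio), which incorporate the actual expression magnitude or the within-group sample variance into the score, to parametric gene set analysis, and compare the resulting significant gene set detection results experimentally.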
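
The parametric, normal-distribution-based scoring described in this abstract can be sketched as a z-score for one gene set. This is a minimal illustration of the general PAGE-style statistic as it is commonly described, Z = (mean of set scores - overall mean) * sqrt(set size) / overall standard deviation. The use of the population standard deviation, the two-sided p-value, and all names and data are assumptions.

```python
from math import erfc, sqrt
from statistics import mean, pstdev


def page_z_score(all_scores, set_genes):
    """Return (z, two-sided p) for one gene set under a normal model.

    all_scores: gene -> per-gene score (e.g. AD, WAD, FC, or Abs_SNR)
    set_genes:  iterable of gene names belonging to the set
    """
    values = list(all_scores.values())
    mu, sigma = mean(values), pstdev(values)
    # Only genes that were actually measured contribute to the set mean.
    members = [all_scores[g] for g in set_genes if g in all_scores]
    m = len(members)
    z = (mean(members) - mu) * sqrt(m) / sigma
    # Two-sided tail probability of the standard normal distribution.
    p = erfc(abs(z) / sqrt(2))
    return z, p


# Hypothetical toy scores for four genes; the set contains genes a and b.
toy_scores = {"a": 2.0, "b": 2.0, "c": -2.0, "d": -2.0}
z, p = page_z_score(toy_scores, ["a", "b"])
```

Swapping the per-gene score (AD, WAD, FC, or Abs_SNR) changes only the input dictionary, which is why such metrics can be compared within the same parametric framework.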

Developing a Parametric Method for Testing the Significance of Gene Sets in Microarray Data Analysis

  • 이선호;이승규;이광현
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 16, No. 3
    • /
    • pp.397-408
    • /
    • 2009
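  • Microarray technology has made it possible to observe the expression patterns of tens of thousands of genes simultaneously. Genes found to be differentially expressed by testing them one at a time have provided basic information for disease diagnosis, the establishment of treatments, and new drug development. However, as various problems with individual-gene analysis were discovered, methods were proposed that group genes sharing a biological pathway or chromosomal location and analyze these groups to find the ones that affect disease onset or survival. Research on the significance of such gene sets originated at MIT in 2002, and the methods in use include GSEA, SAM-GS, and PAGE, a parametric method based on the central limit theorem. In this paper, we propose a new parametric method that overcomes the structural limitations of these statistics and is simple to compute, and we demonstrate its efficiency through data analysis.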

Analysis of Gene Expression in Human Dermal Fibroblasts Treated with Senescence-Modulating COX Inhibitors

  • Han, Jeong A.;Kim, Jong-Il
    • Genomics & Informatics
    • /
    • Vol. 15, No. 2
    • /
    • pp.56-64
    • /
    • 2017
  • We have previously reported that NS-398, a cyclooxygenase-2 (COX-2)-selective inhibitor, inhibited replicative cellular senescence in human dermal fibroblasts and skin aging in hairless mice. In contrast, celecoxib, another COX-2-selective inhibitor, and aspirin, a non-selective COX inhibitor, accelerated senescence and aging. To identify causal factors for the senescence-modulating effect of the inhibitors, we performed a cDNA microarray experiment followed by Gene Set Enrichment Analysis. The data showed that several senescence-related gene sets were regulated by the inhibitor treatment. NS-398 up-regulated gene sets involved in the tumor necrosis factor ${\beta}$ receptor pathway and in fructose and mannose metabolism, whereas it down-regulated a gene set involved in protein secretion. Celecoxib up-regulated gene sets involved in the G2M checkpoint and E2F targets. Aspirin up-regulated the gene set involved in protein secretion and down-regulated gene sets involved in RNA transcription. These results suggest that COX inhibitors modulate cellular senescence by different mechanisms and provide useful information for understanding the senescence-modulating mechanisms of COX inhibitors.

Meta- and Gene Set Analysis of Stomach Cancer Gene Expression Data

  • Kim, Seon-Young;Kim, Jeong-Hwan;Lee, Heun-Sik;Noh, Seung-Moo;Song, Kyu-Sang;Cho, June-Sik;Jeong, Hyun-Yong;Kim, Woo Ho;Yeom, Young-Il;Kim, Nam-Soon;Kim, Sangsoo;Yoo, Hyang-Sook;Kim, Yong Sung
    • Molecules and Cells
    • /
    • Vol. 24, No. 2
    • /
    • pp.200-209
    • /
    • 2007
  • We generated gene expression data from the tissues of 50 gastric cancer patients, and applied meta-analysis and gene set analysis to these data and three other stomach cancer gene expression data sets to define the gene expression changes in gastric tumors. By meta-analysis we identified genes consistently changed in gastric carcinomas, while gene set analysis revealed consistently changed biological themes. Genes and gene sets involved in digestion, fatty acid metabolism, and ion transport were consistently down-regulated in gastric carcinomas, while those involved in cellular proliferation, cell cycle, and DNA replication were consistently up-regulated. We also found significant differences between the genes and gene sets expressed in diffuse and intestinal type gastric carcinoma. By gene set analysis of cytogenetic bands, we identified many chromosomal regions with possible gross chromosomal changes (amplifications or deletions). A similar analysis of transcription factor binding sites (TFBSs) revealed transcription factors that may have caused the observed gene expression changes in gastric carcinomas, and we confirmed the overexpression of one of these, E2F1, in many gastric carcinomas by tissue array and immunohistochemistry. We have incorporated the results of our meta- and gene set analyses into a web-accessible database (http://human-genome.kribb.re.kr/stomach/).