Identifying Statistically Significant Gene-Sets by Gene Set Enrichment Analysis Using Fisher Criterion  

Kim, Jae-Young (Graduate School of Information and Communication Engineering, Kyungpook National University)
Shin, Mi-Young (School of Electrical Engineering and Computer Science, Kyungpook National University)
Abstract
Gene set enrichment analysis (GSEA) is a computational method for identifying gene sets that differ significantly between two groups of microarray expression profiles and for uncovering their biological meaning by drawing on gene annotation databases such as Cytogenetic Band, KEGG pathways, and Gene Ontology. In GSEA, all genes in a given dataset are first ranked by the signal-to-noise ratio between the two groups, and the subsequent analyses are carried out on this ranking. Despite its impressive results in several previous studies, ranking genes by the signal-to-noise ratio makes it difficult to treat highly up-regulated and highly down-regulated genes simultaneously as candidate significant genes, even though such a mixture may reflect situations that arise in metabolic and signaling pathways. To address this problem, this article investigates GSEA with the Fisher criterion for gene ranking and evaluates its effect in leukemia-related pathway analyses.
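The contrast between the two ranking scores can be illustrated with a minimal sketch. The abstract does not state the formulas, so the definitions below are assumptions based on the commonly cited forms: the signal-to-noise ratio as (mean difference) / (sum of standard deviations), and the Fisher criterion as (squared mean difference) / (sum of variances). The gene names and expression values are hypothetical toy data, not results from the article.

```python
from statistics import mean, stdev

def signal_to_noise(a, b):
    """Signed score (assumed form): (mean_a - mean_b) / (sd_a + sd_b)."""
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b))

def fisher_criterion(a, b):
    """Unsigned score (assumed form): (mean_a - mean_b)^2 / (var_a + var_b)."""
    return (mean(a) - mean(b)) ** 2 / (stdev(a) ** 2 + stdev(b) ** 2)

# Hypothetical toy expression values: gene -> (group 1 samples, group 2 samples).
genes = {
    "up":   ([9.0, 10.0, 11.0], [1.0, 2.0, 3.0]),   # strongly up-regulated in group 1
    "flat": ([5.0, 6.0, 7.0], [5.5, 6.0, 6.5]),     # no difference in means
    "down": ([1.0, 2.0, 3.0], [9.0, 10.0, 11.0]),   # strongly down-regulated in group 1
}

# Rank genes from highest to lowest score under each criterion.
snr_rank = sorted(genes, key=lambda g: signal_to_noise(*genes[g]), reverse=True)
fisher_rank = sorted(genes, key=lambda g: fisher_criterion(*genes[g]), reverse=True)

print("signal-to-noise ranking:", snr_rank)
print("Fisher criterion ranking:", fisher_rank)
```

Under the signed score, the up-regulated and down-regulated genes land at opposite ends of the list with the unchanged gene between them. Under the squared score, both strongly changed genes rank ahead of the unchanged one, which is the behavior the abstract motivates.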
Keywords
Significant gene-sets; Gene Set Enrichment Analysis; Gene ranking; Fisher Criterion;