http://dx.doi.org/10.5351/KJAS.2009.22.4.781

Effect of Genetic Correlations on the P Values from Randomization Test and Detection of Significant Gene Groups  

Yi, Mi-Sung (Department of Biostatistics, Medical College, The Catholic University of Korea)
Song, Hae-Hiang (Department of Biostatistics, Medical College, The Catholic University of Korea)
Publication Information
The Korean Journal of Applied Statistics / v.22, no.4, 2009, pp. 781-792
Abstract
In the early stages of a genomic investigation, gene expression experiments often use a small sample of microarrays to identify small subsets of candidate genes for more detailed follow-up. Unlike the methods used for large samples of microarrays, an appropriate method for identifying these subsets from a small sample is the randomization test, which provides exact P values. With a small sample of microarrays, these exact P values are discrete. Whether any differentially expressed genes exist in the full set of genes can be assessed by testing the null hypothesis that the P values follow a uniform distribution. Subsets of genes with small P values are of prime interest for follow-up, and these outlier cells in the multinomial distribution of P values can be identified with the M test of Fuchs and Kenett (1980). Importantly, genome-wide expression levels measured on microarrays are correlated, yet most statistical methods for microarray analysis assume that genes are independent and ignore this possible correlation. Using simulation studies, we investigated the effect of correlated gene expression levels on the results of the randomization test and the M test, and found that the effects are often not negligible.
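As an illustration of why exact P values from a small sample are discrete, the following is a minimal sketch of a two-sample randomization test for a single gene. It is not the authors' implementation: the test statistic (absolute difference of group means), the function name `exact_randomization_p`, and the example expression values are all assumptions made for this example.

```python
from itertools import combinations


def exact_randomization_p(group_a, group_b):
    """Exact two-sided randomization P value for one gene.

    Assumption: the statistic is the absolute difference of group means;
    the abstract does not say which statistic the authors used.
    """
    pooled = list(group_a) + list(group_b)
    n = len(pooled)
    n_a = len(group_a)
    total = sum(pooled)

    def statistic(indices):
        # Mean of the arrays assigned to group A versus the remaining arrays.
        sum_a = sum(pooled[i] for i in indices)
        return abs(sum_a / n_a - (total - sum_a) / (n - n_a))

    observed = statistic(range(n_a))
    extreme = 0
    arrangements = 0
    # Enumerate every way of relabelling n_a of the n arrays as group A.
    for indices in combinations(range(n), n_a):
        arrangements += 1
        # Small tolerance guards against floating-point ties.
        if statistic(indices) >= observed - 1e-12:
            extreme += 1
    return extreme / arrangements


# Hypothetical expression values for one gene on 3 + 3 arrays.
p = exact_randomization_p([1.0, 1.2, 0.9], [2.1, 2.4, 2.0])
print(p)
```

With 3 arrays per group there are only 20 possible relabellings, so the P value can only be a multiple of 1/20. Because the two groups here are completely separated, only the observed labelling and its mirror image are as extreme, which gives the smallest attainable two-sided value, 2/20 = 0.1.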
Keywords
Randomization test; exact P value; significant gene groups; outlier cells
References
1 Fuchs, C. and Kenett, R. (1980). A test for detecting outlying cells in the multinomial distribution and two-way contingency tables, Journal of the American Statistical Association, 75, 395-398
2 Gadbury, G. L., Page, G. P., Heo, M., Mountz, J. D. and Allison, D. B. (2003). Randomization tests for small samples: An application for genetic expression data, Journal of the Royal Statistical Society. Series C (Applied Statistics), 52, 365-376
3 Gibbons, J. D. and Pratt, J. W. (1975). P-values: Interpretation and methodology, The American Statistician, 29, 20-25
4 Hu, J. and Wright, F. A. (2007). Assessing differential gene expression with small sample sizes in oligonucleotide arrays using a mean-variance model, Biometrics, 63, 41-49
5 Lambert, D. (1985). Robust two-sample permutation tests, The Annals of Statistics, 13, 606-625
6 Murie, C. and Nadon, R. (2008). A correction for estimating error when using the Local Pooled Error Statistical Test, Bioinformatics, 24, 1735-1736
7 Parmigiani, G., Garrett, E. S., Anbazhagan, R. and Gabrielson, E. (2002). A statistical framework for expression-based molecular classification in cancer, Journal of the Royal Statistical Society. Series B, 64, 717-736
8 Sidak, Z. (1968). On multivariate normal probabilities on rectangles: Their dependence on correlations, The Annals of Mathematical Statistics, 39, 1425-1434
9 Welch, W. J. (1990). Construction of permutation tests, Journal of the American Statistical Association, 85, 693-698
10 Bohrer, R., Chow, W., Faith, R., Joshi, V. and Wu, C. F. (1981). Multiple three-decision rules for factorial simple effects: Bonferroni wins again!, Journal of the American Statistical Association, 76, 119-124
11 Dondrup, M., Huser, A. T., Mertens, D. and Goesmann, A. (2009). An evaluation framework for statistical tests on microarray data, Journal of Biotechnology, 140, 18-26
12 Fierro, A. C., Vandenbussche, F., Engelen, K., Van de Peer, Y. and Marchal, K. (2008). Meta analysis of gene expression data within and across species, Current Genomics, 9, 525-534
13 Fisher, R. A. (1935). The Design of Experiments, Oliver and Boyd, Edinburgh