http://dx.doi.org/10.5351/KJAS.2010.23.6.1067

Reproducibility and Sample Size in High-Dimensional Data  

Seo, Won-Seok (Department of Statistics, Korea University)
Choi, Jee-A (Department of Statistics, Korea University)
Jeong, Hyeong-Chul (Department of Applied Statistics, University of Suwon)
Cho, Hyung-Jun (Department of Statistics, Korea University)
Publication Information
The Korean Journal of Applied Statistics / v.23, no.6, 2010, pp. 1067-1080
Abstract
A number of methods have been developed to determine sample sizes in clinical trials, and most clinical trial organizations determine sample sizes using these methods. In contrast, methods for determining sufficient sample sizes for experiments using microarray chips are unsatisfactory and not widely used. In this paper, our objective is to provide a guideline for determining sample sizes by utilizing the reproducibility of real microarray data. In the reproducibility comparison, five methods for discovering differential expression are used: fold change, the two-sample t-test, the Wilcoxon rank-sum test, SAM, and LPE. To standardize gene expression values, both the MAS5 and RMA methods are considered. The concordance of the top 20 and top 100 genes is also compared according to the number of replicates. Our proposed approach allows more realistic information to be added to existing methods for determining sample sizes.
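The abstract does not spell out how the top-20 and top-100 concordance is computed, so the following is only a minimal sketch of one plausible reading: rank genes by a differential-expression score in two independent replicate sets and report the fraction of genes shared by the two top-k lists. The simulated data, the absolute mean-difference score standing in for fold change, and all function names are assumptions for illustration, not the authors' procedure.

```python
import random
from statistics import mean


def log_fold_change(group_a, group_b):
    """Absolute difference of mean log-expression per gene (one row per gene)."""
    return [abs(mean(a) - mean(b)) for a, b in zip(group_a, group_b)]


def top_k(scores, k):
    """Indices of the k genes with the largest scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])


def top_k_concordance(scores_1, scores_2, k):
    """Fraction of genes shared by the two top-k lists (between 0 and 1)."""
    return len(top_k(scores_1, k) & top_k(scores_2, k)) / k


def simulate(n_genes, n_reps, n_de, shift, rng):
    """Hypothetical log-scale data: the first n_de genes are shifted in group B."""
    a = [[rng.gauss(0, 1) for _ in range(n_reps)] for _ in range(n_genes)]
    b = [[rng.gauss(shift if g < n_de else 0, 1) for _ in range(n_reps)]
         for g in range(n_genes)]
    return a, b


if __name__ == "__main__":
    rng = random.Random(0)
    # Compare two independent simulated experiments at each replicate count.
    for n_reps in (2, 5, 10):
        s1 = log_fold_change(*simulate(1000, n_reps, 50, 2.0, rng))
        s2 = log_fold_change(*simulate(1000, n_reps, 50, 2.0, rng))
        print(n_reps, top_k_concordance(s1, s2, 20))
```

Under these assumptions, the other scoring methods named in the abstract (t-test, Wilcoxon rank-sum, SAM, LPE) could be compared by swapping the scoring function while keeping the same concordance measure.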
Keywords
Microarray; reproducibility; sample size; effect size
References
1 Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing, Journal of the Royal Statistical Society. Series B (Methodological), 57, 289-300.
2 Benjamini, Y. and Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency, Annals of Statistics, 29, 1165-1188.
3 Dudoit, S., Shaffer, J. P. and Boldrick, J. C. (2003). Multiple hypothesis testing in microarray experiments, Statistical Science, 18, 71-103.
4 Irizarry, R. A., Hobbs, B., Collin, F., Beazer-Barclay, Y. D., Antonellis, K. J., Scherf, U. and Speed, T. P. (2003). Exploration, normalization, and summaries of high density oligonucleotide array probe level data, Biostatistics, 4, 249-264.
5 Jain, N., Cho, H. J., O'Connell, M. and Lee, J. K. (2005). Rank-invariant resampling based estimation of false discovery rate for analysis of small sample microarray data, BMC Bioinformatics, 6, 187.
6 Jain, N., Thatte, J., Braciale, T., Ley, K., O'Connell, M. and Lee, J. K. (2003). Local-pooled-error test for identifying differentially expressed genes with a small number of replicated microarrays, Bioinformatics, 19, 1945-1951.
7 Jung, S. H. (2005). Sample size for FDR-control in microarray data analysis, Bioinformatics, 21, 3097-3104.
8 Shao, Y. and Tseng, C. H. (2007). Sample size calculation with dependence adjustment for FDR-control in microarray studies, Statistics in Medicine, 26, 4219-4237.
9 Tusher, V. G., Tibshirani, R. and Chu, G. (2001). Significance analysis of microarrays applied to the ionizing radiation response, Proceedings of the National Academy of Sciences USA, 98, 5116-5121.