Comparative Statistic Module (CSM) for Significant Gene Selection

  • Kim, Young-Jin (Division of Epidemiology and Bioinformatics, National Genome Research Institute, National Institute of Health)
  • Kim, Hyo-Mi (Division of Epidemiology and Bioinformatics, National Genome Research Institute, National Institute of Health)
  • Kim, Sang-Bae (Division of Epidemiology and Bioinformatics, National Genome Research Institute, National Institute of Health)
  • Park, Chan (Division of Epidemiology and Bioinformatics, National Genome Research Institute, National Institute of Health)
  • Kimm, Kuchan (Division of Epidemiology and Bioinformatics, National Genome Research Institute, National Institute of Health)
  • Koh, InSong (Division of Epidemiology and Bioinformatics, National Genome Research Institute, National Institute of Health)
  • Published: 2004.12.01

Abstract

The Comparative Statistic Module (CSM) gives genomics researchers a more reliable list of significant genes. It reports the genes commonly selected by five statistical methods: the t-test, the Kruskal-Wallis (Wilcoxon signed rank) test, SAM, the two-sample multiple test, and the Empirical Bayesian test. It also suggests a method of choice by ranking each statistical test according to the average rank that test assigns to the common genes. The module is implemented in Perl and R.
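The abstract does not spell out the scoring procedure, so the following Python sketch is only one plausible reading of it, not the authors' Perl/R implementation. It assumes each method yields a gene list ordered from most to least significant, treats the "common genes" as those in the top `top_k` of every method, and prefers the method that gives those genes the lowest mean rank. The names `common_genes`, `score_methods`, and `top_k`, as well as the toy gene lists, are hypothetical.

```python
from typing import Dict, List, Tuple


def common_genes(ranked: Dict[str, List[str]], top_k: int) -> List[str]:
    """Return the genes found in the top_k of every method's ranked list."""
    top_sets = [set(genes[:top_k]) for genes in ranked.values()]
    return sorted(set.intersection(*top_sets))


def score_methods(ranked: Dict[str, List[str]], top_k: int) -> List[Tuple[str, float]]:
    """Order methods by the mean rank (1 = most significant) of the common genes."""
    shared = common_genes(ranked, top_k)
    if not shared:
        return []  # no gene is selected by every method at this cutoff
    scores = []
    for method, genes in ranked.items():
        # Map each gene to its 1-based position in this method's list.
        position = {gene: i + 1 for i, gene in enumerate(genes)}
        mean_rank = sum(position[g] for g in shared) / len(shared)
        scores.append((method, mean_rank))
    # Assumption: a lower mean rank marks the preferred method.
    return sorted(scores, key=lambda item: item[1])


# Toy input (made-up gene names, three methods instead of five).
ranked = {
    "t_test": ["g1", "g2", "g3", "g4", "g5"],
    "sam": ["g2", "g4", "g1", "g3", "g5"],
    "ebayes": ["g3", "g1", "g2", "g5", "g4"],
}

print(common_genes(ranked, top_k=3))
print(score_methods(ranked, top_k=3))
```

Under this reading, the first entry returned by `score_methods` would be the suggested method of choice, and `common_genes` would be the consensus gene list reported to the researcher.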
