http://dx.doi.org/10.6109/jkiice.2010.14.2.534

Classification Prediction Error Estimation System of Microarray for a Comparison of Resampling Methods Based on Multi-Layer Perceptron  

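Park, Su-Young (Department of Computer Science and Statistics, Chosun University)
Jeong, Chai-Yeoung (Department of Computer Science and Statistics, Chosun University)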
Abstract
In genomic studies, thousands of features are measured on relatively few samples. One goal of these studies is to build classifiers that predict the outcome of future observations. Building a classifier involves three inherent steps: selection of significant genes, model selection, and prediction assessment. This paper focuses on prediction assessment. We normalize microarray data with quantile normalization, which adjusts the quantiles of all slides to be equal, and then design a system that compares several resampling methods for estimating the 'true' prediction error of a prediction model in the presence of feature selection, and we analyze the prediction errors they produce. LOOCV generally performs very well, with small MSE and bias, whereas the split-sample method and 2-fold CV perform very poorly when the sample size is small. For computationally burdensome analyses, 10-fold CV may be preferable to LOOCV.
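The comparison described above can be illustrated with a minimal sketch. It is not the authors' system: the data are synthetic, a nearest-centroid rule stands in for the multi-layer perceptron to keep the example short, and every function name (`quantile_normalize`, `select_genes`, `cv_error`, and so on) is hypothetical. The point it shows is that gene selection is repeated inside each training fold, so the held-out samples never influence which genes are chosen.

```python
import random
from statistics import mean

def quantile_normalize(arrays):
    """Give every array (one list of gene intensities per sample) the same
    distribution: each value is replaced by the mean, across arrays, of the
    values that share its rank. Ties are broken by position (a simplification)."""
    n_genes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [mean(s[r] for s in sorted_arrays) for r in range(n_genes)]
    normalized = []
    for a in arrays:
        order = sorted(range(n_genes), key=lambda g: a[g])
        row = [0.0] * n_genes
        for rank, g in enumerate(order):
            row[g] = rank_means[rank]
        normalized.append(row)
    return normalized

def select_genes(X, y, k_genes):
    """Rank genes by absolute difference of class means (training data only)."""
    def score(g):
        m0 = mean(x[g] for x, lab in zip(X, y) if lab == 0)
        m1 = mean(x[g] for x, lab in zip(X, y) if lab == 1)
        return abs(m0 - m1)
    return sorted(range(len(X[0])), key=score, reverse=True)[:k_genes]

def fit_predict(X_train, y_train, X_test, k_genes):
    """Select genes, then classify by nearest class centroid (stand-in for an MLP)."""
    genes = select_genes(X_train, y_train, k_genes)
    centroids = {}
    for lab in (0, 1):
        rows = [x for x, l in zip(X_train, y_train) if l == lab]
        centroids[lab] = [mean(r[g] for r in rows) for g in genes]
    preds = []
    for x in X_test:
        dist = {lab: sum((x[g] - c) ** 2 for g, c in zip(genes, cen))
                for lab, cen in centroids.items()}
        preds.append(min(dist, key=dist.get))
    return preds

def stratified_folds(y, k, rng):
    """Deal shuffled sample indices round-robin, class by class, into k folds."""
    folds = [[] for _ in range(k)]
    position = 0
    for lab in sorted(set(y)):
        idx = [i for i, l in enumerate(y) if l == lab]
        rng.shuffle(idx)
        for i in idx:
            folds[position % k].append(i)
            position += 1
    return folds

def holdout_error(X, y, test_idx, k_genes):
    """Train on everything outside test_idx; return the error count on test_idx."""
    held_out = set(test_idx)
    train_idx = [i for i in range(len(y)) if i not in held_out]
    preds = fit_predict([X[i] for i in train_idx], [y[i] for i in train_idx],
                        [X[i] for i in test_idx], k_genes)
    return sum(p != y[i] for p, i in zip(preds, test_idx))

def cv_error(X, y, k, k_genes, seed=0):
    """k-fold CV error; k == len(y) gives LOOCV."""
    folds = stratified_folds(y, k, random.Random(seed))
    return sum(holdout_error(X, y, f, k_genes) for f in folds) / len(y)

def split_sample_error(X, y, k_genes, seed=0):
    """Split-sample estimate: train on about 2/3, test once on the rest."""
    test_idx = stratified_folds(y, 3, random.Random(seed))[0]
    return holdout_error(X, y, test_idx, k_genes) / len(test_idx)

# Synthetic data (an assumption): 20 samples, 50 genes, first 5 genes informative.
rng = random.Random(1)
y = [0] * 10 + [1] * 10
X = [[rng.gauss(3.0 if (lab == 1 and g < 5) else 0.0, 1.0) for g in range(50)]
     for lab in y]
X = quantile_normalize(X)

errors = {
    "LOOCV": cv_error(X, y, k=len(y), k_genes=5),
    "10-fold CV": cv_error(X, y, k=10, k_genes=5),
    "2-fold CV": cv_error(X, y, k=2, k_genes=5),
    "split sample": split_sample_error(X, y, k_genes=5),
}
for name, err in errors.items():
    print(f"{name}: estimated error = {err:.3f}")
```

Repeating this over many simulated data sets and comparing each estimate with the error on a large independent test set would give the MSE and bias figures that the abstract refers to. LOOCV refits the model once per sample, which is why 10-fold CV is the cheaper option when a single fit is expensive.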
Keywords
Microarray; resampling methods (LOOCV, split sample, 2-fold CV, 10-fold CV); MLP