http://dx.doi.org/10.5351/KJAS.2007.20.3.531

Gene Selection Based on Support Vector Machine using Bootstrap  

Song, Seuck-Heun (Department of Statistics, Korea University)
Kim, Kyoung-Hee (Department of Statistics, Korea University)
Park, Chang-Yi (Institute of Statistics, Korea University)
Koo, Ja-Yong (Department of Statistics, Korea University)
Publication Information
The Korean Journal of Applied Statistics / v.20, no.3, 2007, pp. 531-540
Abstract
Recursive feature elimination (RFE) for support vector machines is known to be useful for selecting relevant genes. Because RFE ranks genes by the absolute value of a coefficient, and the size of a coefficient depends on the scale of the corresponding expression values, RFE may suffer from a scaling problem. We propose a modified version of the RFE algorithm that uses the bootstrap. In our method, the criterion for determining relevant genes is the absolute value of a coefficient divided by its standard error, which accounts for the statistical variability of the coefficient. Numerical examples illustrate that the method is effective for gene selection.
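The criterion described in the abstract can be sketched roughly as follows. This is only an illustration under several assumptions that the abstract does not confirm: the linear SVM is fitted by a simple batch subgradient descent, the standard error is taken as the bootstrap standard deviation of each coefficient, resamples that miss a class are redrawn, and one gene is removed per round. The function names (`fit_linear_svm`, `bootstrap_scores`, `bootstrap_rfe`), the parameter values, and the toy data are all hypothetical.

```python
import random
import statistics


def fit_linear_svm(X, y, lam=0.1, epochs=100):
    """Fit a linear soft-margin SVM by batch subgradient descent; return weights."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for t in range(1, epochs + 1):
        eta = 1.0 / (1.0 + lam * t)  # decaying step size (an arbitrary choice)
        grad_w = [lam * wj for wj in w]  # gradient of the ridge penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # hinge loss is active for this sample
                for j in range(p):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - eta * gj for wj, gj in zip(w, grad_w)]
        b -= eta * grad_b
    return w


def bootstrap_scores(X, y, n_boot=20, rng=None):
    """Score each feature by |coefficient| / bootstrap standard error."""
    if rng is None:
        rng = random.Random(0)
    n, p = len(X), len(X[0])
    w_full = fit_linear_svm(X, y)  # coefficients from the full sample
    boot = []
    for _ in range(n_boot):
        while True:  # assumption: redraw if the resample contains one class only
            idx = rng.choices(range(n), k=n)
            if len({y[i] for i in idx}) == 2:
                break
        boot.append(fit_linear_svm([X[i] for i in idx], [y[i] for i in idx]))
    scores = []
    for j in range(p):
        se = statistics.stdev([wb[j] for wb in boot])
        scores.append(abs(w_full[j]) / (se + 1e-12))  # guard against se == 0
    return scores


def bootstrap_rfe(X, y, n_keep, n_boot=20, seed=0):
    """Repeatedly drop the feature with the smallest score until n_keep remain."""
    rng = random.Random(seed)
    active = list(range(len(X[0])))  # indices of the surviving features
    while len(active) > n_keep:
        X_active = [[row[j] for j in active] for row in X]
        scores = bootstrap_scores(X_active, y, n_boot, rng)
        worst = min(range(len(active)), key=scores.__getitem__)
        active.pop(worst)
    return active


# Toy data (hypothetical): feature 0 carries the class signal, the rest are noise.
data_rng = random.Random(1)
labels = [1] * 10 + [-1] * 10
samples = [[yi + data_rng.gauss(0, 0.5), data_rng.gauss(0, 1), data_rng.gauss(0, 1)]
           for yi in labels]
print(bootstrap_rfe(samples, labels, n_keep=1))
```

Dividing by the bootstrap standard deviation means that multiplying a feature by a constant rescales both the coefficient and its variability, so the ranking depends less on measurement units than a ranking by the raw absolute coefficient. Plain RFE would correspond to returning `abs(w_full[j])` directly as the score.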
Keywords
Classification; gene selection; recursive feature elimination