A Design of an Optimized Classifier based on Feature Elimination for Gene Selection

  • Lee, Byung-Kwan (Department of Computer Internet, Gangwon Provincial College) ;
  • Park, Seok-Gyu (Department of Computer Internet, Gangwon Provincial College) ;
  • Tifani, Yusrina (Department of Computer Internet, Gangwon Provincial College)
  • Received: 2015.09.11
  • Reviewed: 2015.09.25
  • Published: 2015.10.30

Abstract

This paper proposes an optimized classifier based on feature elimination (OCFE) for gene selection, which combines two feature elimination methods, ReliefF and SVM-RFE. ReliefF is a filter feature selection algorithm that ranks features by their importance. SVM-RFE is a wrapper feature selection algorithm that ranks features by their weights. Combining the two methods gives a lower average error rate: 0.3016138 for OCFE versus 0.3096779 for SVM-RFE. The proposed method is also slightly more accurate, at 70% for OCFE versus 69% for SVM-RFE.
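The abstract does not say how OCFE chains the two methods, so the following is only a minimal sketch under one plausible assumption: a Relief-style filter first shortlists features, and recursive feature elimination with a linear SVM then prunes the shortlist. Everything here is hypothetical and not taken from the paper: the function names (`relief_scores`, `train_linear_svm`, `svm_rfe`, `filter_then_rfe`), the hyperparameters, and the toy data. The filter is a simplified two-class Relief with a single nearest hit and miss rather than full ReliefF, and the SVM is trained by plain subgradient descent.

```python
import random


def relief_scores(X, y, n_iter=50, seed=0):
    """Simplified Relief: two classes, one nearest hit and one nearest miss."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # Per-feature value range, used to scale differences into [0, 1].
    span = [(max(r[j] for r in X) - min(r[j] for r in X)) or 1.0 for j in range(d)]

    def diff(a, b, j):
        return abs(a[j] - b[j]) / span[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    scores = [0.0] * d
    for _ in range(n_iter):
        i = rng.randrange(n)
        hit = min((k for k in range(n) if k != i and y[k] == y[i]),
                  key=lambda k: dist(X[i], X[k]))
        miss = min((k for k in range(n) if y[k] != y[i]),
                   key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            # Reward features that separate classes, penalise ones that split a class.
            scores[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n_iter
    return scores


def train_linear_svm(X, y, cols, epochs=100, lr=0.1, lam=0.01):
    """Full-batch subgradient descent on the L2-regularised hinge loss; y in {-1, +1}."""
    w, b, n = [0.0] * len(cols), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xi[c] for wj, c in zip(w, cols)) + b)
            if margin < 1:  # only margin violators contribute to the hinge subgradient
                for k, c in enumerate(cols):
                    gw[k] -= yi * xi[c] / n
                gb -= yi / n
        w = [wj - lr * g for wj, g in zip(w, gw)]
        b -= lr * gb
    return w, b


def svm_rfe(X, y, cols, n_keep):
    """Retrain, drop the feature with the smallest squared weight, repeat."""
    cols = list(cols)
    while len(cols) > n_keep:
        w, _ = train_linear_svm(X, y, cols)
        cols.pop(min(range(len(cols)), key=lambda k: w[k] ** 2))
    return cols


def filter_then_rfe(X, y, n_filter, n_keep):
    """Assumed combination: Relief shortlist of n_filter features, then SVM-RFE."""
    scores = relief_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return svm_rfe(X, y, ranked[:n_filter], n_keep)


# Toy data (hypothetical): columns 0 and 1 follow the label, columns 2-7 are noise.
rng = random.Random(0)
y = [1 if i % 2 == 0 else -1 for i in range(40)]
X = [[2 * yi + rng.gauss(0, 0.3), 2 * yi + rng.gauss(0, 0.3)]
     + [rng.uniform(-1, 1) for _ in range(6)] for yi in y]

selected = filter_then_rfe(X, y, n_filter=4, n_keep=2)
print(sorted(selected))
```

On this toy data the two label-dependent columns should be the ones that survive both stages. Running the cheap filter first means the costlier wrapper stage only has to retrain on the shortlist, which is one reason such a combination might be attractive for microarray data with thousands of genes.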
