http://dx.doi.org/10.5351/CKSS.2003.10.3.879

Evaluation of Attribute Selection Methods and Prior Discretization in Supervised Learning  

Cha, Woon Ock (Division of Computer Engineering, Hansung University)
Huh, Moon Yul (Department of Statistics, Sungkyunkwan University)
Publication Information
Communications for Statistical Applications and Methods / v.10, no.3, 2003, pp. 879-894
Abstract
We evaluated the effect of attribute selection and prior discretization on supervised learning with two classifiers, C4.5 and Naive Bayes. Three data sets were obtained from the UCI data archive; each consists of continuous attributes except for a single decision attribute. Four attribute selection methods were used: MDI, ReliefF, Gain Ratio, and the Consistency-based method. MDI and ReliefF can be applied to both continuous and discrete attributes, whereas the other two can be applied only to discrete attributes. Discretization was performed using the Fayyad and Irani method. To investigate the effect of noise, noise was introduced into the data sets at levels of 10% or 20%, and both the noisy and the noise-free data were then processed through attribute selection, discretization, and classification. The results indicate that classification based on the selected attributes yields higher accuracy than classification on the full data set, and that prior discretization does not lower the accuracy.
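As an illustration of one of the four selection criteria named in the abstract, the following is a minimal sketch of Gain Ratio for a single discrete attribute, using the standard definition (information gain divided by split information). It is not the authors' implementation: the function names `entropy` and `gain_ratio` and the toy data are assumptions made here for illustration only.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of one discrete attribute with respect to the class labels.

    values: attribute value per instance (hypothetical toy input)
    labels: class label per instance
    """
    n = len(labels)
    # Partition the class labels by attribute value.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected class entropy after splitting on the attribute.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy(values)
    # A constant attribute has zero split information; score it as 0.
    return info_gain / split_info if split_info > 0 else 0.0


# Toy data, invented for illustration.
labels = [0, 0, 1, 1]
print(gain_ratio(["a", "a", "b", "b"], labels))  # 1.0 (attribute determines the class)
print(gain_ratio(["a", "b", "a", "b"], labels))  # 0.0 (attribute is uninformative)
```

Because this criterion is defined over discrete values, a continuous attribute would first have to be discretized, which is consistent with the abstract's remark that Gain Ratio applies only to discrete attributes.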
Keywords
attribute selection; discretization; classification
References
1. Liu, H.; Setiono, R. A Probabilistic Approach to Feature Selection: A Filter Solution. Proceedings of the 13th International Conference on Machine Learning.
2. Lee, S.C.; Huh, M.Y. A Measure of Association for Complex Data. Computational Statistics and Data Analysis.
3. Breiman, L.; Friedman, J.H.; Olshen, R.A.; Stone, C.J. Classification and Regression Trees.
4. Ihaka, R.; Gentleman, R. A Language for Data Analysis and Graphics. Journal of Computational and Graphical Statistics.
5. Hall, M.A.; Holmes, G. Benchmarking Attribute Selection Techniques for Data Mining.
6. Fayyad, U.M.; Irani, K.B. On the Handling of Continuous-valued Attributes in Decision Tree Generation. Machine Learning.
7. Quinlan, J.R. C4.5: Programs for Machine Learning.
8. Devijver, P.A.; Kittler, J. Pattern Recognition: A Statistical Approach.
9. Dash, M.; Liu, H. Feature Selection for Classification. Intelligent Data Analysis.
10. Merz, C.J.; Murphy, P.M. UCI Repository of Machine Learning Databases.
11. Kononenko, I. Estimating Attributes: Analysis and Extension of RELIEF. Proceedings of the European Conference on Machine Learning.
12. Kira, K.; Rendell, L.A. The Feature Selection Problem: Traditional Methods and a New Algorithm. Proceedings of the National Conference on Artificial Intelligence.
13. Quinlan, J.R. Induction of Decision Trees. Machine Learning.
14. Liu, H.; Motoda, H. Feature Selection for Knowledge Discovery and Data Mining.
15. Witten, I.; Frank, E. Data Mining.