http://dx.doi.org/10.5573/ieie.2015.52.10.073

Nearest-neighbor Rule based Prototype Selection Method and Performance Evaluation using Bias-Variance Analysis  

Shim, Se-Yong (Dept. of Computer Science, Dankook University)
Hwang, Doo-Sung (Dept. of Computer Science, Dankook University)
Publication Information
Journal of the Institute of Electronics and Information Engineers, Vol. 52, No. 10, 2015, pp. 73-81
Abstract
The paper proposes a prototype selection method and evaluates the generalization performance of standard algorithms and prototype-based classification learning. The proposed prototype classifier defines multidimensional spheres with variable radii within class regions and generates a small training set. The nearest-neighbor classifier then uses this reduced training set to predict the class of test data. By decomposing the mean expected error into bias and variance, we compare the generalization errors of the k-nearest-neighbor classifier, the Bayesian classifier, prototype selection with a fixed radius, and the proposed prototype selection method. In experiments, the bias-variance trends of the proposed prototype classifier are similar to those of the nearest-neighbor classifier trained on all the data, while the prototype selection rate stays under 27.0% on average.
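The paper's exact algorithm is not reproduced here, but the variable-radius sphere idea the abstract describes can be sketched as a greedy cover: each candidate sphere is centered at a training point, its radius is the distance to the nearest point of another class, and spheres are picked by how many uncovered same-class points they absorb. All names and the greedy criterion below are illustrative assumptions, not the authors' code.

```python
import numpy as np

def select_prototypes(X, y):
    """Greedily pick sphere centers that cover same-class neighbors.

    Each sphere is centered at a training point; its radius is the
    distance to the nearest point of a different class, so every sphere
    stays inside its own class region. Assumes no two identical points
    carry different class labels (otherwise some radius would be zero).
    """
    n = len(X)
    # Pairwise Euclidean distances between all training points.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Variable radius: distance to the closest point of another class.
    radius = np.array([dist[i][y != y[i]].min() for i in range(n)])
    covered = np.zeros(n, dtype=bool)
    prototypes = []
    while not covered.all():
        # How many uncovered same-class points each sphere would absorb.
        gain = [np.sum(~covered & (y == y[i]) & (dist[i] < radius[i]))
                for i in range(n)]
        best = int(np.argmax(gain))
        covered |= (y == y[best]) & (dist[best] < radius[best])
        prototypes.append(best)
    return X[prototypes], y[prototypes]
```

On two well-separated clusters this collapses each class to roughly one center, while overlapping classes retain more prototypes near the decision boundary, which is consistent with the low average selection rates the abstract reports.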
Keywords
Nearest-neighbor rule; Prototype selection; Greedy algorithm; Bias-variance decomposition
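The bias-variance decomposition of zero-one loss mentioned above follows the style of Kohavi-Wolpert and Domingos. A simplified Domingos-style estimate over bootstrap resamples might look like the following sketch; the function names and the 1-NN base learner are illustrative assumptions, not the paper's procedure.

```python
import numpy as np
from collections import Counter

def predict_1nn(X_train, y_train, X_test):
    """Plain 1-nearest-neighbor prediction (the classifier under study)."""
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    return y_train[d.argmin(axis=1)]

def bias_variance_01(X_train, y_train, X_test, y_test, n_rounds=50, seed=0):
    """Estimate bias and variance of zero-one loss over bootstrap resamples.

    Bias: fraction of test points where the main (majority-vote)
    prediction disagrees with the true label. Variance: average
    disagreement of individual predictions with the main prediction.
    """
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_rounds):
        idx = rng.integers(0, len(X_train), len(X_train))  # bootstrap sample
        preds.append(predict_1nn(X_train[idx], y_train[idx], X_test))
    preds = np.array(preds)  # shape: (n_rounds, n_test)
    # Main prediction: the label most often produced for each test point.
    main = np.array([Counter(col).most_common(1)[0][0] for col in preds.T])
    bias = float(np.mean(main != y_test))
    variance = float(np.mean(preds != main[None, :]))
    return bias, variance
```

Running the proposed prototype classifier and a full-data nearest-neighbor classifier through such an estimator is one way to compare their bias-variance trends, as the abstract does.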