http://dx.doi.org/10.5573/ieie.2015.52.10.073

Nearest-neighbor Rule based Prototype Selection Method and Performance Evaluation using Bias-Variance Analysis  

Shim, Se-Yong (Dept. of Computer Science, Dankook University)
Hwang, Doo-Sung (Dept. of Computer Science, Dankook University)
Publication Information
Journal of the Institute of Electronics and Information Engineers, Vol. 52, No. 10, 2015, pp. 73-81
Abstract
The paper proposes a prototype selection method and evaluates the generalization performance of standard algorithms and prototype-based classification learning. The proposed prototype classifier defines multidimensional spheres with variable radii within class regions and generates a small training set. The nearest-neighbor classifier then uses this reduced training set to predict the class of test data. By decomposing the mean expected error into bias and variance, we compare the generalization errors of the k-nearest-neighbor classifier, the Bayesian classifier, prototype selection with a fixed radius, and the proposed prototype selection method. In experiments, the bias-variance trends of the proposed prototype classifier are similar to those of the nearest-neighbor classifier trained on all the data, while the prototype selection rate stays under 27.0% on average.
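The paper's exact algorithm is not reproduced here, but the variable-radius sphere idea the abstract describes can be sketched as a greedy cover: each candidate sphere is centered at a training point, its radius is the distance to the nearest point of another class, and spheres are picked by how many uncovered same-class points they absorb. All names and the greedy criterion below are illustrative assumptions, not the authors' code.

```python
import numpy as np

def select_prototypes(X, y):
    """Greedily pick sphere centers that cover same-class neighbors.

    Each sphere is centered at a training point; its radius is the
    distance to the nearest point of a different class, so every sphere
    stays inside its own class region. Assumes no two identical points
    carry different class labels (otherwise some radius would be zero).
    """
    n = len(X)
    # Pairwise Euclidean distances between all training points.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Variable radius: distance to the closest point of another class.
    radius = np.array([dist[i][y != y[i]].min() for i in range(n)])
    covered = np.zeros(n, dtype=bool)
    prototypes = []
    while not covered.all():
        # How many uncovered same-class points each sphere would absorb.
        gain = [np.sum(~covered & (y == y[i]) & (dist[i] < radius[i]))
                for i in range(n)]
        best = int(np.argmax(gain))
        covered |= (y == y[best]) & (dist[best] < radius[best])
        prototypes.append(best)
    return X[prototypes], y[prototypes]
```

On two well-separated clusters this collapses each class to roughly one center, while overlapping classes retain more prototypes near the decision boundary, which is consistent with the low average selection rates the abstract reports.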
Keywords
Nearest-neighbor rule; Prototype selection; Greedy algorithm; Bias-variance decomposition
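The bias-variance decomposition of zero-one loss mentioned above follows the style of Kohavi-Wolpert and Domingos. A simplified Domingos-style estimate over bootstrap resamples might look like the following sketch; the function names and the 1-NN base learner are illustrative assumptions, not the paper's procedure.

```python
import numpy as np
from collections import Counter

def predict_1nn(X_train, y_train, X_test):
    """Plain 1-nearest-neighbor prediction (the classifier under study)."""
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    return y_train[d.argmin(axis=1)]

def bias_variance_01(X_train, y_train, X_test, y_test, n_rounds=50, seed=0):
    """Estimate bias and variance of zero-one loss over bootstrap resamples.

    Bias: fraction of test points where the main (majority-vote)
    prediction disagrees with the true label. Variance: average
    disagreement of individual predictions with the main prediction.
    """
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_rounds):
        idx = rng.integers(0, len(X_train), len(X_train))  # bootstrap sample
        preds.append(predict_1nn(X_train[idx], y_train[idx], X_test))
    preds = np.array(preds)  # shape: (n_rounds, n_test)
    # Main prediction: the label most often produced for each test point.
    main = np.array([Counter(col).most_common(1)[0][0] for col in preds.T])
    bias = float(np.mean(main != y_test))
    variance = float(np.mean(preds != main[None, :]))
    return bias, variance
```

Running the proposed prototype classifier and a full-data nearest-neighbor classifier through such an estimator is one way to compare their bias-variance trends, as the abstract does.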