Calculating Attribute Weights in K-Nearest Neighbor Algorithms using Information Theory  

Lee Chang-Hwan (Department of Information and Communications, Dongguk University)
Abstract
Nearest neighbor algorithms classify an unseen input instance by retrieving similar stored cases and using their known class membership to predict the unknown features of the input. The usefulness of nearest neighbor algorithms has been demonstrated in many real-world domains. In nearest neighbor algorithms, assigning proper weights to the attributes is an important issue. In this paper, we therefore propose a new method that automatically assigns each attribute a weight reflecting its importance with respect to the target attribute. The method has been implemented as a computer program, and its effectiveness has been tested on a number of publicly available machine learning databases.
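The abstract does not spell out the paper's exact information-theoretic weighting, so the following is only a minimal sketch of the general idea under stated assumptions: each discrete (symbolic) attribute is weighted by its mutual information with the class, and the weights then scale a simple overlap (mismatch-count) distance inside a k-nearest-neighbor classifier. The function names (mutual_information, attribute_weights, weighted_knn) and the toy data are illustrative, not taken from the paper.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    # I(X;Y) = sum over (x, y) of p(x,y) * log2(p(x,y) / (p(x) * p(y))).
    # Note: (c/n) / ((px/n) * (py/n)) simplifies to (c*n) / (px*py).
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * math.log2((c * n) / (px[x] * py[y]))
               for (x, y), c in joint.items())

def attribute_weights(data, labels):
    # One weight per attribute: its mutual information with the target class
    # (an assumed instantiation of "importance with respect to the target").
    return [mutual_information([row[a] for row in data], labels)
            for a in range(len(data[0]))]

def weighted_knn(query, data, labels, weights, k=3):
    # Weighted overlap distance: a mismatch on attribute a costs weights[a].
    def dist(row):
        return sum(w for w, q, v in zip(weights, query, row) if q != v)
    nearest = sorted(zip(data, labels), key=lambda pair: dist(pair[0]))[:k]
    # Majority vote among the k nearest neighbors.
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Toy data (hypothetical): 'outlook' predicts the class, 'day' is mostly noise.
data = [('sunny', 'mon'), ('sunny', 'tue'), ('rain', 'mon'), ('rain', 'wed')]
labels = ['play', 'play', 'stay', 'stay']

w = attribute_weights(data, labels)        # outlook: 1.0 bit, day: 0.5 bit
print(weighted_knn(('rain', 'tue'), data, labels, w))   # -> 'stay'
```

In the toy run, outlook is perfectly predictive of the class and receives a weight of 1.0 bit, while day receives 0.5 bit, so mismatches on the informative attribute dominate the distance, which is the intended effect of attribute weighting in nearest neighbor classification.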
Keywords
Nearest neighbor algorithm; Machine learning; Feature selection; Information theory
References
1 D. Aha, 'Tolerating noisy, irrelevant, and novel attributes in instance-based learning algorithms,' International Journal of Man-Machine Studies, 36, pp. 267-287, 1992
2 S. Cost and S. Salzberg, 'A Weighted Nearest Neighbor Algorithm for Learning with Symbolic Features,' Machine Learning, 10, pp. 57-78, 1993
3 J. D. Kelly and L. Davis, 'A Hybrid Genetic Algorithm for Classification,' Proc. of the 12th Int. Joint Conf. on Artificial Intelligence, pp. 645-650, Sydney, Australia: Morgan Kaufmann, 1991
4 A. van den Bosch and W. Daelemans, 'Data-Oriented Methods for Grapheme-to-Phoneme Conversion,' Technical Report, Tilburg, Netherlands: Tilburg University, Institute for Language Technology and Artificial Intelligence, 1993
5 C. Stanfill and D. Waltz, 'Toward Memory-based Reasoning,' Communications of the ACM, 29(12), pp. 1213-1228, 1986
6 R. J. Beran, 'Minimum Hellinger Distances for Parametric Models,' Annals of Statistics, Vol. 5, pp. 445-463, 1977
7 Z. Ying, 'Minimum Hellinger Distance Estimation for Censored Data,' Annals of Statistics, Vol. 20, No. 3, pp. 1361-1390, 1992
8 P. Murphy and D. Aha, UCI Repository of Machine Learning Databases, Irvine, CA: University of California Irvine, Department of Information and Computer Science, 1993
9 S. Haykin, 'Neural Networks: A Comprehensive Foundation,' Prentice Hall, 1999
10 T. M. Cover and P. E. Hart, 'Nearest Neighbor Pattern Classification,' IEEE Transactions on Information Theory, Vol. 13, No. 1, pp. 21-27, 1967
11 E. E. Smith and D. L. Medin, 'Categories and Concepts,' Cambridge, MA: Harvard University Press, 1981
12 D. Aha, D. Kibler and M. Albert, 'Instance-based Learning Algorithms,' Machine Learning, 6(1), pp. 37-66, 1991
13 J. Zhang, 'Selecting typical instances in instance-based learning,' Proceedings of the Ninth Int. Machine Learning Conference, pp. 470-479, Aberdeen, Scotland: Morgan Kaufmann, 1992
14 S. Romaniuk, 'Efficient Storage of Instances: The Multi-pass Approach,' Proc. of the Seventh Int. Conf. on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems, Austin, TX, 1994
15 L. Breiman, J. Friedman, R. Olshen, and C. Stone, 'Classification and Regression Trees,' Monterey, CA: Wadsworth International Group, 1984
16 D. Skalak, 'Prototype and Feature Selection by Sampling and Random Mutation Hill Climbing Algorithms,' Proc. of the 11th International Conference on Machine Learning, New Brunswick, NJ, 1994
17 S. Salzberg, 'A Nearest Hyperrectangle Learning Method,' Machine Learning, 6, pp. 251-276, 1991
18 R. Creecy, B. Masand, S. Smith, and D. Waltz, 'Trading MIPS and Memory for Knowledge Engineering,' Communications of the ACM, 35, pp. 48-64, 1992
19 J. R. Quinlan, 'C4.5: Programs for Machine Learning,' San Mateo, CA: Morgan Kaufmann, 1993