Calculating Attribute Weights in K-Nearest Neighbor Algorithms using Information Theory  

Lee Chang-Hwan (Department of Information and Communications, Dongguk University)
Abstract
Nearest neighbor algorithms classify an unseen input instance by retrieving similar stored cases and using their known class membership to predict the unknown features of the input. The usefulness of nearest neighbor algorithms has been demonstrated in many real-world domains. In nearest neighbor algorithms, assigning proper weights to the attributes is an important issue. In this paper, we therefore propose a new method that automatically assigns each attribute a weight reflecting its importance with respect to the target attribute. The method has been implemented as a computer program, and its effectiveness has been tested on a number of publicly available machine learning databases.
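The abstract does not spell out the paper's exact information-theoretic weighting, so the following is only a minimal sketch of the general idea under stated assumptions: each discrete (symbolic) attribute is weighted by its mutual information with the class, and the weights then scale a simple overlap (mismatch-count) distance inside a k-nearest-neighbor classifier. The function names (mutual_information, attribute_weights, weighted_knn) and the toy data are illustrative, not taken from the paper.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    # I(X;Y) = sum over (x, y) of p(x,y) * log2(p(x,y) / (p(x) * p(y))).
    # Note: (c/n) / ((px/n) * (py/n)) simplifies to (c*n) / (px*py).
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * math.log2((c * n) / (px[x] * py[y]))
               for (x, y), c in joint.items())

def attribute_weights(data, labels):
    # One weight per attribute: its mutual information with the target class
    # (an assumed instantiation of "importance with respect to the target").
    return [mutual_information([row[a] for row in data], labels)
            for a in range(len(data[0]))]

def weighted_knn(query, data, labels, weights, k=3):
    # Weighted overlap distance: a mismatch on attribute a costs weights[a].
    def dist(row):
        return sum(w for w, q, v in zip(weights, query, row) if q != v)
    nearest = sorted(zip(data, labels), key=lambda pair: dist(pair[0]))[:k]
    # Majority vote among the k nearest neighbors.
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Toy data (hypothetical): 'outlook' predicts the class, 'day' is mostly noise.
data = [('sunny', 'mon'), ('sunny', 'tue'), ('rain', 'mon'), ('rain', 'wed')]
labels = ['play', 'play', 'stay', 'stay']

w = attribute_weights(data, labels)        # outlook: 1.0 bit, day: 0.5 bit
print(weighted_knn(('rain', 'tue'), data, labels, w))   # -> 'stay'
```

In the toy run, outlook is perfectly predictive of the class and receives a weight of 1.0 bit, while day receives 0.5 bit, so mismatches on the informative attribute dominate the distance, which is the intended effect of attribute weighting in nearest neighbor classification.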
Keywords
Nearest neighbor algorithm; Machine learning; Feature selection; Information theory
References
1 D. Aha, 'Tolerating noisy, irrelevant, and novel attributes in instance-based learning algorithms,' International Journal of Man-Machine Studies, 36, pp. 267-287, 1992
2 S. Cost and S. Salzberg, 'A Weighted Nearest Neighbor Algorithm for Learning with Symbolic Features,' Machine Learning, 10, pp. 57-78, 1993
3 J. D. Kelly and L. Davis, 'A Hybrid Genetic Algorithm for Classification,' Proc. of the 12th Int. Joint Conf. on Artificial Intelligence, pp. 645-650, Sydney, Australia: Morgan Kaufmann, 1991
4 A. van den Bosch and W. Daelemans, 'Data-Oriented Methods for Grapheme-to-Phoneme Conversion,' Technical Report, Tilburg, Netherlands: Tilburg University, Institute for Language Technology and Artificial Intelligence, 1993
5 C. Stanfill and D. Waltz, 'Toward Memory-based Reasoning,' Communications of the ACM, 29(12), pp. 1213-1228, 1986
6 R. J. Beran, 'Minimum Hellinger Distances for Parametric Models,' Annals of Statistics, Vol. 5, pp. 445-463, 1977
7 Z. Ying, 'Minimum Hellinger Distance Estimation for Censored Data,' Annals of Statistics, Vol. 20, No. 3, pp. 1361-1390, 1992
8 P. Murphy and D. Aha, UCI Repository of Machine Learning Databases, Irvine, CA: University of California Irvine, Department of Information and Computer Science, 1993
9 S. Haykin, 'Neural Networks: A Comprehensive Foundation,' Prentice Hall, 1999
10 T. M. Cover and P. E. Hart, 'Nearest Neighbor Pattern Classification,' IEEE Transactions on Information Theory, Vol. 13, No. 1, pp. 21-27, 1967
11 E. E. Smith and D. L. Medin, 'Categories and Concepts,' Cambridge, MA: Harvard University Press, 1981
12 D. Aha, D. Kibler and M. Albert, 'Instance-based Learning Algorithms,' Machine Learning, 6(1), pp. 37-66, 1991
13 J. Zhang, 'Selecting typical instances in instance-based learning,' Proceedings of the Ninth Int. Machine Learning Conference, pp. 470-479, Aberdeen, Scotland: Morgan Kaufmann, 1992
14 S. Romaniuk, 'Efficient Storage of Instances: The Multi-pass Approach,' Proc. of the Seventh Int. Conf. on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems, Austin, TX, 1994
15 L. Breiman, J. Friedman, R. Olshen, and C. Stone, 'Classification and Regression Trees,' Monterey, CA: Wadsworth International Group, 1984
16 D. Skalak, 'Prototype and Feature Selection by Sampling and Random Mutation Hill Climbing Algorithms,' Proc. of the 11th International Conference on Machine Learning, New Brunswick, NJ, 1994
17 S. Salzberg, 'A Nearest Hyperrectangle Learning Method,' Machine Learning, 6, pp. 251-276, 1991
18 R. Creecy, B. Masand, S. Smith, and D. Waltz, 'Trading MIPS and Memory for Knowledge Engineering,' Communications of the ACM, 35, pp. 48-64, 1992
19 J. R. Quinlan, 'C4.5: Programs for Machine Learning,' San Mateo, CA: Morgan Kaufmann, 1993