Data Reduction for Classification using Entropy-based Partitioning and Center Instances

  • 손승현 (Department of Industrial Engineering, Hanyang University)
  • 김재련 (Department of Industrial Engineering, Hanyang University)
  • Published : 2006.06.30

Abstract

Instance-based learning is a machine learning technique that has proven successful over a wide range of classification problems. Despite its high classification accuracy, it has a relatively high storage requirement, and because it must search through all stored instances to classify an unseen case, classification is slow. In this paper, we present a new data reduction method for instance-based learning that integrates the strengths of instance partitioning and attribute selection. Experimental results show that reducing the amount of data for instance-based learning reduces data storage requirements, lowers computational costs, minimizes noise, and can facilitate a more rapid search.
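
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one plausible reading of "entropy-based partitioning and center instances", not the authors' method. It assumes numeric attributes, recursively splits the training set on the attribute threshold that most reduces class entropy, and replaces each resulting partition with a single center instance (the attribute-wise mean) labeled with the partition's majority class. Unseen cases are then classified by the nearest center. All function names (`entropy`, `best_split`, `reduce_instances`, `classify`) and the toy data are hypothetical, and attribute selection is not modeled beyond the choice of split attribute.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split(X, y):
    """Return (attribute index, threshold) that most reduces the weighted
    class entropy, or None if no split improves on the unsplit set."""
    n = len(y)
    best, best_score = None, entropy(y)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Using each distinct value except the largest as a threshold
        # guarantees that both sides of the split are non-empty.
        for t in values[:-1]:
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            score = (len(left) * entropy(left) + len(right) * entropy(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (j, t), score
    return best


def reduce_instances(X, y):
    """Partition until each cell is class-pure (or cannot be improved),
    then keep one (center, label) pair per cell."""
    split = best_split(X, y) if len(set(y)) > 1 else None
    if split is None:
        center = [sum(col) / len(X) for col in zip(*X)]
        label = Counter(y).most_common(1)[0][0]
        return [(center, label)]
    j, t = split
    prototypes = []
    for keep_left in (True, False):
        part = [(r, lab) for r, lab in zip(X, y) if (r[j] <= t) == keep_left]
        px, py = zip(*part)
        prototypes += reduce_instances(list(px), list(py))
    return prototypes


def classify(prototypes, x):
    """Label of the center instance nearest to x (Euclidean distance)."""
    return min(prototypes, key=lambda p: math.dist(p[0], x))[1]


# Toy data (made up for illustration): two well-separated classes.
X = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]]
y = ["a", "a", "a", "b", "b", "b"]
prototypes = reduce_instances(X, y)
```

In this toy case the six training instances collapse to two center instances, so a nearest-neighbor query compares against two stored points instead of six. That is the kind of storage and search saving the abstract describes.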
