http://dx.doi.org/10.3745/KTSDE.2020.9.3.83

Hyper-Rectangle Based Prototype Selection Algorithm Preserving Class Regions  

Baek, Byunghyun (Department of Software, Dankook University)
Euh, Seongyul (Department of Software, Dankook University)
Hwang, Doosung (Department of Software, Dankook University)
Publication Information
KIPS Transactions on Software and Data Engineering / v.9, no.3, 2020, pp. 83-90
Abstract
Prototype selection reduces learning time and storage requirements by selecting, from the training data, a minimal set of points that represents the partitions within each class. This paper proposes a new method for generating training data with hyper-rectangles that can be applied to general classification algorithms. Each hyper-rectangular region contains data of a single class only, and together the regions divide the space of that class. The median of the data within a hyper-rectangle is selected as a prototype to form the new training data, and the size of each hyper-rectangle is adjusted to reflect the data distribution in the class region. A set cover optimization algorithm is proposed to select the minimum prototype set that represents the whole training data. The proposed method reduces the time cost of the set cover optimization algorithm, which requires polynomial time, by using a greedy algorithm and a distance equation that involves no multiplication. In an experimental comparison with hyper-sphere prototype selection methods, the proposed method is superior in terms of prototype rate and generalization performance.
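The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the rectangle construction or the distance equation, so the sketch assumes axis-aligned cubes centred on each training point, sized by the Chebyshev (L-infinity) distance to the nearest point of another class, which needs only subtraction and comparison. The function names `chebyshev`, `build_rectangles`, and `greedy_prototypes` are hypothetical.

```python
from statistics import median


def chebyshev(a, b):
    """L-infinity distance: largest absolute coordinate difference (no multiplication)."""
    return max(abs(x - y) for x, y in zip(a, b))


def build_rectangles(X, y):
    """For each point i, collect the same-class points inside a cube centred at X[i].

    Assumption: the half-width is the distance to the nearest point of a
    different class, so the cube never contains another class.
    """
    covers = []
    for i, xi in enumerate(X):
        enemy = min(
            (chebyshev(xi, xj) for j, xj in enumerate(X) if y[j] != y[i]),
            default=float("inf"),
        )
        inside = {
            j for j, xj in enumerate(X)
            if y[j] == y[i] and chebyshev(xi, xj) < enemy
        }
        covers.append(inside | {i})  # a cube always covers its own centre
    return covers


def greedy_prototypes(X, y):
    """Greedy set cover: repeatedly pick the cube covering the most uncovered points.

    The prototype of a chosen cube is the coordinate-wise median of its members.
    """
    covers = build_rectangles(X, y)
    uncovered = set(range(len(X)))
    prototypes = []
    while uncovered:
        best = max(range(len(X)), key=lambda i: len(covers[i] & uncovered))
        members = [X[j] for j in covers[best]]
        proto = tuple(median(coords) for coords in zip(*members))
        prototypes.append((proto, y[best]))
        uncovered -= covers[best]
    return prototypes


if __name__ == "__main__":
    X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
         (10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]
    y = [0, 0, 0, 1, 1, 1]
    for proto, label in greedy_prototypes(X, y):
        print(label, proto)
```

On this toy data the two well-separated classes are each covered by a single cube, so six training points reduce to two prototypes. The greedy step is the standard approximation for set cover and does not guarantee a minimum cover; the paper's size adjustment of the hyper-rectangles is not modelled here.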
Keywords
Prototype Selection; Prototype; Hyper-Rectangle; Set Cover Optimization Algorithm