http://dx.doi.org/10.5573/IEIESPC.2017.6.3.183

Plurality Rule-based Density and Correlation Coefficient-based Clustering for K-NN  

Aung, Swe Swe (Information Engineering Department, University of the Ryukyus)
Nagayama, Itaru (Information Engineering Department, University of the Ryukyus)
Tamaki, Shiro (Information Engineering Department, University of the Ryukyus)
Publication Information
IEIE Transactions on Smart Processing and Computing, vol. 6, no. 3, 2017, pp. 183-192
Abstract
The k-nearest neighbor (K-NN) algorithm is a well-known classification method in machine learning that assigns labels based on the nearest training examples in feature space. However, K-NN is a lazy learning method. Therefore, if a K-NN-based system depends on a huge amount of historical data to achieve accurate predictions for a particular task, it gradually suffers a processing-time performance-degradation problem. Many researchers consider only classification accuracy, but estimation speed also plays an essential role in real-time prediction systems. To compensate for this weakness, this paper proposes correlation coefficient-based clustering (CCC) to improve the processing-time performance of K-NN, and plurality rule-based density (PRD) to improve its estimation accuracy. For experiments, we used real datasets (on breast cancer, breast tissue, heart, and the iris) from the University of California, Irvine (UCI) machine learning repository. Moreover, real traffic data collected from Ojana Junction, Route 58, Okinawa, Japan, was also used to demonstrate the efficiency of this method. Using these datasets, we showed that the new approach achieves better processing-time performance than classical K-NN. In addition, through experiments on real-world datasets, we compared the prediction accuracy of our approach with that of density peaks clustering based on K-NN and principal component analysis (DPC-KNN-PCA).
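The abstract describes the general scheme of pre-clustering the training data and then running K-NN locally with a plurality (majority) vote. The Python sketch below illustrates that idea only under stated assumptions: the random choice of cluster representatives, the correlation-based assignment rule, and the toy data are illustrative stand-ins, not the paper's actual CCC or PRD procedures.

import numpy as np
from collections import Counter

def correlation_clusters(X, n_clusters=3, seed=0):
    # Assign each training row to the representative row it correlates with most
    # (illustrative stand-in for the paper's correlation coefficient-based clustering).
    rng = np.random.default_rng(seed)
    reps = X[rng.choice(len(X), size=n_clusters, replace=False)]
    labels = np.array([np.argmax([np.corrcoef(x, r)[0, 1] for r in reps]) for x in X])
    return reps, labels

def knn_within_cluster(X, y, reps, labels, query, k=5):
    # Restrict the K-NN search to the cluster whose representative correlates
    # most with the query, then decide the class by plurality vote.
    c = np.argmax([np.corrcoef(query, r)[0, 1] for r in reps])
    Xc, yc = X[labels == c], y[labels == c]
    nearest = np.argsort(np.linalg.norm(Xc - query, axis=1))[:k]
    return Counter(yc[nearest]).most_common(1)[0][0]

# Toy usage with random 4-feature data standing in for a UCI dataset.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
reps, labels = correlation_clusters(X)
print(knn_within_cluster(X, y, reps, labels, X[0], k=5))

Because only the points in the selected cluster are scanned, the per-query cost drops roughly in proportion to the cluster size, which is the processing-time benefit the abstract attributes to pre-clustering.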
Keywords
Classification; Density-based; K-NN; DPC-KNN-PCA; Processing time;
References
1 B. Tang and H. He, "ENN: Extended Nearest Neighbor Method for Pattern Recognition," IEEE Computational Intelligence Magazine, August 2015.
2 K. Chomboon, P. Chujai, P. Teerarassamee, K. Kerdprasop, and N. Kerdprasop, "An Empirical Study of Distance Metrics for k-Nearest Neighbor Algorithm," Proceedings of the 3rd International Conference on Industrial Application Engineering, 2015.
3 Z. Nazari and D. Kang, "Density Based Support Vector Machines for Classification," International Journal of Advanced Research in Artificial Intelligence (IJARAI), vol. 4, no. 4, 2015.
4 M. Du, S. Ding, and H. Jia, "Study on density peaks clustering based on k-nearest neighbors and principal component analysis," Knowledge-Based Systems, vol. 99, pp. 135-145, 2016.
5 M. Doshi and S. K. Chaturvedi, "Correlation Based Feature Selection Technique to Predict Student Performance," International Journal of Computer Networks & Communications (IJCNC), vol. 6, no. 3, May 2014.
6 L. Zhang, Q. Liu, W. Yang, N. Wei, and D. Dong, "An Improved K-nearest Neighbor Model for Short-term Traffic Flow Prediction," 13th COTA International Conference of Transportation Professionals (CICTP 2013), pp. 653-662, 2013.
7 T. N. Tran, R. Wehrens, and L. M. C. Buydens, "KNN-kernel density-based clustering for high-dimensional multivariate data," Computational Statistics & Data Analysis, vol. 51, pp. 513-525, 2006.