Search | Korea Science

Nearest neighbor and validity-based clustering

Son, Seo H.;Seo, Suk T.;Kwon, Soon H.
International Journal of Fuzzy Logic and Intelligent Systems / v.4 no.3 / pp.337-340 / 2004
The clustering problem can be formulated as the problem of finding the number of clusters and a partition matrix for a given data set using iterative or non-iterative algorithms. The authors propose a nearest neighbor and validity-based clustering algorithm in which each data point in the data set is linked with its nearest neighbor data point to form initial clusters, and each initial cluster is then linked with its nearest neighbor cluster to form a new cluster. The linking between clusters continues until no more linking is possible. An optimal set of clusters is identified using a conventional cluster validity index. Experimental results on well-known data sets are provided to show the effectiveness of the proposed clustering algorithm.
https://doi.org/10.5391/IJFIS.2004.4.3.337
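
The first stage described in the abstract above (linking every data point with its nearest neighbor to form initial clusters) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: Euclidean distance, the union-find bookkeeping, and the function name `nearest_neighbor_clusters` are all assumptions, and the later cluster-to-cluster linking and validity-index steps are omitted because the abstract does not specify them.

```python
import math


def nearest_neighbor_clusters(points):
    """Link each point to its nearest neighbor and return the resulting groups.

    Assumes at least two points and Euclidean distance (an assumption; the
    abstract does not name a metric). Returns sorted lists of point indices.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        # Nearest other point; ties go to the lowest index.
        j = min((c for c in range(n) if c != i),
                key=lambda c: math.dist(points[i], points[c]))
        parent[find(i)] = find(j)  # merge the two groups

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy data (made up for illustration): two well-separated pairs.
toy_points = [(0, 0), (0, 1), (5, 5), (5, 6)]
print(nearest_neighbor_clusters(toy_points))  # [[0, 1], [2, 3]]
```

Because every point is linked to exactly one neighbor, no initial cluster is a singleton, which is one reason a further merging stage plus a validity index is needed to choose the final partition.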

Identification of Plastic Wastes by Using Fuzzy Radial Basis Function Neural Networks Classifier with Conditional Fuzzy C-Means Clustering

Roh, Seok-Beom;Oh, Sung-Kwun
Journal of Electrical Engineering and Technology / v.11 no.6 / pp.1872-1879 / 2016
Techniques to recycle and reuse plastics attract public attention. This public attention and demand drive improvements in recycling techniques. However, the identification of black plastic wastes still has a major problem: the spectrum extracted by near-infrared spectroscopy is unclear and contaminated by noise. To overcome this problem, we apply Raman spectroscopy to extract a clear spectrum of the plastic material. In addition, to improve the classification ability of fuzzy Radial Basis Function Neural Networks, we apply a supervised-learning-based clustering method instead of an unsupervised clustering method. The conditional fuzzy C-Means clustering method, a supervised-learning-based clustering algorithm, is used to determine the locations of the radial basis functions. Conditional fuzzy C-Means clustering analyzes the data distribution over the input space under the supervision of auxiliary information. The auxiliary information is defined using the k-nearest neighbor approach.
https://doi.org/10.5370/JEET.2016.11.6.1872

Robust Similarity Measure for Spectral Clustering Based on Shared Neighbors

Ye, Xiucai;Sakurai, Tetsuya
ETRI Journal / v.38 no.3 / pp.540-550 / 2016
Spectral clustering is a powerful tool for exploratory data analysis. Many existing spectral clustering algorithms typically measure the similarity by using a Gaussian kernel function or an undirected k-nearest neighbor (kNN) graph, which cannot reveal the real clusters when the data are not well separated. In this paper, to improve the spectral clustering, we consider a robust similarity measure based on the shared nearest neighbors in a directed kNN graph. We propose two novel algorithms for spectral clustering: one based on the number of shared nearest neighbors, and one based on their closeness. The proposed algorithms are able to explore the underlying similarity relationships between data points, and are robust to datasets that are not well separated. Moreover, the proposed algorithms have only one parameter, k. We evaluated the proposed algorithms using synthetic and real-world datasets. The experimental results demonstrate that the proposed algorithms not only achieve a good level of performance, they also outperform the traditional spectral clustering algorithms.
https://doi.org/10.4218/etrij.16.0115.0517
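
As a rough illustration of the idea in the abstract above, a similarity based on the number of shared nearest neighbors in a directed kNN graph might look like the sketch below. The exact definitions used in the paper are not given in the abstract, so the plain intersection count, the Euclidean metric, and the names `knn_sets` and `shared_neighbor_similarity` are assumptions made for this example only.

```python
import math


def knn_sets(points, k):
    """Directed kNN graph: for each point, the index set of its k nearest other points."""
    n = len(points)
    neighbors = []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(points[i], points[j]))
        neighbors.append(set(order[:k]))
    return neighbors


def shared_neighbor_similarity(points, k):
    """Similarity matrix whose (i, j) entry counts the neighbors shared by i and j."""
    nbrs = knn_sets(points, k)
    n = len(points)
    return [[len(nbrs[i] & nbrs[j]) if i != j else 0 for j in range(n)]
            for i in range(n)]


# Toy data (made up for illustration): two tight groups of three points.
toy_points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
similarity = shared_neighbor_similarity(toy_points, k=2)
print(similarity[0][1], similarity[0][3])  # 1 0
```

Points in the same group share a neighbor while points in different groups share none, so the matrix can stand in for the Gaussian-kernel affinity in a standard spectral clustering pipeline, with k as the only parameter.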

Classification of Traffic Flows into QoS Classes by Unsupervised Learning and KNN Clustering

Zeng, Yi;Chen, Thomas M.
KSII Transactions on Internet and Information Systems (TIIS) / v.3 no.2 / pp.134-146 / 2009
Traffic classification seeks to assign packet flows to an appropriate quality of service (QoS) class based on flow statistics, without the need to examine packet payloads. Classification proceeds in two steps: classification rules are first built by analyzing traffic traces, and the rules are then evaluated using test data. In this paper, we use self-organizing maps and K-means clustering as unsupervised machine learning methods to identify the inherent classes in traffic traces. Three clusters were discovered, corresponding to transactional, bulk data transfer, and interactive applications. The K-nearest neighbor classifier was found to be highly accurate for the traffic data and significantly better than a minimum mean distance classifier.
https://doi.org/10.3837/tiis.2009.02.001
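
The two classifiers compared in the abstract above can be contrasted with a small sketch. Nothing here comes from the paper itself: the toy data, the Euclidean metric, and the function names `knn_classify` and `min_mean_distance_classify` are assumptions chosen only to show how the two decision rules can disagree.

```python
import math
from collections import Counter


def knn_classify(train, labels, x, k=3):
    """Majority vote among the k training points closest to x."""
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], x))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def min_mean_distance_classify(train, labels, x):
    """Assign x to the class whose mean vector is closest to x."""
    counts = Counter(labels)
    sums = {}
    for point, label in zip(train, labels):
        acc = sums.setdefault(label, [0.0] * len(point))
        for d, value in enumerate(point):
            acc[d] += value
    means = {label: [s / counts[label] for s in acc] for label, acc in sums.items()}
    return min(means, key=lambda label: math.dist(means[label], x))


# Toy data (made up): class "A" is spread out, class "B" is compact.
train = [(0, 0), (0, 10), (3, 4), (3, 6)]
labels = ["A", "A", "B", "B"]
query = (2, 0)
print(knn_classify(train, labels, query, k=1))           # A
print(min_mean_distance_classify(train, labels, query))  # B
```

The query lies next to a training point of class "A", but the mean of the spread-out class "A" is farther away than the mean of class "B", so the mean-based rule picks "B". This is one way a nearest-neighbor rule can outperform a mean-distance rule when a class is not compact.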

A Generic Algorithm for k-Nearest Neighbor Graph Construction Based on Balanced Canopy Clustering

Park, Youngki;Hwang, Heasoo;Lee, Sang-Goo
KIISE Transactions on Computing Practices / v.21 no.4 / pp.327-332 / 2015
Constructing a k-nearest neighbor (k-NN) graph is a primitive operation in the field of recommender systems, information retrieval, data mining and machine learning. Although there have been many algorithms proposed for constructing a k-NN graph, either the existing approaches cannot be used for various types of similarity measures, or the performance of the approaches is decreased as the number of nodes or dimensions increases. In this paper, we present a novel algorithm for k-NN graph construction based on "balanced" canopy clustering. The experimental results show that irrespective of the number of nodes or dimensions, our algorithm is at least five times faster than the brute-force approach while retaining an accuracy of approximately 92%.
https://doi.org/10.5626/KTCP.2015.21.4.327

Spectral clustering based on the local similarity measure of shared neighbors

Cao, Zongqi;Chen, Hongjia;Wang, Xiang
ETRI Journal / v.44 no.5 / pp.769-779 / 2022
Spectral clustering has become a typical and efficient clustering method used in a variety of applications. The critical step of spectral clustering is the similarity measurement, which largely determines the performance of the spectral clustering method. In this paper, we propose a novel spectral clustering algorithm based on the local similarity measure of shared neighbors. This similarity measurement exploits the local density information between data points based on the weight of the shared neighbors in a directed k-nearest neighbor graph with only one parameter k, that is, the number of nearest neighbors. Numerical experiments on synthetic and real-world datasets demonstrate that our proposed algorithm outperforms other existing spectral clustering algorithms in terms of the clustering performance measured via the normalized mutual information, clustering accuracy, and F-measure. As an example, the proposed method can provide an improvement of 15.82% in the clustering performance for the Soybean dataset.
https://doi.org/10.4218/etrij.2021-0230

A Low Complexity PTS Technique using Threshold for PAPR Reduction in OFDM Systems

Lim, Dai Hwan;Rhee, Byung Ho
KSII Transactions on Internet and Information Systems (TIIS) / v.6 no.9 / pp.2191-2201 / 2012
https://doi.org/10.3837/tiis.2012.09.012

A Study on a Statistical Matching Method Using Clustering for Data Enrichment

Kim, Soon Y.;Lee, Ki H.;Chung, Sung S.
Communications for Statistical Applications and Methods / v.12 no.2 / pp.509-520 / 2005
Data fusion is defined as the process of combining data and information from different sources to make more effective use of the information they contain. In this paper, we propose a data fusion algorithm that uses the k-means clustering method for data enrichment, to improve data quality in the knowledge discovery in databases (KDD) process. An empirical study comparing the proposed data fusion technique with existing techniques shows that the proposed clustering-based data fusion technique has a low MSE for continuous fusion variables.
https://doi.org/10.5351/CKSS.2005.12.2.509

Plurality Rule-based Density and Correlation Coefficient-based Clustering for K-NN

Aung, Swe Swe;Nagayama, Itaru;Tamaki, Shiro
IEIE Transactions on Smart Processing and Computing / v.6 no.3 / pp.183-192 / 2017
k-nearest neighbor (K-NN) is a well-known classification algorithm in machine learning that classifies a query based on the nearest training examples in feature space. However, K-NN is a lazy learning method. Therefore, if a K-NN-based system depends on a huge amount of historical data to achieve accurate predictions for a particular task, its processing time gradually degrades. Many researchers consider only classification accuracy, but estimation speed also plays an essential role in real-time prediction systems. To compensate for this weakness, this paper proposes correlation coefficient-based clustering (CCC), aimed at improving the processing time of K-NN, and plurality rule-based density (PRD), aimed at improving estimation accuracy. For experiments, we used real datasets (breast cancer, breast tissue, heart, and iris) from the University of California, Irvine (UCI) machine learning repository. Real traffic data collected from Ojana Junction, Route 58, Okinawa, Japan, was also used to demonstrate the efficiency of this method. On these datasets, the new approach showed better processing-time performance than classical K-NN. In addition, via experiments on real-world datasets, we compared the prediction accuracy of our approach with density peaks clustering based on K-NN and principal component analysis (DPC-KNN-PCA).
https://doi.org/10.5573/IEIESPC.2017.6.3.183

Improving Web Service Recommendation using Clustering with K-NN and SVD Algorithms

Weerasinghe, Amith M.;Rupasingha, Rupasingha A.H.M.
KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.5 / pp.1708-1727 / 2021
With the advent of the twenty-first century, people began to interact closely with technology. As technology develops, the World Wide Web (WWW) has taken a very important place on the Internet, and much of its work is carried out by Web services. Because many Web services are available on the Internet, it is difficult to find matching Web services among them. Recommendation systems can help solve this problem. In this paper, we observe that recommendation methods such as the collaborative filtering (CF) technique suffer from data sparsity and cold-start problems. To overcome these problems, we first applied ontology-based clustering and then the k-nearest neighbor (KNN) algorithm to each cluster separately, which effectively increased the data density using past user interests. Then, user ratings were predicted with a model-based approach, singular value decomposition (SVD), and the predictions were used for recommendation. The evaluation results showed that our proposed approach has a lower prediction error rate and higher accuracy than the existing recommendation methods analyzed.
https://doi.org/10.3837/tiis.2021.05.007

