Search | Korea Science

Yongli Liu; Congcong Zhao; Hao Chao
- Journal of Information Processing Systems / v.19 no.6 / pp.778-790 / 2023
Although density peak clustering often yields excellent results, there is still room for improvement when dealing with complex, high-dimensional datasets. One of the main limitations of this algorithm is its reliance on geometric distance as the sole similarity measurement. To address this limitation, we draw inspiration from information bottleneck theory and propose a novel density peak clustering algorithm that uses this theory as its similarity measure. Specifically, our algorithm utilizes the joint probability distribution between data objects and feature information, and employs the loss of mutual information as the measurement standard. This approach not only eliminates the potential for subjective error in selecting a similarity method, but also enhances performance on datasets with multiple centers and high dimensionality. To evaluate the effectiveness of our algorithm, we conducted experiments using ten carefully selected datasets and compared the results with three other algorithms. The experimental results demonstrate that our information bottleneck-based density peaks clustering (IBDPC) algorithm consistently achieves high levels of accuracy, highlighting its potential as a valuable tool for data clustering tasks.
https://doi.org/10.3745/JIPS.04.0294
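The abstract above uses the loss of mutual information as the similarity between two data objects. A minimal sketch of that idea, assuming the standard agglomerative information bottleneck merge cost (the combined weight of the two objects times their weighted Jensen-Shannon divergence); the abstract does not give the exact formula used in IBDPC, and the names `kl` and `merge_cost` are illustrative only:

```python
import math


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats; terms with p_k == 0 contribute 0."""
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q) if pk > 0)


def merge_cost(w_i, cond_i, w_j, cond_j):
    """Mutual information I(X;Y) lost when objects i and j are merged into one.

    Assumption: standard agglomerative information bottleneck cost, not
    necessarily the measure defined in the paper.
    w_i, w_j       -- marginal probabilities p(x_i), p(x_j)
    cond_i, cond_j -- conditional feature distributions p(y | x_i), p(y | x_j)
    """
    total = w_i + w_j
    pi_i, pi_j = w_i / total, w_j / total
    # Feature distribution of the merged object.
    mix = [pi_i * a + pi_j * b for a, b in zip(cond_i, cond_j)]
    # Weighted Jensen-Shannon divergence between the two conditionals.
    js = pi_i * kl(cond_i, mix) + pi_j * kl(cond_j, mix)
    return total * js


# Objects with identical feature distributions cost nothing to merge;
# objects with disjoint features lose the most information.
same = merge_cost(0.5, [0.2, 0.8], 0.5, [0.2, 0.8])
disjoint = merge_cost(0.5, [1.0, 0.0], 0.5, [0.0, 1.0])
```

A small cost means the two objects carry nearly the same information about the features, so the cost can stand in for a distance without choosing a geometric metric by hand.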

Xiaodan Lv
- Journal of Information Processing Systems / v.20 no.2 / pp.185-199 / 2024
In this paper, an improved automated spectral clustering (IASC) algorithm is proposed to address the limitations of the traditional spectral clustering (TSC) algorithm, particularly its inability to automatically determine the number of clusters. Firstly, a cluster number evaluation factor based on the optimal clustering principle is proposed. By iterating through different k values, the value corresponding to the largest evaluation factor is selected as the first-rank number of clusters. Secondly, the IASC algorithm adopts a density-sensitive distance to measure the similarity between sample points. This assigns a high similarity to data distributed in the same high-density area. Thirdly, to improve clustering accuracy, the IASC algorithm uses the cosine angle classification method instead of K-means to classify the eigenvectors. Six algorithms (K-means, fuzzy C-means, TSC, EIGENGAP, DBSCAN, and density peak) were compared with the proposed algorithm on six datasets. The results show that the IASC algorithm not only automatically determines the number of clusters but also obtains better clustering accuracy on both synthetic and UCI datasets.
https://doi.org/10.3745/JIPS.04.0307

Zan, Xiaofei; Liu, Weibin; Xing, Weiwei
- KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.9 / pp.4624-4644 / 2019
With the development of the film, game, and animation industries, the analysis and reuse of human motion capture data have become increasingly important. Human motion segmentation, which divides a long motion sequence into different types of fragments, is a key part of mocap-based techniques. However, most segmentation methods take into account only low-level physical information (motion characteristics) or only high-level data information (statistical characteristics) of motion data, so they cannot make full use of the data. In this paper, we propose an unsupervised framework that uses both low-level physical information and high-level data information of human motion data to solve the human motion segmentation problem. First, we introduce the CFSFDP algorithm and optimize it to carry out an initial segmentation and obtain a good result quickly. Second, we use the ACA method to refine the segmentation result. The experiments demonstrate that our framework performs excellently.
https://doi.org/10.3837/tiis.2019.09.017
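CFSFDP (clustering by fast search and find of density peaks) rests on two quantities per point: a local density and the distance to the nearest denser point. A minimal sketch of the baseline computation, assuming a hard cutoff distance and Euclidean distance; the optimized variant described in the abstract is not specified there and is not reproduced here, and `density_peaks` and `d_c` are illustrative names:

```python
import math


def density_peaks(points, d_c):
    """Return (rho, delta) for each point, following the baseline CFSFDP definitions.

    rho[i]   -- number of other points closer than the cutoff d_c (local density)
    delta[i] -- distance to the nearest point of strictly higher density; a point
                with no denser neighbour gets its largest distance to any point
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


# Hypothetical toy data: a tight group of three points and a distant pair.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
rho, delta = density_peaks(points, d_c=2.0)
```

Points with both a high `rho` and a high `delta` are candidate cluster centers; every other point is then assigned to the cluster of its nearest denser neighbour.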