• Title/Summary/Keyword: K-Means Similarity Clustering

Adaptive Clustering based Sparse Representation for Image Denoising (적응 군집화 기반 희소 부호화에 의한 영상 잡음 제거)

  • Kim, Seehyun
    • Journal of IKEEE / v.23 no.3 / pp.910-916 / 2019
  • Non-local similarity of natural images is one of the most heavily exploited features in image-related applications. Distinctive edges, textures, and patterns are frequently repeated across an entire image. Once similar image blocks are grouped into a cluster, representative features of those blocks can be extracted from the cluster. The larger the cluster, the better the additive white noise can be separated. Denoising, the suppression of additive noise, is one of the major research topics in image processing. In this paper, a denoising algorithm is proposed that first clusters the noisy image blocks by similarity, then extracts the features of each cluster, and finally recovers the original image. Experiments on several images under various noise strengths show that the proposed algorithm recovers image details such as edges, textures, and patterns while outperforming previous methods in terms of PSNR when removing additive Gaussian noise.

Clustering Technique Using Relevance of Data and Applied Algorithms (데이터와 적용되는 알고리즘의 연관성을 이용한 클러스터링 기법)

  • Han Woo-Yeon;Nam Mi-Young;Rhee PhillKyu
    • The KIPS Transactions:PartB / v.12B no.5 s.101 / pp.577-586 / 2005
  • Many algorithms have been proposed for face recognition, one of the most successful applications in the fields of image processing, pattern recognition, and computer vision. Recent research has examined which facial attributes make a target harder or easier to recognize. In this paper, we propose a method that improves recognition performance by exploiting the relevance between face data and the applied algorithms, because the recognition performance of each algorithm changes with facial attributes (illumination and expression). In the experiments, we use the n-tuple classifier, PCA, and the Gabor wavelet as recognition algorithms, and we propose three vectorization methods. First, we cluster the test data using the k-means algorithm and estimate the fitness of the three recognition algorithms for each cluster; we then compose new clusters by merging the clusters that select the same algorithm. We estimate the similarity of test data to the new clusters and recognize the target using the nearest cluster. As a result, recognition performance improves over that of a single algorithm without clustering.

Sequence-based Similar Music Retrieval Scheme (시퀀스 기반의 유사 음악 검색 기법)

  • Jun, Sang-Hoon;Hwang, Een-Jun
    • Journal of IKEEE / v.13 no.2 / pp.167-174 / 2009
  • Music evokes human emotions or creates moods through various low-level musical features. A typical music clip consists of one or more moods, and this can be used as an important criterion for determining the similarity between music clips. In this paper, we propose a new music retrieval scheme based on the mood change patterns of music clips. For this, we first divide music clips into segments based on low-level musical features. Then, we apply the K-means clustering algorithm to group the segments into clusters with similar features. By assigning a unique mood symbol to each cluster, we can represent each music clip as a sequence of mood symbols. Finally, to estimate the similarity of music clips, we measure the similarity of their mood sequences using the Longest Common Subsequence (LCS) algorithm. To evaluate the performance of our scheme, we carried out various experiments and measured user evaluations. We report some of the results.
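
The final comparison step described in this abstract can be illustrated with a minimal sketch. It assumes each clip has already been reduced to a sequence of mood symbols (one per segment, taken from its K-means cluster). The function names and the normalization by the longer sequence are illustrative assumptions, not details taken from the paper.

```python
from typing import Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two symbol sequences."""
    prev = [0] * (len(b) + 1)  # DP row for the previous symbol of `a`
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def mood_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """LCS length divided by the longer sequence length.

    The normalization is an assumption made for this sketch.
    """
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


# Hypothetical mood-symbol sequences for two clips.
clip_a = list("AABCCD")
clip_b = list("ABCDDA")
print(lcs_length(clip_a, clip_b))       # 4 (the subsequence A, B, C, D)
print(mood_similarity(clip_a, clip_b))  # 4 / 6
```

Normalizing by the longer sequence keeps the score in [0, 1] and penalizes clips of very different length; other normalizations are equally plausible.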

Nucleus Segmentation and Recognition of Uterine Cervical Pop-Smears using Region Growing Technique and Backpropagation Algorithm (영역 확장 기법과 오류 역전파 알고리즘을 이용한 자궁경부 세포진 영역 분할 및 인식)

  • Kim Kwang-Baek;Kim Sung-Shin
    • Journal of the Korea Institute of Information and Communication Engineering / v.10 no.6 / pp.1153-1158 / 2006
  • Classifying background and cell areas is an important research problem because the boundary between them is ambiguous. In this paper, the cell region is extracted from an image of uterine cervical cytodiagnosis using the region growing method, which expands the region of interest based on the similarity between pixels. The image segmented into background and cell areas is binarized using a threshold value, and an 8-directional contour tracking algorithm is then applied to extract the cell area. First, the extracted nucleus is converted back to the RGB colors of the original image. Second, the K-means clustering algorithm is employed to cluster the RGB pixels in the R, G, and B channels, respectively. Third, the hue information of the nucleus is extracted from the HSI model obtained by transforming the clustered values of the R, G, and B channels. The backpropagation algorithm is employed to classify the nucleus as normal or abnormal.
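
The region growing step mentioned in this abstract can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' implementation: the image is a grayscale grid, similarity is the absolute intensity difference from the seed pixel, and 4-connectivity is used. The name `region_grow` and the threshold value are hypothetical.

```python
from collections import deque
from typing import List, Set, Tuple

Pixel = Tuple[int, int]


def region_grow(image: List[List[int]], seed: Pixel, threshold: int) -> Set[Pixel]:
    """Grow a region from `seed`, adding 4-connected neighbors whose
    intensity differs from the seed intensity by at most `threshold`."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                # Similarity test (assumed): compare against the seed intensity.
                if abs(image[nr][nc] - seed_value) <= threshold:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region


# Toy image: a dark 2x2 block in the top-left corner, bright elsewhere.
toy_image = [
    [10, 12, 90],
    [11, 13, 95],
    [92, 94, 96],
]
print(sorted(region_grow(toy_image, seed=(0, 0), threshold=5)))
# [(0, 0), (0, 1), (1, 0), (1, 1)]
```

Comparing against the seed value keeps the region from drifting across a gradual boundary; comparing against the running region mean is a common alternative.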

Optimal Associative Neighborhood Mining using Representative Attribute (대표 속성을 이용한 최적 연관 이웃 마이닝)

  • Jung Kyung-Yong
    • Journal of the Institute of Electronics Engineers of Korea CI / v.43 no.4 s.310 / pp.50-57 / 2006
  • In electronic commerce, most recent personalized recommender systems apply the collaborative filtering technique. This method calculates similarity weights among users with similar preferences in order to predict and recommend items that match a user's tastes. The Pearson correlation coefficient is commonly used for this purpose. However, the correlation can be calculated only when there are items that both users have rated in common, and prediction accuracy falls accordingly. The similarity weight affects not only the prediction of items that match a user's tastes but also the performance of the personalized recommender system as a whole. In this study, we examine the behavior of the similarity weight when vector similarity, entropy, inverse user frequency, and default voting from the information retrieval field are applied, and we verify the improvement in prediction accuracy through an experiment. The results show that combining the entropy-based similarity weight with default voting gave the best performance.
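
The limitation described in this abstract, that the Pearson correlation is defined only over co-rated items, is easy to see in a small sketch. The function name, the dictionary-based rating format, and the choice to compute means over the co-rated items only are assumptions made for illustration.

```python
from math import sqrt
from typing import Dict, Optional


def pearson_similarity(u: Dict[str, float], v: Dict[str, float]) -> Optional[float]:
    """Pearson correlation between two users over their co-rated items.

    Returns None when the correlation is undefined: fewer than two
    co-rated items, or zero variance for either user on those items.
    """
    common = sorted(set(u) & set(v))
    if len(common) < 2:
        return None
    # Means over co-rated items only (one common variant; an assumption here).
    mean_u = sum(u[i] for i in common) / len(common)
    mean_v = sum(v[i] for i in common) / len(common)
    cov = sum((u[i] - mean_u) * (v[i] - mean_v) for i in common)
    var_u = sum((u[i] - mean_u) ** 2 for i in common)
    var_v = sum((v[i] - mean_v) ** 2 for i in common)
    if var_u == 0 or var_v == 0:
        return None
    return cov / sqrt(var_u * var_v)


# Hypothetical ratings keyed by item id.
alice = {"item1": 1.0, "item2": 2.0, "item3": 3.0}
bob = {"item1": 2.0, "item2": 4.0, "item3": 6.0, "item4": 1.0}
carol = {"item9": 5.0}
print(pearson_similarity(alice, bob))    # close to 1.0
print(pearson_similarity(alice, carol))  # None: no co-rated items
```

The `None` case is the sparse-data failure mode the abstract points to; techniques such as default voting fill in missing ratings so that more user pairs have a defined weight.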

Health Risk Management using Feature Extraction and Cluster Analysis considering Time Flow (시간흐름을 고려한 특징 추출과 군집 분석을 이용한 헬스 리스크 관리)

  • Kang, Ji-Soo;Chung, Kyungyong;Jung, Hoill
    • Journal of the Korea Convergence Society / v.12 no.1 / pp.99-104 / 2021
  • In this paper, we propose health risk management using feature extraction and cluster analysis that consider the flow of time. The proposed method proceeds in three steps. The first is the preprocessing and feature extraction step. It collects the user's lifelog using a wearable device; removes incomplete data, errors, noise, and contradictory data; and handles missing values. Then, for feature extraction, important variables are selected through principal component analysis, and related data are grouped using the correlation coefficient and covariance. To analyze the features extracted from the lifelog, dynamic clustering is performed with the K-means algorithm, taking the passage of time into account. New data are clustered using a similarity distance measure based on the increment of the sum of squared errors. The next step extracts information about each cluster while considering the passage of time. Using a health decision-making system built on the feature clusters, risks can then be managed through factors such as physical characteristics, lifestyle habits, disease status, risk of health care events, and predictability. The performance evaluation compares the proposed method with fuzzy and kernel-based clustering using precision, recall, and F-measure. In the evaluation, the proposed method performs well. Therefore, the proposed method makes it possible to accurately predict and appropriately manage a user's potential health risk by using the similarity with patients.
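
One way to read "clustering new data by the increment of the sum of squared errors" is sketched below. It relies on a standard identity: adding a point x to a cluster with n members and centroid c raises that cluster's SSE by n / (n + 1) · ||x − c||². Whether the paper uses exactly this rule is an assumption, and the function names are hypothetical.

```python
from typing import List, Sequence


def sse_increment(point: Sequence[float], centroid: Sequence[float], size: int) -> float:
    """Increase in a cluster's sum of squared errors if `point` joins it."""
    dist_sq = sum((p - c) ** 2 for p, c in zip(point, centroid))
    return size / (size + 1) * dist_sq


def assign_incrementally(
    point: Sequence[float], centroids: List[List[float]], sizes: List[int]
) -> int:
    """Assign `point` to the cluster with the smallest SSE increment,
    then update that cluster's centroid and size in place."""
    best = min(
        range(len(centroids)),
        key=lambda k: sse_increment(point, centroids[k], sizes[k]),
    )
    n = sizes[best]
    centroids[best] = [(n * c + p) / (n + 1) for c, p in zip(centroids[best], point)]
    sizes[best] = n + 1
    return best


# Two existing clusters with four members each (hypothetical values).
centroids = [[0.0, 0.0], [10.0, 10.0]]
sizes = [4, 4]
chosen = assign_incrementally([1.0, 1.0], centroids, sizes)
print(chosen, centroids[chosen], sizes)
```

Because the increment is weighted by n / (n + 1), small clusters absorb new points slightly more readily than large ones at the same distance, which differs from plain nearest-centroid assignment.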

A Study on Clustering of Core Competencies to Deploy in and Develop Courseworks for New Digital Technology (카드소팅을 활용한 디지털 신기술 과정 핵심역량 군집화에 관한 연구)

  • Ji-Woon Lee;Ho Lee;Joung-Huem Kwon
    • Journal of Practical Engineering Education / v.14 no.3 / pp.565-572 / 2022
  • Card sorting is a useful data collection method for understanding users' perceptions of the relationships between items. In general, it is an intuitive and cost-effective technique that is very useful for user research and evaluation. In this study, the core competencies of each field were used as competency cards in the next stage of card sorting for course development, and the clustering results were derived by applying the K-means algorithm to the sorting results. The competency clustering of the core competencies for each occupation in each field was verified based on Participant-Centric Analysis (PCA). For the core competency cards of each occupation, the number of participants who agreed with the clustering and the degree of card similarity were derived relative to the number of sorting participants.

Partial Discharge Data Analysis with Unsupervised Classification (무감독분류 기법에 의한 부분방전 데이터 분석)

  • Cho, Kyungsoon;Hong, Seonhack
    • Journal of Korea Society of Digital Industry and Information Management / v.14 no.4 / pp.9-16 / 2018
  • This study describes a partial discharge (PD) distribution analysis at the interface between XLPE (cross-linked polyethylene) and EPDM (ethylene propylene diene monomer) using unsupervised classification. The ${\phi}-q-n$ patterns were analyzed using phase-resolved partial discharge (PRPD). K-means cluster analysis forms clusters based on the similarities and distances among scattered individuals and analyzes the characteristics of the formed clusters; it is a statistical analysis that divides multivariate data into several groups according to the similarity of their characteristics, which makes the data easier to explore. It was confirmed that the phase angle of the cluster with the maximum discharge charge was concentrated around $0^{\circ}$ and $180^{\circ}$ at 30 kV, after the initial phase distribution localized around $90^{\circ}$ and $300^{\circ}$ expanded to the whole phase-angle range as the voltage rose. The Euclidean distance between the center of gravity and the discharge charge in the ${\phi}-q$ cluster increased with increasing applied voltage.

A study on the ordering of PIM family similarity measures without marginal probability (주변 확률을 고려하지 않는 확률적 흥미도 측도 계열 유사성 측도의 서열화)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.26 no.2 / pp.367-376 / 2015
  • Big data has become a hot keyword; it may be defined as a collection of data sets so large and complex that it becomes difficult to process with traditional methods. Clustering identifies the information in a large database by assigning a set of objects to clusters so that the objects in the same cluster are more similar to each other than to objects in other clusters. The similarity measures used in cluster analysis may be classified into various types depending on the nature of the data. In this paper, we compute upper and lower limits for similarity measures that are based on probabilistic interestingness measures and do not use marginal probabilities, such as the Yule I and II, Michael, Digby, Baulieu, and Dispersion measures. We also compare these measures using real data and a simulation experiment. According to Warrens (2008), coefficients that have the same quantities in the numerator and denominator, are bounded, and are close to each other in the ordering are likely to be more similar. Thus, results on bounds provide a means of classifying the various measures. Knowing which coefficients are similar also provides insight into the stability of a given algorithm.
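
As a concrete illustration of bounded similarity coefficients on a 2×2 contingency table with cell counts a, b, c, d, the sketch below computes the two coefficients commonly attributed to Yule, Q = (ad − bc) / (ad + bc) and Y = (√ad − √bc) / (√ad + √bc). Mapping these to the paper's "Yule I and II" is an assumption, and the function names are illustrative.

```python
from math import sqrt


def yule_q(a: float, b: float, c: float, d: float) -> float:
    """Yule's Q: (ad - bc) / (ad + bc), bounded in [-1, 1]."""
    denom = a * d + b * c
    if denom == 0:
        raise ValueError("Yule's Q is undefined when ad + bc = 0")
    return (a * d - b * c) / denom


def yule_y(a: float, b: float, c: float, d: float) -> float:
    """Yule's Y: (sqrt(ad) - sqrt(bc)) / (sqrt(ad) + sqrt(bc)), bounded in [-1, 1]."""
    root_ad, root_bc = sqrt(a * d), sqrt(b * c)
    denom = root_ad + root_bc
    if denom == 0:
        raise ValueError("Yule's Y is undefined when ad = bc = 0")
    return (root_ad - root_bc) / denom


# Hypothetical cell counts: a = both present, d = both absent,
# b and c = present in only one of the two objects.
print(yule_q(4, 1, 1, 4))  # 15 / 17
print(yule_y(4, 1, 1, 4))  # 0.6
```

Both coefficients depend only on the products ad and bc, share the same sign, and reach their limits of 1 and −1 when bc = 0 and ad = 0 respectively, which is the kind of bound comparison the abstract describes.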

SNA-based Trend Analysis of Naval Ship Maintenance

  • Yoo, Jung-Min;Yoon, Soung-woong;Lee, Sang-Hoon
    • Journal of the Korea Society of Computer and Information / v.24 no.6 / pp.165-174 / 2019
  • Naval ship maintenance raises various issues for effective maintenance methods and procedures, because ships are composed of numerous modules and systems, and manual-oriented maintenance requires well-trained technicians who are always busy with many other tasks. In this paper, we apply an SNA scheme to the service procedures and trends of ROK naval ships' equipment. Various SNA algorithms that offer many operating options are deployed, and we present analysis results that reveal considerable room for improvement for maintainers.