• Title/Summary/Keyword: k-means clustering analysis

A Major DNA Marker Mining of BMS941 Microsatellite Locus in Hanwoo Chromosome 17

  • Lee, Jea-Young;Lee, Yong-Won
    • Journal of the Korean Data and Information Science Society / v.16 no.4 / pp.913-921 / 2005
  • We describe tests for detecting and locating quantitative trait loci (QTL) for traits in Hanwoo, using LOD scores and a permutation test. Based on the results of the permutation test for QTL detection, we select the major DNA markers of the BMS941 microsatellite locus in Hanwoo chromosome 17 for further analysis. K-means clustering analysis applied to four traits and eight DNA markers in BMS941 resulted in three cluster groups. We conclude that the major DNA markers of the BMS941 microsatellite locus in Hanwoo chromosome 17 are the 80bp, 85bp, 90bp, and 105bp markers.

Clustering Validity of Social Network Subgroup Using Attribute Similarity (속성유사도에 따른 사회연결망 서브그룹의 군집유효성)

  • Yoon, Han-Seong
    • Journal of Korea Society of Digital Industry and Information Management / v.17 no.1 / pp.75-84 / 2021
  • In big data analysis, social networks are increasingly used through relational data, which describe the connections between entities such as people and objects. When relational data do not exist directly, a social network can be constructed by computing relational data, such as attribute similarity, from the attribute data of the entities and using it as links. This paper suggests a method for constructing a social network that uses the attribute similarity between entities as the connection relationship, along with a clustering method that uses subgroups of the constructed network, and evaluates the clustering effectiveness of the results. The analysis results can vary depending on the type and characteristics of the data to be analyzed, the type of attribute similarity selected, and the criterion value. In addition, the clustering effectiveness may not be consistent across evaluation methods, so selections and experiments are necessary to obtain better analysis results. This dependence on the analysis target and the clustering options is a limitation of the approach. For performance evaluation, a further study is needed to compare the method of this paper with conventional methods such as k-means.

Major DNA Marker Mining of Hanwoo Chromosome 6 by Bootstrap Method

  • Lee, Jea-Young;Lee, Yong-Won
    • Communications for Statistical Applications and Methods / v.11 no.3 / pp.657-668 / 2004
  • A permutation test was applied for QTL (quantitative trait loci) analysis, and a major locus was selected. K-means clustering analysis is described for mining the major DNA markers of the ILSTS035 microsatellite locus in Hanwoo chromosome 6. Finally, a bootstrap testing method was adapted to calculate confidence intervals and to find the major DNA markers.
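
As a rough illustration of the bootstrap confidence intervals mentioned in this abstract, the following is a minimal percentile-bootstrap sketch. It is not the authors' procedure: the function name `bootstrap_ci`, its parameters, and the sample values are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_ci(data, stat=statistics.mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Resample the data with replacement n_boot times and recompute the statistic.
    estimates = sorted(
        stat([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    # Read the lower and upper percentiles off the sorted bootstrap estimates.
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical trait measurements, used only to exercise the function.
sample = list(range(1, 21))
lower, upper = bootstrap_ci(sample)
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")
```

The percentile method is only one of several bootstrap interval constructions; the abstract does not say which variant the paper uses.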

Sample Based Algorithm for k-Spatial Medians Clustering

  • Jin, Seo-Hoon;Jung, Byoung-Cheol
    • The Korean Journal of Applied Statistics / v.23 no.2 / pp.367-374 / 2010
  • As an alternative to k-means clustering, k-spatial medians clustering has many merits that stem from the advantages of the spatial median. However, it has not been widely used because it requires heavy computation. When the number of objects and the number of variables are large, the computation time becomes a serious problem. In this study, we propose a fast algorithm for k-spatial medians clustering. The practical applicability of the algorithm is shown through numerical studies.
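
To make the computational cost mentioned in this abstract concrete, here is a minimal sketch of the spatial (geometric) median of one cluster, computed with Weiszfeld's iteration. This is a standard textbook technique and not the sample-based algorithm the paper proposes; the function name `spatial_median`, the tolerances, and the example points are assumptions for illustration. Skipping points that coincide with the current estimate is a simplification that avoids division by zero.

```python
import math


def spatial_median(points, tol=1e-9, max_iter=1000):
    """Approximate the spatial median of a list of points via Weiszfeld's iteration."""
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    est = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    for _ in range(max_iter):
        weighted_sum = [0.0] * dim
        weight_total = 0.0
        for p in points:
            dist = math.dist(p, est)
            if dist < 1e-12:
                continue  # simplification: skip points sitting on the estimate
            w = 1.0 / dist  # closer points get larger weights
            weight_total += w
            for d in range(dim):
                weighted_sum[d] += w * p[d]
        if weight_total == 0.0:
            return est  # every point coincides with the estimate
        new_est = [v / weight_total for v in weighted_sum]
        if math.dist(new_est, est) < tol:
            return new_est
        est = new_est
    return est


# Hypothetical data: four points on a unit square plus one distant outlier.
pts = [(0, 0), (1, 0), (0, 1), (1, 1), (100, 100)]
print(spatial_median(pts))  # stays near the square, unlike the mean (20.4, 20.4)
```

Unlike the mean, the spatial median has no closed form, so every centre update needs an iterative loop like this one over all cluster members, which is why the method becomes expensive for large data sets.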

Performance evaluation of principal component analysis for clustering problems

  • Kim, Jae-Hwan;Yang, Tae-Min;Kim, Jung-Tae
    • Journal of Advanced Marine Engineering and Technology / v.40 no.8 / pp.726-732 / 2016
  • Clustering analysis is widely used in data mining to classify data into categories on the basis of their similarity. Over the decades, many clustering techniques have been developed, including hierarchical and non-hierarchical algorithms. In gene profiling problems, because of the large number of genes and the complexity of biological networks, dimensionality reduction techniques are critical exploratory tools for clustering analysis of gene expression data. Recently, clustering analysis that applies dimensionality reduction techniques has also been proposed. PCA (principal component analysis) is a popular dimensionality reduction method for clustering problems. However, previous studies analyzed the performance of PCA only on full data sets. In this paper, to evaluate the performance of PCA for clustering analysis specifically and robustly, we use an improved FCBF (fast correlation-based filter) feature selection method on supervised clustering data sets and employ two well-known clustering algorithms: k-means and k-medoids. Computational results on the supervised data sets show that the performance of PCA is very poor for large-scale features.

An Empirical Analysis Approach to Investigating Effectiveness of the PSO-based Clustering Method for Scholarly Papers Supported by the Research Grant Projects (개선된 PSO방법에 의한 학술연구조성사업 논문의 효과적인 분류 방법과 그 효과성에 관한 실증분석)

  • Lee, Kun-Chang;Seo, Young-Wook;Lee, Dae-Sung
    • Knowledge Management Research / v.10 no.4 / pp.17-30 / 2009
  • This study proposes a new clustering algorithm for evaluating the value of papers supported by research grants from the Korea Research Fund (KRF). The algorithm is based on an extended version of the conventional PSO (Particle Swarm Optimization) mechanism; specifically, it integrates the k-means algorithm and a simulated annealing mechanism, and is named KASA-PSO. To evaluate the robustness of KASA-PSO, its clustering results were assessed by research grant experts working at KRF. Empirical results revealed that the proposed KASA-PSO clustering method yields better results than conventional clustering methods.

Analysis of Partial Discharge Pattern of Closed Switchgear using K-means Clustering (K-means 군집화 기법을 이용한 개폐장치의 부분방전 패턴 해석)

  • Byun, Doo-Gyoon;Kim, Weon-Jong;Lee, Kang-Won;Hong, Jin-Woong
    • Journal of the Korean Institute of Electrical and Electronic Material Engineers / v.20 no.10 / pp.901-906 / 2007
  • In this study, we measured the partial discharge phenomenon inside closed switchgear using an ultra-wideband antenna. The $\Phi-q-n$ characteristics in the normal state were stable and confirmed to be less than 0.01, whereas in progressing states they were about 2 times larger, and in the abnormal state they were hundreds of times larger than in the normal state. According to the K-means analysis, if the discharge characteristics follow a straight line with a slope close to 0 and the standard deviation is small, the switchgear is in a normal state. However, if a peak is found in the K-means clusters and the standard deviation is large, it is in an abnormal state.

A Study on Sitting Posture Recognition using Machine Learning (머신러닝을 이용한 앉은 자세 분류 연구)

  • Ma, Sangyong;Hong, Sangpyo;Shim, Hyeon-min;Kwon, Jang-Woo;Lee, Sangmin
    • The Transactions of The Korean Institute of Electrical Engineers / v.65 no.9 / pp.1557-1563 / 2016
  • Recent studies have shown that poor sitting posture can lead to a variety of spinal disorders, so it is important to measure sitting posture. We propose a strategy for classifying sitting posture using machine learning. Acceleration data were collected from a single tri-axial accelerometer attached to the back of the subject's neck in five types of sitting posture. Six subjects without any spinal disorder participated in the experiment. The acceleration data were transformed into feature vectors by principal component analysis. A support vector machine (SVM) and K-means clustering were used to classify sitting posture from the transformed feature vectors. To evaluate performance, we calculated the correct rate for each classification strategy. Although the correct rate of SVM for sitting back arch was lower than that of K-means clustering by 2.0%, SVM's correct rate was higher by 1.3%, 5.2%, 16.6%, and 7.1% for normal posture, sitting front arch, sitting cross-legged, and sitting leaning right, respectively. Overall, the correct rates were 94.5% for SVM and 88.84% for K-means clustering, which indicates that SVM is better suited than K-means for classifying sitting posture.

Document Clustering Technique by K-means Algorithm and PCA (주성분 분석과 k 평균 알고리즘을 이용한 문서군집 방법)

  • Kim, Woosaeng;Kim, Sooyoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.3 / pp.625-630 / 2014
  • The amount of information is increasing rapidly with the development of the internet and computers. Since this enormous amount of information is managed in the form of documents, it is necessary to search and process them efficiently. Document clustering, which groups related documents based on the similarity between them, helps to classify, search, and process large numbers of documents automatically. This paper proposes a method that finds the initial seed points through principal component analysis when documents represented as vectors in the feature vector space are clustered by the K-means algorithm, in order to increase clustering performance. Experiments show that our method performs better than the traditional K-means algorithm.

Correlation Analysis between Injury Index of Multi-cell Headrest through k-means Clustering DB (k-means clustering DB를 통한 Multi-cell headrest의 상해지수 간 상관관계 분석)

  • Sungwook Cho;Seong S. Cheon
    • Composites Research / v.37 no.1 / pp.46-52 / 2024
  • Advances in transportation have improved convenience and expanded the travel radius of people with disabilities who have difficulty moving. However, the safety of a WAV (Wheelchair Accessible Vehicle) passenger in a vehicle accident is still lower than that of passengers in regular seats. In particular, a rear-end collision, which can occur while the passenger is defenseless, can cause fatal neck injuries to disabled passengers. Therefore, a more detailed design must be reflected in the headrest applied to a WAV. In this study, a multi-cell headrest was proposed to implement a local distribution of compression characteristics in the headrest during a rear-end collision of a WAV. A data set was then constructed through analysis and clustered with k-means, and a correlation analysis was performed between the passenger's NIC (Neck Injury Criterion) and impact energy absorption. The clustering confirmed that data clusters with similar characteristics were formed, and the correlation between NIC and impact energy absorption was analyzed through the characteristics of each cluster. The analysis confirmed that the softer the cell compression characteristics in Mid3 and Mid6, the greater the impact energy absorption, and the harder the cell compression characteristics in Front2, Mid3, and Mid6, the more effective the headrest is in reducing NIC.