• Title/Summary/Keyword: k-means clustering method

Search Result 556, Processing Time 0.023 seconds

A Method of Detecting the Aggressive Driving of Elderly Driver (노인 운전자의 공격적인 운전 상태 검출 기법)

  • Koh, Dong-Woo;Kang, Hang-Bong
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.6 no.11
    • /
    • pp.537-542
    • /
    • 2017
  • Aggressive driving is a major cause of car accidents. Previous studies have mainly analyzed young driver's aggressive driving tendency, yet they were only done through pure clustering or classification technique of machine learning. However, since elderly people have different driving habits due to their fragile physical conditions, it is necessary to develop a new method such as enhancing the characteristics of driving data to properly analyze aggressive driving of elderly drivers. In this study, acceleration data collected from a smartphone of a driving vehicle is analyzed by a newly proposed ECA(Enhanced Clustering method for Acceleration data) technique, coupled with a conventional clustering technique (K-means Clustering, Expectation-maximization algorithm). ECA selects high-intensity data among the data of the cluster group detected through K-means and EM in all of the subjects' data and models the characteristic data through the scaled value. Using this method, the aggressive driving data of all youth and elderly experiment participants were collected, unlike the pure clustering method. We further found that the K-means clustering has higher detection efficiency than EM method. Also, the results of K-means clustering demonstrate that a young driver has a driving strength 1.29 times higher than that of an elderly driver. In conclusion, the proposed method of our research is able to detect aggressive driving maneuvers from data of the elderly having low operating intensity. The proposed method is able to construct a customized safe driving system for the elderly driver. In the future, it will be possible to detect abnormal driving conditions and to use the collected data for early warning to drivers.

Combined Artificial Bee Colony for Data Clustering (융합 인공벌군집 데이터 클러스터링 방법)

  • Kang, Bum-Su;Kim, Sung-Soo
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.40 no.4
    • /
    • pp.203-210
    • /
    • 2017
  • Data clustering is one of the most difficult and challenging problems and can be formally considered as a particular kind of NP-hard grouping problems. The K-means algorithm is one of the most popular and widely used clustering method because it is easy to implement and very efficient. However, it has high possibility to trap in local optimum and high variation of solutions with different initials for the large data set. Therefore, we need study efficient computational intelligence method to find the global optimal solution in data clustering problem within limited computational time. The objective of this paper is to propose a combined artificial bee colony (CABC) with K-means for initialization and finalization to find optimal solution that is effective on data clustering optimization problem. The artificial bee colony (ABC) is an algorithm motivated by the intelligent behavior exhibited by honeybees when searching for food. The performance of ABC is better than or similar to other population-based algorithms with the added advantage of employing fewer control parameters. Our proposed CABC method is able to provide near optimal solution within reasonable time to balance the converged and diversified searches. In this paper, the experiment and analysis of clustering problems demonstrate that CABC is a competitive approach comparing to previous partitioning approaches in satisfactory results with respect to solution quality. We validate the performance of CABC using Iris, Wine, Glass, Vowel, and Cloud UCI machine learning repository datasets comparing to previous studies by experiment and analysis. Our proposed KABCK (K-means+ABC+K-means) is better than ABCK (ABC+K-means), KABC (K-means+ABC), ABC, and K-means in our simulations.

Automatic Dynamic Range Improvement Method using Histogram Modification and K-means Clustering (히스토그램 변형 및 K-means 분류 기반 동적 범위 개선 기법)

  • Cha, Su-Ram;Kim, Jeong-Tae;Kim, Min-Seok
    • Journal of Broadcast Engineering
    • /
    • v.16 no.6
    • /
    • pp.1047-1057
    • /
    • 2011
  • In this paper, we propose a novel tone mapping method that implements histogram modification framework on two local regions that are classified using K-means clustering algorithm. In addition, we propose automatic parameter tuning method for histogram modification. The proposed method enhances local details better than the global histogram method. Moreover, the proposed method is fully automatic in the sense that it does not require intervention from human to tune parameters that are involved for computing tone mapping functions. In simulations and experimental studies, the proposed method showed better performance than existing histogram modification method.

Revising K-Means Clustering under Semi-Supervision

  • Huh Myung-Hoe;Yi SeongKeun;Lee Yonggoo
    • Communications for Statistical Applications and Methods
    • /
    • v.12 no.2
    • /
    • pp.531-538
    • /
    • 2005
  • In k-means clustering, we standardize variables before clustering and iterate two steps: units allocation by Euclidean sense and centroids updating. In applications to DB marketing where clusters are to be used as customer segments with similar consumption behaviors, we frequently acquire additional variables on the customers or the units through marketing campaigns a posteriori. Hence we need to modify the clusters originally formed after each campaign. The aim of this study is to propose a revision method of k-means clusters, incorporating added information by weighting clustering variables. We illustrate the proposed method in an empirical case.

Segmentation of Color Image by Subtractive and Gravity Fuzzy C-means Clustering (차감 및 중력 fuzzy C-means 클러스터링을 이용한 칼라 영상 분할에 관한 연구)

  • Jin, Young-Goun;Kim, Tae-Gyun
    • Journal of IKEEE
    • /
    • v.1 no.1 s.1
    • /
    • pp.93-100
    • /
    • 1997
  • In general, fuzzy C-means clustering method was used on the segmentation of true color image. However, this method requires number of clusters as an input. In this study, we suggest new method that uses subtractive and gravity fuzzy C-means clustering. We get number of clusters and initial cluster centers by applying subtractive clustering on color image. After coarse segmentation of the image, we apply gravity fuzzy C-means for optimizing segmentation of the image. We show efficiency of the proposed algorithm by qualitative evaluation.

  • PDF

An Improved Automated Spectral Clustering Algorithm

  • Xiaodan Lv
    • Journal of Information Processing Systems
    • /
    • v.20 no.2
    • /
    • pp.185-199
    • /
    • 2024
  • In this paper, an improved automated spectral clustering (IASC) algorithm is proposed to address the limitations of the traditional spectral clustering (TSC) algorithm, particularly its inability to automatically determine the number of clusters. Firstly, a cluster number evaluation factor based on the optimal clustering principle is proposed. By iterating through different k values, the value corresponding to the largest evaluation factor was selected as the first-rank number of clusters. Secondly, the IASC algorithm adopts a density-sensitive distance to measure the similarity between the sample points. This rendered a high similarity to the data distributed in the same high-density area. Thirdly, to improve clustering accuracy, the IASC algorithm uses the cosine angle classification method instead of K-means to classify the eigenvectors. Six algorithms-K-means, fuzzy C-means, TSC, EIGENGAP, DBSCAN, and density peak-were compared with the proposed algorithm on six datasets. The results show that the IASC algorithm not only automatically determines the number of clusters but also obtains better clustering accuracy on both synthetic and UCI datasets.

Tree-structured Clustering for Continuous Data (연속형 자료에 대한 나무형 군집화)

  • Huh Myung-Hoe;Yang Kyung-Sook
    • The Korean Journal of Applied Statistics
    • /
    • v.18 no.3
    • /
    • pp.661-671
    • /
    • 2005
  • The aim of this study is to propose a clustering method, called tree-structured clustering, by recursively partitioning continuous multivariate dat a based on overall $R^2$ criterion with a practical node-splitting decision rule. The clustering method produces easily interpretable clustering rules of tree types with the variable selection function. In numerical examples (Fisher's iris data and a Telecom case), we note several differences between tree-structured clustering and K-means clustering.

Development of a Clustering Model for Automatic Knowledge Classification (지식 분류의 자동화를 위한 클러스터링 모형 연구)

  • 정영미;이재윤
    • Journal of the Korean Society for information Management
    • /
    • v.18 no.2
    • /
    • pp.203-230
    • /
    • 2001
  • The purpose of this study is to develop a document clustering model for automatic classification of knowledge. Two test collections of newspaper article texts and journal article abstracts are built for the clustering experiment. Various feature reduction criteria as well as term weighting methods are applied to the term sets of the test collections, and cosine and Jaccard coefficients are used as similarity measures. The performances of complete linkage and K-means clustering algorithms are compared using different feature selection methods and various term weights. It was found that complete linkage clustering outperforms K-means algorithm and feature reduction up to almost 10% of the total feature sets does not lower the performance of document clustering to any significant extent.

  • PDF

Nonparametric analysis of income distributions among different regions based on energy distance with applications to China Health and Nutrition Survey data

  • Ma, Zhihua;Xue, Yishu;Hu, Guanyu
    • Communications for Statistical Applications and Methods
    • /
    • v.26 no.1
    • /
    • pp.57-67
    • /
    • 2019
  • Income distribution is a major concern in economic theory. In regional economics, it is often of interest to compare income distributions in different regions. Traditional methods often compare the income inequality of different regions by assuming parametric forms of the income distributions, or using summary statistics like the Gini coefficient. In this paper, we propose a nonparametric procedure to test for heterogeneity in income distributions among different regions, and a K-means clustering procedure for clustering income distributions based on energy distance. In simulation studies, it is shown that the energy distance based method has competitive results with other common methods in hypothesis testing, and the energy distance based clustering method performs well in the clustering problem. The proposed approaches are applied in analyzing data from China Health and Nutrition Survey 2011. The results indicate that there are significant differences among income distributions of the 12 provinces in the dataset. After applying a 4-means clustering algorithm, we obtained the clustering results of the income distributions in the 12 provinces.

Clustering of Decision Making Units using DEA (DEA를 이용한 의사결정단위의 클러스터링)

  • Kim, Kyeongtaek
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.37 no.4
    • /
    • pp.239-244
    • /
    • 2014
  • The conventional clustering approaches are mostly based on minimizing total dissimilarity of input and output. However, the clustering approach may not be helpful in some cases of clustering decision making units (DMUs) with production feature converting multiple inputs into multiple outputs because it does not care converting functions. Data envelopment analysis (DEA) has been widely applied for efficiency estimation of such DMUs since it has non-parametric characteristics. We propose a new clustering method to identify groups of DMUs that are similar in terms of their input-output profiles. A real world example is given to explain the use and effectiveness of the proposed method. And we calculate similarity value between its result and the result of a conventional clustering method applied to the example. After the efficiency value was added to input of K-means algorithm, we calculate new similarity value and compare it with the previous one.