• Title/Abstract/Keywords: K-Means Clustering

Development of an unsupervised learning-based ESG evaluation process for Korean public institutions without label annotation

  • Do Hyeok Yoo;SuJin Bak
    • 한국컴퓨터정보학회논문지 / Vol. 29, No. 5 / pp.155-164 / 2024
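  • This study proposes an unsupervised-learning clustering model that estimates ESG ratings for Korean public institutions for which no ESG rating is provided. To this end, we compared the optimal number of clusters under spectral clustering and k-means clustering, and computed the Davies-Bouldin Index (DBI), a performance metric, to ensure the reliability of the results. Spectral clustering and k-means clustering yielded DBI values of 0.734 and 1.715, respectively; because a smaller value indicates better performance, this confirms the superiority of spectral clustering. In addition, t-tests and ANOVA revealed statistically significant differences among the ESG non-financial data, and correlation coefficients were used to examine the correlations among ESG items. Based on these results, the study suggests that the ESG performance ranking of each public institution can be estimated without existing ESG ratings. This is achieved by computing the optimal number of clusters and then determining the sum of the mean ESG data values within each cluster. The proposed model can therefore serve as a basis for evaluating the ESG ratings of various Korean public institutions and is expected to be useful for sustainability management practice and performance management in Korea.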
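
The abstract above ranks the two clusterings by their Davies-Bouldin Index. As a rough illustration of how that index is computed, here is a minimal pure-Python sketch; the function names and the toy points are assumptions made for illustration and are not taken from the paper, which does not describe its implementation.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def davies_bouldin(points, labels):
    """Davies-Bouldin Index; lower means more compact, better separated clusters.

    Assumes at least two clusters with distinct centroids.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    keys = sorted(clusters)
    cents = {k: centroid(clusters[k]) for k in keys}
    # Scatter S_i: mean distance from each member to its cluster centroid.
    scatter = {
        k: sum(math.dist(p, cents[k]) for p in clusters[k]) / len(clusters[k])
        for k in keys
    }
    total = 0.0
    for i in keys:
        # Worst-case similarity of cluster i to any other cluster j.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in keys if j != i
        )
    return total / len(keys)

# Toy data (hypothetical): two tight groups that are far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = [0, 0, 1, 1]  # labels that follow the geometry
bad = [0, 1, 0, 1]   # labels that cut across the geometry

print(davies_bouldin(points, good), davies_bouldin(points, bad))
```

On the toy data the labelling that follows the geometry scores lower than the one that cuts across it, which is the same "smaller is better" reading the abstract applies to 0.734 versus 1.715.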

머신러닝을 이용한 앉은 자세 분류 연구 (A Study on Sitting Posture Recognition using Machine Learning)

  • 마상용;홍상표;심현민;권장우;이상민
    • 전기학회논문지 / Vol. 65, No. 9 / pp.1557-1563 / 2016
  • Recent studies have shown that poor sitting posture can lead to a variety of spinal disorders, so it is important to measure sitting posture. We propose a strategy for classifying sitting posture using machine learning. Acceleration data were acquired from a single tri-axial accelerometer attached to the back of the subject's neck in five types of sitting posture. Six subjects without any spinal disorder participated in the experiment. The acceleration data were transformed into feature vectors by principal component analysis. A support vector machine (SVM) and K-means clustering were then used to classify sitting posture from the transformed feature vectors. To evaluate performance, we calculated the correct rate for each classification strategy. Although the correct rate of SVM in the sitting back arch posture was 2.0% lower than that of K-means clustering, SVM's correct rate was higher by 1.3%, 5.2%, 16.6%, and 7.1% in the normal posture, sitting front arch, sitting cross-legged, and sitting leaning right postures, respectively. Overall, the correct rates were 94.5% for SVM and 88.84% for K-means clustering, which indicates that SVM has an advantage over K-means for classifying sitting posture.

K-MEANS CLUSTERING 기반 영상의 공간 해상도 축소 변환을 위한 효율적 움직임 벡터 재예측 방법 (Efficient Motion Re-Estimation Method Based on K-Means Clustering for Spatial Resolution Reduction Transcoding)

  • 김경환;정진우;최윤식
    • 한국방송∙미디어공학회:학술대회논문집 / 2011 Summer Conference of the Korean Society of Broadcast Engineers / pp.567-569 / 2011
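  • Video is now consumed in a variety of formats and on a variety of devices, and fast video transcoding is needed to meet this practical demand. We propose a new motion vector re-estimation method for resolution-reduction transcoding. To determine the motion vector of a block in the downscaled picture, multiple candidate motion vectors are derived by K-means clustering from the two or more motion vectors at the corresponding position in the original picture, and the candidate with the minimum sum of absolute differences is selected as the motion vector for the block in the downscaled picture. In experiments, the method required about 9% of the computation time of compression performed without transcoding, and in terms of compression efficiency the BR-RATE increased by about 17, which is 60% of the increase observed with the conventional method.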
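
The selection step described above (cluster the original motion vectors into candidates, then keep the candidate with the smallest sum of absolute differences) might look roughly like the following sketch. Everything beyond that outline is an assumption made for illustration: the seeding rule, halving the centroids for a 2:1 downscale, the frame layout as nested lists, and all function names.

```python
import math

def kmeans(vectors, k, iters=10):
    """Plain Lloyd iterations; seeds are the first k distinct vectors (assumption)."""
    centers = list(dict.fromkeys(vectors))[:k]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for v in vectors:
            nearest = min(range(len(centers)), key=lambda i: math.dist(v, centers[i]))
            groups[nearest].append(v)
        # An empty group keeps its previous center.
        centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers

def sad(cur, ref, x, y, mv, size):
    """Sum of absolute differences between a block in cur and its displaced match in ref."""
    dx, dy = mv
    return sum(
        abs(cur[y + r][x + c] - ref[y + r + dy][x + c + dx])
        for r in range(size) for c in range(size)
    )

def reestimate(original_mvs, cur, ref, x, y, size, k=2):
    """Cluster the original vectors, halve the centroids, keep the lowest-SAD candidate."""
    # Halving assumes a 2:1 downscale in each dimension (not stated in the abstract).
    candidates = {(round(cx / 2), round(cy / 2)) for cx, cy in kmeans(original_mvs, k)}
    return min(candidates, key=lambda mv: sad(cur, ref, x, y, mv, size))
```

Here `cur` and `ref` are the downscaled current and reference frames as lists of rows, and the caller is expected to keep every displaced block inside the frame. The point of the design is that SAD is evaluated only for the few clustered candidates rather than over a full search window.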

Reinterpretation of Multiple Correspondence Analysis using the K-Means Clustering Analysis

  • Choi, Yong-Seok;Hyun, Gee Hong;Kim, Kyung Hee
    • Communications for Statistical Applications and Methods / Vol. 9, No. 2 / pp.505-514 / 2002
  • Multiple correspondence analysis graphically shows the correspondence relationships among categories in multi-way contingency tables. It is well known that the principal inertias account for a low proportion of the total inertia in multiple correspondence analysis. Although this problem can be overcome by using the Benzecri formula, that is not enough to show clear correspondence relationships among categories (Greenacre and Blasius, 1994, Chapter 10). In addition, they show that Andrews' plot is useful in revealing the correspondence relationships among categories. However, this method also fails to give a concise interpretation of the categories when the number of categories is large. Therefore, in this study, we make multiple correspondence analysis easier to interpret by applying K-means clustering analysis.

A Study on the Gustafson-Kessel Clustering Algorithm in Power System Fault Identification

  • Abdullah, Amalina;Banmongkol, Channarong;Hoonchareon, Naebboon;Hidaka, Kunihiko
    • Journal of Electrical Engineering and Technology / Vol. 12, No. 5 / pp.1798-1804 / 2017
  • This paper evaluates the performance of the Gustafson-Kessel (GK) clustering algorithm in fault identification on power transmission lines. The clustering algorithm is incorporated in a scheme that uses a hybrid intelligent technique combining an artificial neural network and a fuzzy inference system, known as an adaptive neuro-fuzzy inference system (ANFIS). The scheme identifies the type of fault that occurs on a power transmission line: single line to ground, double line, double line to ground, or three phase. The scheme is also capable of analyzing the fault location without information on line parameters. The estimation error ranges from 0.10 to 0.85 across five values of fault resistance. The paper also compares the performance of the GK clustering algorithm with that of fuzzy c-means (FCM) clustering, particularly in structuring data. Results show that the GK algorithm can be applied to fault identification on power transmission systems and performs better than FCM.

Pattern Analysis and Performance Comparison of Lottery Winning Numbers

  • Jung, Yong Gyu;Han, Soo Ji;kim, Jae Hee
    • International Journal of Internet, Broadcasting and Communication / Vol. 6, No. 1 / pp.16-22 / 2014
  • Clustering methods such as k-means and EM belong to the family of classification and pattern recognition techniques and are widely used in management science and literature search. In this paper, the performance of the k-means and EM algorithms is compared using Weka. The experiments use 567 sets of winning lottery numbers. The k-means algorithm is superior in processing speed, running about 0.08 seconds faster than the EM algorithm. In terms of accuracy, precision, and recall, however, the EM algorithm performs better than the k-means algorithm. While k-means is known to be sensitive to the distribution of the data, the EM algorithm is sensitive to probability in clustering.

Normal Mixture Model with General Linear Regressive Restriction: Applied to Microarray Gene Clustering

  • Kim, Seung-Gu
    • Communications for Statistical Applications and Methods / Vol. 14, No. 1 / pp.205-213 / 2007
  • In this paper, a normal mixture model whose component means are subject to a general linear restriction based on linear regression is proposed, and a fitting method using the EM algorithm and Lagrange multipliers is provided. The model is applied to gene clustering of microarray expression data, where it performs very well on a real data set. The model also allows the analyst to obtain the desired clusters by expressing the hypothesis about the component means through design matrices and linear restriction matrices.

퍼지 Clustering 알고리즘을 이용한 휘발성 화학물질의 분류 (Classification of Volatile Chemicals using Fuzzy Clustering Algorithm)

  • 변형기;김갑일
    • 대한전기학회:학술대회논문집 / 1996 Summer Conference of the Korean Institute of Electrical Engineers, Proceedings B / pp.1042-1044 / 1996
  • Fuzzy theory in pattern recognition may be applicable to the classification and recognition of gases and odours. This paper reports results obtained by applying fuzzy c-means algorithms to patterns generated by an odour sensing system that uses an array of conducting polymer sensors for volatile chemicals. Three unsupervised fuzzy c-means algorithms were applied to the volatile chemical clustering problem. Among these pattern clustering methods, the FCMAW algorithm, which updates the cluster centres more frequently, consistently outperformed the others. Experimental trials confirmed it as an outstanding clustering algorithm.
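
For readers unfamiliar with fuzzy c-means, the following minimal sketch shows the two alternating updates of the standard algorithm: memberships from distances, then centres from membership-weighted means. It is the textbook form only; the FCMAW variant named in the abstract is not reproduced here because the abstract gives no details beyond its more frequent centre updates. The function names and the fuzzifier default `m=2.0` are assumptions.

```python
import math

def memberships(points, centres, m=2.0):
    """Standard fuzzy c-means membership update with fuzzifier m > 1."""
    table = []
    for p in points:
        dists = [math.dist(p, c) for c in centres]
        if 0.0 in dists:
            # A point sitting exactly on one or more centres is shared among them.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            row = [h / sum(hits) for h in hits]
        else:
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            row = [v / sum(inv) for v in inv]
        table.append(row)
    return table

def update_centres(points, table, m=2.0):
    """Each centre is the mean of all points weighted by membership ** m."""
    centres = []
    for j in range(len(table[0])):
        w = [row[j] ** m for row in table]
        centres.append(tuple(
            sum(wi * p[d] for wi, p in zip(w, points)) / sum(w)
            for d in range(len(points[0]))
        ))
    return centres

# Hypothetical one-dimensional sensor readings and initial centres.
pts = [(0.0,), (1.0,), (9.0,), (10.0,)]
cs = [(0.0,), (10.0,)]
for _ in range(5):
    u = memberships(pts, cs)
    cs = update_centres(pts, u)
print(cs)
```

Unlike k-means, every point contributes to every centre, with a weight that falls off quickly with distance.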

공간 탐색 최적화 알고리즘을 이용한 K-Means 클러스터링 기반 다항식 방사형 기저 함수 신경회로망: 설계 및 비교 해석 (K-Means-Based Polynomial-Radial Basis Function Neural Network Using Space Search Algorithm: Design and Comparative Studies)

  • 김욱동;오성권
    • 제어로봇시스템학회논문지 / Vol. 17, No. 8 / pp.731-738 / 2011
  • In this paper, we introduce an advanced architecture of K-Means clustering-based polynomial Radial Basis Function Neural Networks (p-RBFNNs) designed with the aid of SSOA (Space Search Optimization Algorithm) and develop a comprehensive design methodology supporting their construction. To design the optimized p-RBFNNs, the center value of each receptive field is first determined by running the K-Means clustering algorithm, and then the center value and the width of the corresponding receptive field are optimized through SSOA. The connections (weights) of the proposed p-RBFNNs are of functional character and are realized by considering three types of polynomials. In addition, WLSE (Weighted Least Square Estimation) is used to estimate the coefficients of the polynomials (serving as functional connections of the network) of each node from the output node. As a result, the local learning capability and the interpretability of the proposed model are improved. The proposed model is illustrated using a nonlinear function and NOx, referred to as a Machine Learning dataset. A comparative analysis reveals that the proposed model exhibits higher accuracy and superior predictive capability in comparison with some previous models available in the literature.

새떼 이동의 모방에 의한 k-평균 군집 속도의 향상 (Enhancement of the k-Means Clustering Speed by Emulation of Birds' Motion in Flock)

  • 이창영
    • 한국전자통신학회논문지 / Vol. 9, No. 9 / pp.965-970 / 2014
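  • To improve the convergence speed of k-means clustering, we introduce the concept of bird flock motion. A characteristic of this motion is that each bird follows its nearest neighbor. We exploit this characteristic in the clustering process: once the class of a vector is determined, the same class is assigned to several nearby vectors. Experiments show that the number of iterations required for clustering to terminate is significantly smaller than with the conventional method. Moreover, the time taken by a single iteration was more than 5% shorter. When clustering quality was assessed by the accumulated distance between vectors and centroids, there was almost no difference between the proposed method and the conventional method. In conclusion, the proposed method performs clustering in less computation time without loss of quality.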
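
One way to read the idea above is as a change to the k-means assignment step: a vector that has just been labelled hands its label to a few of its nearest unlabelled neighbours, which then skip their own centroid distance computations. The sketch below is only an interpretation under that assumption; the number of neighbours, the precomputed neighbour lists, and all function names are hypothetical and not taken from the paper.

```python
import math

def nearest_neighbours(vectors, n):
    """For each vector, indices of its n closest other vectors (computed once, naively)."""
    out = []
    for i, v in enumerate(vectors):
        others = sorted(
            (j for j in range(len(vectors)) if j != i),
            key=lambda j: math.dist(v, vectors[j]),
        )
        out.append(others[:n])
    return out

def flock_assign(vectors, centroids, neighbours):
    """One assignment pass: a labelled vector passes its class to unlabelled neighbours."""
    labels = [None] * len(vectors)
    for i, v in enumerate(vectors):
        if labels[i] is not None:
            continue  # already labelled by a neighbour, so no centroid distances needed
        labels[i] = min(range(len(centroids)), key=lambda c: math.dist(v, centroids[c]))
        for j in neighbours[i]:
            if labels[j] is None:
                labels[j] = labels[i]
    return labels

def update_centroids(vectors, labels, centroids):
    """Standard k-means centroid update; an empty cluster keeps its old centroid."""
    new = []
    for c, old in enumerate(centroids):
        members = [v for v, lab in zip(vectors, labels) if lab == c]
        new.append(tuple(sum(x) / len(members) for x in zip(*members)) if members else old)
    return new

# Hypothetical data: two well-separated groups of three vectors each.
vectors = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids = [(0, 0), (10, 10)]
labels = flock_assign(vectors, centroids, nearest_neighbours(vectors, 2))
print(labels, update_centroids(vectors, labels, centroids))
```

In this toy run only two of the six vectors need centroid distance computations; the other four inherit their labels. The quadratic neighbour search is kept naive for brevity and would itself need an efficient implementation for the speed-up to pay off.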