Robust K-means for Global Optimization

Si-Hwan Jang; Joon Lee; Jae-Hyeon Eom; Sung-Soo Kim

doi:10.22805/JIT.2024.44.1.017

Journal of Industrial Technology (산업기술연구)

Volume 44 Issue 1 / Pages 17-23 / 2024 / 1229-9588 (pISSN) / 1598-1371 (eISSN)

Kangwon National University, Institute of Industrial Technology (강원대학교 산업기술연구소)

전역 최적화를 위한 강건한 K-means

Si-Hwan Jang (ETRI) ;
Joon Lee (Division of Energy Resource and Industrial Engineering, Kangwon National University) ;
Jae-Hyeon Eom (Division of Energy Resource and Industrial Engineering, Kangwon National University) ;
Sung-Soo Kim (Division of Energy Resource and Industrial Engineering, Kangwon National University)

Received : 2024.05.31
Accepted : 2024.11.18
Published : 2024.12.31

https://doi.org/10.22805/JIT.2024.44.1.017

Abstract

K-means is a popular and efficient data clustering method and one of the most important techniques in data mining. Because it is a hill-climbing method, K-means is sensitive to initialization and can become stuck in a local optimum. We therefore need a robust K-means (RK-means) that not only reduces this possibility but also increases the probability of finding the globally optimal clustering solution. The objective of this paper is to propose RK-means, which starts from the best initial solution chosen among good candidate solutions, each with good central data for every cluster. The central data of each cluster are selected by roulette-wheel probabilistic selection using the sum of the relative distance rates of each data point. Existing K-medoid initialization methods have a problem with high-density data because they deterministically select the central data for just one initial solution. The proposed initial solution is a good starting point from which K-means can find a robust solution with a reduced possibility of being stuck in local optima. The performance of the proposed RK-means is validated against the original K-means through experiments and analysis on machine learning repository datasets (Iris, Wine, Glass, Vowel, Cloud). Our simulations show that RK-means using the probabilistic relative distance rate performs better than K-means with random initialization: the minimum squared distance obtained by RK-means is lower, and its deviation smaller, than that obtained by K-means. RK-means is also competitive with data clustering methods based on simulated annealing (SA) and with hybrid methods combining K-means and SA (KSA and KSAK).
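The abstract does not give the formula for the relative distance rate or the exact selection rule, so the following Python sketch is only an illustration under stated assumptions, not the authors' implementation. It assumes a centrality score v_j = sum_i d_ij / sum_l d_il (smaller means more central), a roulette wheel whose probabilities are proportional to 1 / v_j, and a hypothetical parameter `n_candidates` for the number of candidate initial solutions. All function names are made up for this example.

```python
import numpy as np

def relative_distance_scores(X):
    # Pairwise Euclidean distances d[i, j].
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Assumed score: v[j] = sum_i d[i, j] / sum_l d[i, l].
    # A small v[j] means point j is close to many other points.
    row_sums = d.sum(axis=1, keepdims=True)
    return (d / row_sums).sum(axis=0)

def roulette_pick_centers(X, k, scores, rng):
    # Assumed weighting: selection probability proportional to 1 / score,
    # so central points are favored but never chosen deterministically.
    weights = 1.0 / scores
    probs = weights / weights.sum()
    idx = rng.choice(len(X), size=k, replace=False, p=probs)
    return X[idx]

def sse(X, centers):
    # Sum of squared distances from each point to its nearest center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()

def kmeans(X, centers, max_iter=100):
    # Plain Lloyd iterations starting from the given centers.
    for _ in range(max_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = np.array([
            X[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, sse(X, centers)

def rk_means_sketch(X, k, n_candidates=20, seed=0):
    # Draw several candidate initial solutions, keep the one with the
    # lowest SSE, and refine it with K-means.
    rng = np.random.default_rng(seed)
    scores = relative_distance_scores(X)
    candidates = [roulette_pick_centers(X, k, scores, rng)
                  for _ in range(n_candidates)]
    best_init = min(candidates, key=lambda c: sse(X, c))
    return kmeans(X, best_init)
```

Calling `rk_means_sketch(X, k)` on an `(n, d)` NumPy array returns the refined centers and their sum of squared distances. Because each candidate is drawn probabilistically, repeated draws can produce different starting points even on dense data, which is the contrast with a deterministic single initial solution that the abstract describes.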

Acknowledgement

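This research was conducted as part of the 2024 Culture, Sports and Tourism R&D Program of the Ministry of Culture, Sports and Tourism and the Korea Creative Content Agency (Project title: Development of AI-based large-scale automated game verification technology to improve the efficiency of game production verification for small and medium-sized game companies; Project number: RS-2024-00393500; Contribution rate: 100%).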
