• Title/Summary/Keyword: K-means clustering (K-means 군집화)

Search Result 273, Processing Time 0.03 seconds

SPOT/VEGETATION-based Algorithm for the Discrimination of Cloud and Snow (SPOT/VEGETATION 영상을 이용한 눈과 구름의 분류 알고리즘)

  • Han Kyung-Soo;Kim Young-Seup
    • Korean Journal of Remote Sensing / v.20 no.4 / pp.235-244 / 2004
  • This study assesses a proposed algorithm that discriminates cloudy pixels from snowy pixels using visible, near-infrared, and shortwave-infrared channel data from the VEGETATION-1 sensor aboard the SPOT-4 satellite. Traditional threshold algorithms for cloud and snow masks have not shown very good accuracy. Instead of these independent masking procedures, this study employs a K-means clustering scheme for cloud/snow discrimination. The pixels used in clustering were selected by integrating two threshold algorithms, which together group the snow and cloud pixels. This can simplify the clustering procedure and improve accuracy compared with clustering the full image. The paper also compares the results with threshold methods for snow cover and clouds and assesses the discrimination capability of the VEGETATION channels. The quality of the cloud and snow masks improved further when the present algorithm was applied: compared with traditional methods, discrimination errors were reduced by 19.4% for the cloud mask and 9.7% for the snow mask.
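
The two-stage idea in this abstract (threshold first, then cluster only the candidate pixels) can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the single brightness threshold `red_threshold`, the two-band `(red, swir)` pixel layout, and the rule that the cluster with the lower shortwave-infrared center is snow are all assumptions made for the example.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def kmeans(points, k, iters=100):
    """Lloyd's k-means with deterministic farthest-first initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(iters):
        labels = assign(points, centers)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous center.
            updated.append(tuple(sum(c) / len(members) for c in zip(*members))
                           if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return assign(points, centers), centers


def discriminate(pixels, red_threshold=0.3):
    """Label each (red, swir) pixel as 'snow', 'cloud', or 'clear'.

    Assumption: bright pixels (red > red_threshold) are snow or cloud,
    and snow reflects less than cloud in the shortwave infrared.
    """
    result = ["clear"] * len(pixels)
    candidates = [i for i, (red, _) in enumerate(pixels) if red > red_threshold]
    if not candidates:
        return result
    labels, centers = kmeans([pixels[i] for i in candidates], k=2)
    snow_cluster = min(range(2), key=lambda j: centers[j][1])
    for i, lab in zip(candidates, labels):
        result[i] = "snow" if lab == snow_cluster else "cloud"
    return result
```

Clustering only the thresholded candidates keeps dark land and water pixels from pulling the two centers away from the snow and cloud populations.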

Design and Implementation of Paper Classification Systems based on Keyword Extraction and Clustering (키워드 추출과 군집화 기반의 논문 분류 시스템의 설계 및 구현)

  • Lee, Yun-Soo;Pheaktra, They;Lee, Jong-Hyuk;Gil, Joon-Min
    • Proceedings of the Korea Information Processing Society Conference / 2018.05a / pp.48-51 / 2018
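  • Advances in computing and technology have led to a vast number of papers being published online as well as offline, and new fields keep emerging, so users have great difficulty searching for or classifying the papers they need among this enormous body of literature. To overcome this limitation, this paper proposes a method that classifies papers with similar content and clusters them. The proposed method uses TF-IDF to extract representative keywords from each paper's abstract, and then uses the K-means clustering algorithm to cluster the papers into groups of similar content based on the extracted TF-IDF values.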
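
As a rough illustration of the keyword-extraction step, the sketch below computes TF-IDF weights and picks the top-weighted terms per document. The whitespace tokenizer, the unsmoothed `log(N / df)` weighting, and the function names are assumptions for the example; the abstract does not say which TF-IDF variant the authors use.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Weight = (term count / document length) * log(N / document frequency).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({term: (c / len(tokens)) * math.log(n_docs / df[term])
                        for term, c in counts.items()})
    return vectors


def top_keywords(vector, m=2):
    """The m highest-weighted terms, ties broken alphabetically."""
    return sorted(vector, key=lambda term: (-vector[term], term))[:m]


abstracts = [
    "clustering of radar rainfall",
    "clustering of news articles",
    "clustering of ship positions",
]
vectors = tfidf(abstracts)
keywords = [top_keywords(v) for v in vectors]
```

Terms that occur in every abstract, such as "clustering" here, get weight zero, so they never become representative keywords; the resulting weight vectors are what a K-means step would then group.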

Hierarchical and Incremental Clustering for Semi Real-time Issue Analysis on News Articles (준 실시간 뉴스 이슈 분석을 위한 계층적·점증적 군집화)

  • Kim, Hoyong;Lee, SeungWoo;Jang, Hong-Jun;Seo, DongMin
    • The Journal of the Korea Contents Association / v.20 no.6 / pp.556-578 / 2020
  • Much research has addressed issue analysis based on real-time news streams. However, few studies analyze issues hierarchically from news articles, and even an earlier hierarchical issue-analysis method clusters more slowly as the number of news articles grows. In this paper, we propose hierarchical and incremental clustering for semi-real-time issue analysis of news articles. We trained a weighted cosine similarity model based on a Siamese neural network, applied this model to the k-means algorithm to build word clusters, and used these word clusters to convert news articles into document vectors. Finally, we initialized an issue cluster tree from the document vectors, updated the tree whenever new articles arrived, and analyzed issues in semi-real time. In our experiments, performance improved by up to about 0.26 in terms of NMI, and incremental clustering was about 10 times faster than before.
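
For readers unfamiliar with the NMI figure quoted above, the sketch below computes normalized mutual information between two labelings. Normalizing by the arithmetic mean of the two entropies is an assumption; the abstract does not state which normalization the authors used.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(labels_a, labels_b):
    """Mutual information divided by the mean of the two entropies."""
    n = len(labels_a)
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mi = sum((c / n) * math.log(c * n / (count_a[a] * count_b[b]))
             for (a, b), c in joint.items())
    h_a, h_b = entropy(labels_a), entropy(labels_b)
    if h_a == 0 and h_b == 0:
        return 1.0  # both labelings are a single cluster
    return mi / ((h_a + h_b) / 2)
```

The score ignores how cluster ids are named: two labelings that split the items identically score 1 even if the ids are swapped, while statistically independent labelings score 0.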

Adjustment of the Mean Field Rainfall Bias by Clustering Technique (레이더 자료의 군집화를 통한 Mean Field Rainfall Bias의 보정)

  • Kim, Young-Il;Kim, Tae-Soon;Heo, Jun-Haeng
    • Journal of Korea Water Resources Association / v.42 no.8 / pp.659-671 / 2009
  • A fuzzy c-means clustering technique is applied to improve the accuracy of the G/R ratio used to estimate rainfall from radar reflectivity. The G/R ratio is the ratio of ground rainfall recorded at AWS (Automatic Weather System) sites to the radar rainfall estimated from the reflectivity of the Kwangduck Mt. radar station, which has a 100 km effective range. The G/R ratio is calculated in two ways: the first uses a single G/R ratio for the entire effective range, and the second uses two different G/R ratios for two regions formed by cluster analysis. Absolute relative error and root mean squared error are used to evaluate the accuracy of the radar rainfall estimates from the two approaches. As a result, the radar rainfall estimated with two different G/R ratios from cluster analysis is more accurate than that estimated with a single G/R ratio for the entire range.
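
The comparison between one G/R ratio and per-region G/R ratios can be sketched as below. The example assumes the ratio is the sum of gauge rainfall divided by the sum of radar rainfall, and it takes the region labels as given instead of deriving them with fuzzy c-means; the numbers are made up purely for illustration.

```python
import math


def gr_ratio(gauge, radar):
    """Mean field bias: total gauge rainfall over total radar rainfall."""
    return sum(gauge) / sum(radar)


def adjust(gauge, radar, regions):
    """Scale each radar estimate by the G/R ratio of its own region."""
    ratios = {}
    for region in set(regions):
        idx = [i for i, r in enumerate(regions) if r == region]
        ratios[region] = gr_ratio([gauge[i] for i in idx], [radar[i] for i in idx])
    return [radar[i] * ratios[regions[i]] for i in range(len(radar))]


def rmse(observed, estimated):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated))
                     / len(observed))


# Hypothetical rainfall totals (mm) at four gauge sites.
gauge = [10.0, 20.0, 30.0, 8.0]
radar = [5.0, 10.0, 30.0, 8.0]

single = adjust(gauge, radar, [0, 0, 0, 0])    # one ratio for the whole range
regional = adjust(gauge, radar, [0, 0, 1, 1])  # one ratio per region

error_single = rmse(gauge, single)
error_regional = rmse(gauge, regional)
```

In this toy data the radar underestimates by half in one region and is unbiased in the other, so a single ratio overcorrects one region and undercorrects the other, while per-region ratios remove the bias in both.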

Clustering Performance Analysis for Time Series Data: Wavelet vs. Autoencoder (시계열 데이터에 대한 클러스터링 성능 분석: Wavelet과 Autoencoder 비교)

  • Hwang, Woosung;Lim, Hyo-Sang
    • Proceedings of the Korea Information Processing Society Conference / 2018.10a / pp.585-588 / 2018
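  • When extracting and analyzing features of time series data, the high dimensionality of the data makes it hard to find valid information in it because of the curse of dimensionality. Dimensionality reduction techniques are widely used to address this problem, but the dilution of information that occurs during reduction changes the performance of tasks such as clustering on time series data. To observe this phenomenon, this paper uses the Discrete Wavelet Transform (DWT) and an autoencoder as dimensionality reduction techniques to compress time series data, then applies the K-means algorithm to the compressed data and compares clustering efficiency. The comparison shows that both achieve good clustering performance, provided that a sufficient number of compressed dimensions is ensured for DWT and sufficient training on the time series data is ensured for the autoencoder.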
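
To make the DWT compression step concrete, the sketch below keeps only the Haar approximation coefficients, halving the series length at each level. The choice of the Haar wavelet and the function name `haar_approx` are assumptions; the abstract does not say which wavelet the authors used, and the autoencoder side of the comparison is not shown.

```python
import math


def haar_approx(series, levels=1):
    """Approximation coefficients of an orthonormal Haar DWT.

    Each level replaces adjacent pairs (x0, x1) with (x0 + x1) / sqrt(2),
    so the output length is len(series) / 2**levels.
    """
    out = list(series)
    for _ in range(levels):
        if len(out) % 2:
            raise ValueError("series length must be divisible by 2**levels")
        out = [(out[i] + out[i + 1]) / math.sqrt(2) for i in range(0, len(out), 2)]
    return out


# An 8-sample series compressed to 2 dimensions before clustering.
compressed = haar_approx([1, 1, 1, 1, 3, 3, 3, 3], levels=2)
```

Each extra level halves the dimension passed to K-means but averages away more detail, which is the trade-off between compressed dimension and clustering quality that the abstract reports for DWT.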

Optimal Arrangement of Patrol Ships based on k-Means Clustering for Quick Response of Marine Accidents (해양사고 신속대응을 위한 k-평균 군집화 기반 경비함정 최적배치)

  • Yoo, Sang-Lok;Jung, Cho-Young
    • Journal of the Korean Society of Marine Environment & Safety / v.23 no.7 / pp.775-782 / 2017
  • The positions of existing patrol ships have been decided by subjective judgment rather than by reasonable or scientific criteria, because of a lack of access to marine accident positions. In this study, the optimal location of patrol ships is quantitatively determined based on historical marine accident data. The study area included the coastal sea of Pohang in South Korea. A k-means clustering algorithm was used to derive the locations of patrol ships, and a Voronoi diagram was then used to divide the region around each patrol ship. As a result, the average navigation distance for patrol ships was improved by 4.4 nautical miles, and the average arrival time was improved by 13.2 minutes per marine accident. Moreover, if the locations of patrol ships need to be changed flexibly, the technique developed in this study makes it possible to arrange limited resources optimally to ensure a fast rescue.
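
The link between k-means and the Voronoi partition can be sketched with a single Lloyd iteration: each accident is served by its nearest patrol ship (its Voronoi cell), and each ship then moves to the centroid of its cell. The planar coordinates, the sample positions, and the function names are hypothetical; real positions would first need projecting from latitude and longitude.

```python
import math


def nearest(point, stations):
    """Index of the station whose Voronoi cell contains the point."""
    return min(range(len(stations)), key=lambda j: math.dist(point, stations[j]))


def mean_distance(accidents, stations):
    """Average straight-line distance from each accident to its nearest station."""
    return sum(math.dist(a, stations[nearest(a, stations)])
               for a in accidents) / len(accidents)


def lloyd_step(accidents, stations):
    """Move every station to the centroid of the accidents in its cell."""
    moved = []
    for j, station in enumerate(stations):
        cell = [a for a in accidents if nearest(a, stations) == j]
        # A station with no accidents in its cell stays where it is.
        moved.append(tuple(sum(c) / len(cell) for c in zip(*cell)) if cell else station)
    return moved


# Hypothetical planar coordinates in nautical miles.
accidents = [(0, 0), (0, 2), (10, 0), (10, 2)]
current = [(2, 1), (8, 1)]

proposed = lloyd_step(accidents, current)
before = mean_distance(accidents, current)
after = mean_distance(accidents, proposed)
```

Repeating `lloyd_step` until the stations stop moving is the k-means procedure; comparing `before` and `after` mirrors the kind of average-distance comparison the abstract reports.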

A Comparative Study on Statistical Clustering Methods and Kohonen Self-Organizing Maps for Highway Characteristic Classification of National Highway (일반국도 도로특성분류를 위한 통계적 군집분석과 Kohonen Self-Organizing Maps의 비교연구)

  • Cho, Jun Han;Kim, Seong Ho
    • KSCE Journal of Civil and Environmental Engineering Research / v.29 no.3D / pp.347-356 / 2009
  • This paper describes a cluster analysis for classifying highways by traffic characteristics, as a departure from existing methodologies of highway functional classification. The research focuses on comparing the performance of clustering techniques based on total within-group error and on deriving the optimal number of clusters. It analyzes statistical clustering methods (the hierarchical Ward's minimum-variance method and the nonhierarchical K-means method) and the Kohonen self-organizing maps clustering method for highway characteristic classification. The outcomes of the clustering techniques are compared in terms of the number of samples and the traffic characteristics of the subsets derived for the optimal number of clusters. Overall, the k-means method gives better results than the other methods for fewer than 12 clusters, while Kohonen self-organizing maps give the best results for more than 20 clusters. The main contribution of this research is expected to be the highway characteristic classification it produces, which can serve as important basic road attribute information.

A Study on Research Paper Classification Using Keyword Clustering (키워드 군집화를 이용한 연구 논문 분류에 관한 연구)

  • Lee, Yun-Soo;Pheaktra, They;Lee, JongHyuk;Gil, Joon-Min
    • KIPS Transactions on Software and Data Engineering / v.7 no.12 / pp.477-484 / 2018
  • Thanks to advances in computer and information technologies, numerous papers have been published. As new research fields continue to emerge, users have great difficulty finding and categorizing the papers that interest them. To alleviate this difficulty, this paper presents a method for grouping similar papers and clustering them. The method extracts primary keywords from the abstract of each paper using TF-IDF. Based on the extracted TF-IDF values, it then uses the K-means clustering algorithm to cluster papers with similar content. To demonstrate the practicality of the proposed method, we use papers from the FGCS journal as real data. On these data, we derive the number of clusters using the elbow scheme and evaluate clustering performance using the silhouette scheme.
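
The two evaluation schemes named in this abstract can be sketched as follows: the within-cluster sum of squared errors is the quantity plotted against the number of clusters in an elbow analysis, and the mean silhouette coefficient scores how well separated a given clustering is. The function names are illustrative, and `silhouette` assumes at least two clusters; this is not the authors' code.

```python
import math


def group(points, labels):
    """Map each cluster label to the list of its points."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return clusters


def sse(points, labels):
    """Within-cluster sum of squared distances to each cluster mean."""
    total = 0.0
    for members in group(points, labels).values():
        mean = tuple(sum(c) / len(members) for c in zip(*members))
        total += sum(math.dist(p, mean) ** 2 for p in members)
    return total


def silhouette(points, labels):
    """Mean silhouette coefficient; requires at least two clusters."""
    clusters = group(points, labels)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the closest other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)
```

In an elbow analysis, `sse` would be computed for K-means results at increasing cluster counts and the count chosen where the decrease levels off; `silhouette` values near 1 indicate compact, well-separated clusters.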

Tree-structured Clustering for Mixed Data (혼합형 데이터에 대한 나무형 군집화)

  • Yang Kyung-Sook;Huh Myung-Hoe
    • The Korean Journal of Applied Statistics / v.19 no.2 / pp.271-282 / 2006
  • The aim of this study is to propose a tree-structured clustering method for mixed data. We suggest a scaling method to reduce variable selection bias among categorical variables. In numerical examples using the credit data and the German credit data, we note several differences between tree-structured clustering and K-means clustering.

Crowd Analysis System Using Human Recognition and Clustering Techniques (사람인식 및 클러스터링 기법을 이용한 군집분석 시스템)

  • Tae-jeong Park;Ji-ho Park;Bo-yoon Seo;Jun-ha Shin;Kyung-hwan Choi;Hongseok Yoo
    • Proceedings of the Korean Society of Computer Information Conference / 2023.07a / pp.485-487 / 2023
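  • With the recent lifting of COVID-19 quarantine guidelines, in-person activities have increased, and providing services to people has become an important issue. In crowded places, however, services are in most cases not delivered smoothly. In this paper, people in camera video are recognized using Yolo, an object recognition algorithm, and OpenCv; K-means clustering is then used to cluster the recognized people, priorities are assigned, and coordinates are designated so that a robot can move to a cluster's coordinates, approach people directly, and provide services.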
