• Title/Summary/Keyword: spatial cluster analysis (공간 군집분석)

Cluster Merging Using Density based Fuzzy C-Means algorithm (밀도 기반의 퍼지 C-Means 알고리즘을 이용한 클러스터 합병)

  • 한진우;전성해;오경환
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2003.05a / pp.235-238 / 2003
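  • The performance of the Fuzzy C-Means (FCM) algorithm varies considerably with the number and positions of the initial cluster centers. In general, however, the number of cluster centers is chosen subjectively and arbitrarily by the analyst, so clustering proceeds without regard to the structure of the original data and may fail to reach an optimal result. To perform fuzzy clustering that follows the structure of the original data more closely, this paper proposes an FCM that uses grid-based data density, together with a technique for merging the clusters determined by this density-based FCM. The N-dimensional data space is divided into an N-dimensional grid, and the number and positions of the initial cluster centers are determined from the density of each grid cell. After initialization, FCM clustering is performed within each grid cell, and the clustering results of neighboring cells are then merged using a similarity measure between clusters, yielding a clustering close to the natural structure of the data. The improved performance of the proposed cluster merging technique was confirmed using data from the UCI Machine Learning Repository.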
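
The grid-based initialization described in the abstract above can be illustrated with a minimal sketch. The abstract does not state the density rule, so this example assumes that a cell counts as dense when it holds at least `min_points` points and that the cell mean serves as the initial center. The names `grid_density_centers`, `cell_size`, and `min_points` are hypothetical, and the FCM iterations and the merging step are not shown.

```python
from collections import defaultdict
from math import floor


def grid_density_centers(points, cell_size, min_points):
    """Return one initial center (the cell mean) per sufficiently dense grid cell."""
    cells = defaultdict(list)
    for p in points:
        # Map each point to the index of the grid cell that contains it.
        key = tuple(floor(x / cell_size) for x in p)
        cells[key].append(p)

    centers = []
    for key in sorted(cells):
        members = cells[key]
        # Assumed density rule: keep cells with at least min_points points.
        if len(members) >= min_points:
            centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
    return centers


points = [
    (0.1, 0.2), (0.3, 0.1), (0.2, 0.4),   # dense cell (0, 0)
    (5.1, 5.2), (5.3, 5.4), (5.2, 5.1),   # dense cell (5, 5)
    (9.0, 0.5),                           # isolated point, cell (9, 0)
]
centers = grid_density_centers(points, cell_size=1.0, min_points=3)
```

With this toy data the two dense cells each yield one center and the isolated point yields none, so the number of initial centers follows from the data rather than from the analyst's choice.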

Analysis of Temporal and Spatial Distribution of Traffic Accidents in Jinju (진주시 교통사고의 시계열적 공간분포특성 분석)

  • Sung, Byeong Jun;Bae, Gyu Han;Yoo, Hwan Hee
    • Journal of Korean Society for Geospatial Information Science / v.23 no.2 / pp.3-9 / 2015
  • Changes in land use in urban space generate traffic volume, which is closely related to traffic accidents. An analysis of the causes of traffic accidents is therefore essential for establishing measures to reduce them. This study analyzed clustering using nearest neighbor indexes for accident occurrence frequencies in commercial and residential zones, based on five years of traffic accident data (2009-2013) for Jinju-si, a small-to-medium-sized local city. The results are as follows. The occurrence frequency of traffic accidents was highest in spring and lowest in winter. Clustering of traffic accidents was stronger at nighttime than at daytime. In the analysis of clustering by land use, seasonal changes were not significant in commercial areas, while clustering density tended to become significantly lower in winter in residential areas. The analysis of accident types showed that side right-angle collisions between cars were the most frequent and were widespread in both commercial and residential areas. These results provide important information for identifying the occurrence pattern of traffic accidents within the structure of urban space, and they are expected to be useful in establishing measures to reduce traffic accidents.
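
As an illustration of the clustering measure mentioned above, here is a minimal sketch of a nearest neighbor index. The abstract does not say which form was used; this example assumes the common Clark-Evans ratio of the observed mean nearest-neighbor distance to the distance expected under a random pattern, 0.5 / sqrt(n / area). The function name and the toy coordinates are hypothetical.

```python
from math import dist, sqrt


def nearest_neighbor_index(points, area):
    """Clark-Evans style ratio: below 1 suggests clustering, above 1 dispersion."""
    n = len(points)
    # Observed mean distance from each point to its nearest other point.
    observed = sum(
        min(dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / n
    # Expected mean nearest-neighbor distance for a random pattern of the same density.
    expected = 0.5 / sqrt(n / area)
    return observed / expected


# Four accident locations packed together versus four spread evenly over a 10 x 10 area.
clustered = [(1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1)]
regular = [(2.5, 2.5), (7.5, 2.5), (2.5, 7.5), (7.5, 7.5)]
r_clustered = nearest_neighbor_index(clustered, area=100.0)
r_regular = nearest_neighbor_index(regular, area=100.0)
```

Computing the index separately per season, time of day, or land-use zone would allow the kind of comparison the abstract reports, although the study's exact procedure is not given here.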

K-Means Clustering in the PCA Subspace using an Unified Measure (통합 측도를 사용한 주성분해석 부공간에서의 k-평균 군집화 방법)

  • Yoo, Jae-Hung
    • The Journal of the Korea institute of electronic communication sciences / v.17 no.4 / pp.703-708 / 2022
  • K-means clustering is a representative clustering technique. However, it has a limitation: the performance evaluation measure and the method of determining the minimum number of clusters are not integrated. This paper introduces a method for numerically determining the minimum number of clusters, with the explained variance presented as the unified measure. We propose performing k-means clustering in the PCA subspace so that the minimum number of clusters and the explained-variance threshold are satisfied simultaneously. The aim is to explain in principle why principal component analysis and k-means clustering are performed sequentially in pattern recognition and machine learning.
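
The idea of choosing the smallest number of clusters that meets an explained-variance threshold can be sketched as follows. This is an illustration under assumptions, not the paper's procedure: explained variance is taken to be 1 minus the within-cluster sum of squares divided by the total sum of squares, the PCA projection step is omitted, and the farthest-point initialization and the names `kmeans`, `explained_variance`, and `min_clusters` are hypothetical.

```python
from math import dist


def kmeans(points, k, iters=50):
    """Plain k-means with a deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Add the point that is farthest from its nearest existing center.
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            groups[min(range(k), key=lambda i: dist(p, centers[i]))].append(p)
        # Recompute each center as its group mean; keep the old center if the group is empty.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


def explained_variance(points, centers):
    """Assumed definition: 1 - within-cluster SS / total SS."""
    mean = tuple(sum(col) / len(points) for col in zip(*points))
    total = sum(dist(p, mean) ** 2 for p in points)
    within = sum(min(dist(p, c) for c in centers) ** 2 for p in points)
    return 1.0 - within / total


def min_clusters(points, threshold, k_max=10):
    """Smallest k whose clustering reaches the explained-variance threshold."""
    for k in range(1, k_max + 1):
        if explained_variance(points, kmeans(points, k)) >= threshold:
            return k
    return k_max


points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
    (20.0, 0.0), (20.0, 1.0), (21.0, 0.0),
]
k_best = min_clusters(points, threshold=0.95)
```

On these three well-separated groups, two clusters leave too much within-cluster variance, so the search stops at three. In the setting the abstract describes, `points` would first be projected onto the leading principal components.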

Statistical Analysis for Ozone Long-term Trend Stations in Seoul, Korea (통계적 기법을 적용한 서울의 오존 장기변동 대표측정소 선정)

  • Shin, Hyejung;Park, Jihoon;Son, Jungseok;Rho, Soona;Hong, Youdeong
    • Journal of Environmental Impact Assessment / v.24 no.2 / pp.111-118 / 2015
  • This study was conducted to establish a statistical method for determining the air quality monitoring station that best represents the long-term ozone trend of Seoul. Hourly ozone concentrations from 2002 to 2011 were used. The KZ filter, correlation matrix, cluster analysis, and Kriging method were applied to select the representative station. The correlation matrix analysis found that the long-term trends of ozone concentrations measured at Sinjung, Sadang, and Bun-dong were highly correlated. The cluster analysis found that these three stations belonged to the same cluster. The Kriging analysis also showed that these three stations were highly correlated with other stations in spatial distribution. Considering these results and the fact that Sinjung had the highest correlation coefficient, the Sinjung station was the most suitable representative station for understanding the long-term ozone trend of Seoul. This result could be applied to understanding the long-term trends of other pollutants. It can also be used to assess the appropriateness of the spatial distribution of national air quality monitoring stations.
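
The KZ (Kolmogorov-Zurbenko) filter named above is an iterated moving average, which a short sketch can make concrete. The window length and iteration count used in the study are not given in the abstract, so the values below are placeholders, and truncating the window at the series edges is an assumption made for this example.

```python
def moving_average(series, window):
    """Centered moving average; the window is truncated at the series edges."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        chunk = series[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def kz_filter(series, window, iterations):
    """Apply the moving average `iterations` times (window should be odd)."""
    result = list(series)
    for _ in range(iterations):
        result = moving_average(result, window)
    return result


# A single spike is spread into a smooth bump after two passes.
spike = [0, 0, 0, 9, 0, 0, 0]
smoothed = kz_filter(spike, window=3, iterations=2)
```

Applied to hourly ozone data with a long window, repeated passes suppress short-term and seasonal fluctuations, leaving the slow component that a long-term trend comparison between stations would use.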

Tree Based Cluster Analysis Using Reference Data (배경자료를 이용한 나무구조의 군집분석)

  • 최대우;구자용;최용석
    • The Korean Journal of Applied Statistics / v.17 no.3 / pp.535-545 / 2004
  • The clustering method suggested in this paper produces clusters based on 'rules of variables'. It merges the 'training data' with identically structured reference data and then filters the merged data with a 'tree classification model' to obtain the clusters of the 'training data'. The reference dataset is generated by a 'reverse arcing' algorithm so that it contrasts spatially with the 'training data', which helps identify the clusters effectively. The strength of this method is that it can be applied even to 'training data' that mix continuous and discrete variables. The performance of the algorithm is illustrated on simulated data as well as on actual data.

A Study on the Adjectives for Selection of Color Patterns (컬러 패턴 선택을 위한 형용사에 관한 연구)

  • Kim Sung-Hwan;Eum Kyoung-Bae;Chung Sung-Suk;Lee Joon-Whoan
    • Science of Emotion and Sensibility / v.8 no.4 / pp.355-363 / 2005
  • Adjectives that represent emotions are important for evaluating and selecting colors or color patterns. In this paper, we apply MDS analysis, factor analysis, and cluster analysis to Soen's experimental data, obtained from the evaluation of random color patterns with 13 adjective pairs. As a result, the adjectives can be reduced to 3 factors representing the emotions of weight, activity, and temperature, which approximately corresponds to the results of previous research on single colors. We also show, using regression analysis, that the adjectives for preference can be approximated by other primary adjectives for color patterns. This implies that one can construct a uniform emotion space for evaluating and selecting color patterns regardless of the object, such as wallpaper or carpet.

Spatial Analysis of Drought Characteristics in Korea Using Cluster Analysis (군집분석을 이용한 우리나라 가뭄특성의 공간적 분석)

  • Yoo, Ji-Young;Choi, Min-Ha;Kim, Tae-Woong
    • Journal of Korea Water Resources Association / v.43 no.1 / pp.15-24 / 2010
  • Regional frequency analysis is often used to overcome the limitations of point frequency analysis when estimating probability rainfall depths. However, point frequency analysis is still used in drought analyses. This study proposes a practical method for delineating regions with homogeneous drought characteristics, for use in analyzing the regional characteristics of droughts in Korea. Using rainfall data from 58 observation stations managed by the Korea Meteorological Administration, this study calculated drought attributes, namely mean drought indices for various durations using the Standardized Precipitation Index (SPI) and drought severities expressed by duration, depth, and intensity. The drought attributes provided useful information for categorizing stations into hydrologically homogeneous regions. A cluster analysis with the K-means technique was used to group the observation stations, and it grouped them into 6 regions in Korea. The data in each hydrologically homogeneous region can be used in spatial analysis of drought characteristics and in regional drought frequency analysis.

A Study on Vector Data Compression using K-means Clustering (K평균 군집화를 이용한 벡터데이터 압축 기법 연구)

  • Lee, Dong-Heon;Chun, Woo-Je;Park, Soo-Hong
    • 한국공간정보시스템학회:학술대회논문집 / 2004.12a / pp.132-138 / 2004
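  • Recently, the use of spatial data on mobile devices such as mobile phones, PDAs, and telematics terminals has been increasing. However, although the storage capacity of mobile devices has grown, it is still insufficient to meet the demand for spatial data. This study therefore presents a lossy compression technique for spatial data that can be used in mobile environments, and analyzes the compression ratio and data loss rate through experiments to demonstrate the validity and applicability of the approach. Specifically, given the trade-off between compression ratio and the positional accuracy affected by data loss, the study explored ways to improve positional accuracy. Among the various clustering techniques, one applicable to the study was selected and used. In addition, decompression must perform satisfactorily in a mobile environment that is constrained not only in storage but also in computing power. The study therefore also investigated ways to minimize the cost of restoring the compressed data.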

Assessment of Spatiotemporal Water Quality Variation Using Multivariate Statistical Techniques: A Case Study of the Imjin River Basin, Korea (다변량 통계기법을 이용한 시·공간적 수질변화의 평가: 임진강유역에 관한 연구)

  • Cho, Yong-Chul;Lee, Su-Woong;Ryu, In-Gu;Yu, Soon-Ju
    • Journal of Korean Society of Environmental Engineers / v.39 no.11 / pp.641-649 / 2017
  • In this study, the water quality of the Imjin River basin, where pollutant characteristics vary, was assessed through statistical analysis, correlation analysis, principal component and factor analysis, and cluster analysis. Among all sites analyzed, the average concentrations at the Sincheon 3 site were high (BOD 13.4 mg/L, COD 19.9 mg/L, T-N 11.145 mg/L, T-P 0.336 mg/L, TOC 14.2 mg/L), indicating that, within the entire drainage basin, the Sincheon basin requires intensive water quality management. The correlation analysis of the comprehensive water quality data shows statistically significant correlations among COD, TOC, BOD, and T-N, as well as a high correlation between organic matter and nutrients. The principal component analysis extracted 2 principal components that together explain 81.221% of the variance in the entire data set from the monitoring stations, while the seasonal data yielded 3 principal components explaining 96.241%. Factor analysis of the entire data set and of the seasonal data identified BOD, COD, T-N, T-P, and TOC as the common factors influencing water quality. The spatial and temporal cluster analyses produced 4 groups and 3 groups, respectively, according to seasonal characteristics and land use. By analyzing the water quality factors of the Imjin River basin over an 8-year period with consideration of spatial and temporal characteristics, this study provides baseline analytical data that will help in understanding future changes in the water quality of the basin.

System Theory Approach for Decision Making of GIS-based Optimum Allocation (GIS기반 최적공간선정을 위한 시스템론적 접근)

  • Oh, Sang-Young
    • The Journal of the Korea Contents Association / v.6 no.12 / pp.121-127 / 2006
  • As information technologies improve, geographical information system (GIS) technologies are also developing rapidly, and demand for spatial analysis with GIS is increasing. In particular, research on spatial analysis with GIS has drawn more attention than general GIS research. However, most GIS research focuses on the spatial dimension, for example a density-based clustering method (DBSCAN) or a DBSCAN algorithm that uses regions expressed as weights (DBSCAN-W), while the importance of rational decision making based on the time dimension has been neglected. This study adopts system dynamics to incorporate the time dimension into GIS-based optimum allocation.
