• Title/Abstract/Keywords: K-means clustering technique

151 search results (processing time: 0.027 s)

A Distance-based Outlier Detection Method using Landmarks in High Dimensional Data

  • 박정희
    • 한국멀티미디어학회논문지
    • /
    • Vol. 24, No. 9
    • /
    • pp.1242-1250
    • /
    • 2021
  • Detecting outliers that deviate from the normal data distribution in high-dimensional data is an important technique in many application areas. This paper proposes a distance-based outlier detection method that uses landmarks in high-dimensional data. Given normal training data, k-means clustering is applied to the training data to extract the cluster centers as landmarks that represent the normal data distribution. For a test sample, the distance to the nearest landmark gives the outlier score. In experiments on high-dimensional data such as images and documents, the proposed method, using landmarks numbering one-tenth of the training data, achieved comparable outlier detection performance while greatly reducing the time complexity of the testing stage.
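The landmark idea in this abstract can be sketched in a few lines. The following is a minimal illustration, not the author's implementation: it assumes plain Lloyd's k-means with Euclidean distance, and the toy 2-D data, the function names `kmeans` and `outlier_score`, and the seeds are all hypothetical.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means (an assumption; the paper's variant is not stated).

    Returns the k cluster centers, which serve as landmarks.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        # Recompute each center as the mean of its group; keep the old
        # center if a group happens to be empty.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def outlier_score(x, landmarks):
    """Distance from x to its nearest landmark; larger means more anomalous."""
    return min(math.dist(x, c) for c in landmarks)


# Hypothetical "normal" training data: two Gaussian blobs, 200 points in total.
rng = random.Random(1)
train = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
train += [(rng.gauss(8, 1), rng.gauss(8, 1)) for _ in range(100)]

# Landmarks numbering one-tenth of the training data, as in the abstract.
landmarks = kmeans(train, k=len(train) // 10)

score_normal = outlier_score((0.2, -0.1), landmarks)
score_far = outlier_score((20.0, -20.0), landmarks)
```

At test time each sample is compared against 20 landmarks rather than 200 training points, which is where the reduction in testing cost comes from.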

A Study on Particular Abnormal Gait Using Accelerometer and Gyro Sensor

  • 허근섭;양승한;이상룡;이종규;이춘영
    • 한국정밀공학회지
    • /
    • Vol. 29, No. 11
    • /
    • pp.1199-1206
    • /
    • 2012
  • Technologies to help elderly or disabled people who have difficulty walking are currently being developed. Developing these technologies requires a system that gathers human gait data, and analysis of these data is also important. In this research, we built a sensor system consisting of a pressure sensor, a three-axis accelerometer, and a two-axis gyro sensor. We used the k-means clustering algorithm to classify the data for characterization, and then calculated a symmetry index from the histogram produced for each cluster. We collected gait data from sensors attached to two subjects. The experiment covered two gait conditions: walking with a normal gait, and walking with an abnormal gait (the subject intentionally dragging the right leg). The analysis of the acceleration components confirmed that this analysis technique can be used to determine gait symmetry. In addition, adding the gyro components to the analysis showed that the symmetry index then expressed symmetry better.

The Characteristics of Seasonal Wind Fields around the Pohang Using Cluster Analysis and Detailed Meteorological Model

  • 정주희;오인보;고대권;김유근
    • 한국환경과학회지
    • /
    • Vol. 20, No. 6
    • /
    • pp.737-753
    • /
    • 2011
  • The typical characteristics of seasonal winds around Pohang were studied using a two-stage clustering technique (average linkage followed by k-means) based on u- and v-component winds at 850 hPa from 2004 to 2006 (obtained at the Pohang station), and a high-resolution WRF-UCM model (0.5 km grid for the finest domain) with up-to-date detailed land-use data, run for the most predominant pattern in each season. The cluster analysis identified statistically distinct wind patterns (7, 4, 5, and 3 clusters) for spring, summer, fall, and winter, respectively. In spring, the prevailing pattern (80 days) showed weak upper-level northwesterly flow and a late sea breeze. At night in particular, the land breeze that developed along the shoreline converged around Yeongil Bay. The representative summer pattern (92 days) was weak upper-level southerly flow with an intensified sea breeze combined with the sea surface wind. In addition, the convergence zone between the large-scale background flow and the well-developed land breeze was transported inland (to industrial and residential areas). The predominant wind distribution in fall (94 days) was similar to that of spring, showing weak upper-level flow and a distinct sea-land breeze circulation. In contrast, the most frequent winter pattern (117 days) showed upper-level northwesterly and surface westerly flows, with no change in daily wind direction.
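A two-stage scheme of the kind named in this abstract can be sketched as follows. The abstract does not say how the two stages are connected, so this sketch assumes one common arrangement: average-linkage agglomerative clustering produces k groups, and their centroids seed k-means. The daily (u, v) wind values and all function names are hypothetical.

```python
import math
from itertools import combinations


def centroid(cluster):
    """Component-wise mean of a list of points."""
    return tuple(sum(col) / len(cluster) for col in zip(*cluster))


def average_linkage(points, k):
    """Stage 1: merge the two clusters with the smallest average pairwise
    distance until only k clusters remain."""
    clusters = [[p] for p in points]

    def avg_dist(a, b):
        return sum(math.dist(p, q) for p in a for q in b) / (len(a) * len(b))

    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: avg_dist(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters


def kmeans_from(points, centers, iters=100):
    """Stage 2: Lloyd's k-means started from the given centers."""
    groups = []
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        new_centers = [centroid(g) if g else centers[j]
                       for j, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, groups


def two_stage(points, k):
    seeds = [centroid(c) for c in average_linkage(points, k)]
    return kmeans_from(points, seeds)


# Hypothetical daily (u, v) wind components in m/s at one pressure level.
days = [(-3.0, 1.0), (-3.2, 0.8), (-2.8, 1.3),
        (0.5, 4.0), (0.2, 4.3), (0.7, 3.8),
        (4.0, -1.0), (4.2, -0.7), (3.9, -1.2)]
centers, groups = two_stage(days, k=3)
```

Seeding k-means from a hierarchical result removes the dependence on random initial centers, which is one reason such two-stage schemes are used.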

Calculation of the Peak-hour Ratio for Road Traffic Volumes using a Hybrid Clustering Technique

  • 김형주;장수은
    • 대한교통학회지
    • /
    • Vol. 30, No. 1
    • /
    • pp.19-30
    • /
    • 2012
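  • Because most daily travel demand is concentrated in particular time periods, estimating demand and benefits is difficult. More reliable results therefore require that time-of-day characteristics be taken into account. Methods for converting volumes into peak and off-peak one-hour traffic volumes for this purpose include intuitive, empirical, and statistical methods. This study applies a hybrid clustering technique, one of the statistical methods, to estimate the duration and concentration ratio of the peak, off-peak, and late-night periods. The 2009 nationwide 24-hour coverage-count traffic volume data provided by the Korea Institute of Construction Technology were used, and the analysis was carried out separately for passenger cars, trucks, and all vehicle types in order to examine vehicle-type characteristics. Travel-time data from the Korea Expressway Corporation's TCS were used to validate the results. The validation showed that, compared with other studies, the results of this study have a lower error rate in the off-peak and late-night periods, whereas in the peak period the error rate rises as travel distance increases. The method excludes arbitrariness and allows the reliability of the estimated peak-hour ratio to be tested, so it can be regarded as a more stable methodology. We expect the results to contribute to improving the reliability of future travel demand analysis.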

Scalable Hybrid Recommender System with Temporal Information

  • ;;김재우;문경덕;김진태;이성창
    • 한국인터넷방송통신학회논문지
    • /
    • Vol. 12, No. 2
    • /
    • pp.61-68
    • /
    • 2012
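  • With the exponential growth of digital content and content users, recommender systems have attracted attention and are being applied in many applications, while their scalability and their accuracy, which generally trades off against scalability, have become issues. This paper proposes a Scalable Hybrid Recommender System that removes the matrix of the hybrid recommender model and uses clustering to determine item features. To improve scalability and accuracy, the proposed model uses users' item ratings, demographic information, and detailed temporal information. The algorithm reduces the size of the item-feature matrix with a reduction technique, builds a temporal-aware hybrid user model from user demographic information, clusters users with similar information to extract the most similar users, and compares information across those users to predict the item features a user wants and recommend N items, yielding better results than existing recommender systems.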

Drought Classification Method for Jeju Island using Standard Precipitation Index

  • 박재규;이준호;양성기;김민철;양세창
    • 한국환경과학회지
    • /
    • Vol. 25, No. 11
    • /
    • pp.1511-1519
    • /
    • 2016
  • Jeju Island relies on subterranean water for over 98% of its water resources, so continued study of drought under climate change is necessary. In this study, the representative standardized precipitation index (SPI) is classified by various criteria, and the spatial characteristics and applicability of drought in Jeju Island are evaluated from the results. When SPI was calculated for 4 weather stations (SPI 3, 6, 9, 12), SPI 12 was found to be relatively simple compared to SPI 6. The fluctuation of SPI was also greater for short-term data, and long-term data were relatively more useful for judging extreme drought. Cluster analysis was performed using the K-means technique with two variables extracted by factor analysis; the clustering terminated after seven iterations and ultimately formed two clusters.

An Empirical Study of the Usage Performance of Mobile Emoticons : Applying to the Five Construct Model by Huang et al.

  • Lim, Se-Hun;Kim, Dae-Kil;Watts, Sean
    • Journal of Information Technology Applications and Management
    • /
    • Vol. 18, No. 4
    • /
    • pp.21-40
    • /
    • 2011
  • Emoticons play an important role as an enhancement to written communication in areas such as Windows Live Messenger instant messaging, e-mail, and mobile Short Message Services (SMS). Emoticons are graphic images used in communications to indicate the feelings of people exchanging messages via mobile technology. In this research, the perceived usefulness of emoticons in mobile phone text messages is verified with consumers using the five construct model of Huang. A K-means clustering technique that separates three groups by level of perceived usefulness of mobile emoticons is used together with a structural equation model test using Smart PLS 2.0 and the bootstrap re-sampling procedure. We analyzed relationships among use of emoticons, enjoyment, interaction, information richness, and perceived usefulness. The results show that there are relationships among use of emoticons, enjoyment, interaction, perceived usefulness, and information richness; however, enjoyment of emoticons alone did not significantly affect the perceived usefulness of messages with emoticons. The results suggest that emoticons have different effects on emotion in mobile and Messenger contexts. Our study did not consider more detailed media properties, and thus more studies are needed. Our results contribute to the activation of mobile communication, provide companies with an understanding of key characteristics of consumers who use emoticons, and provide useful implications for improving management and marketing strategies.

Pattern Recognition of Meteorological fields Using Self-Organizing Map (SOM)

  • Nishiyama Koji;Endo Shinichi;Jinno Kenji
    • 한국수자원학회:학술대회논문집
    • /
    • Proceedings of the 2005 Korea Water Resources Association Conference
    • /
    • pp.9-18
    • /
    • 2005
  • To understand systematically and visually the well-known but qualitative and relatively complicated relationships between synoptic fields in the BAIU season and heavy rainfall events in Japan, these synoptic fields were classified using the Self-Organizing Map (SOM) algorithm. This algorithm can convert complex nonlinear features into simple two-dimensional relationships, and it was followed by the application of the U-matrix and K-means clustering techniques. It was assumed that the meteorological field patterns could be expressed simply by the spatial distribution of wind components at the 850 hPa level and Precipitable Water (PW) in the southwestern area of Japan, including Kyushu. Consequently, the synoptic fields could be divided into eight kinds of patterns (clusters). One of the clusters has a notable spatial feature: high PW accompanied by strong wind components known as the Low-Level Jet (LLJ). The features of this cluster indicate a typical meteorological field pattern that frequently causes disastrous heavy rainfall in Kyushu in the rainy season. These results suggest that the SOM technique may be an effective tool for classifying complicated nonlinear synoptic fields.

THE MODIFIED UNSUPERVISED SPECTRAL ANGLE CLASSIFICATION (MUSAC) OF HYPERION, HYPERION-FLAASH AND ETM+ DATA USING UNIT VECTOR

  • Kim, Dae-Sung;Kim, Yong-Il
    • 대한원격탐사학회:학술대회논문집
    • /
    • Korean Society of Remote Sensing, Proceedings of ISRS 2005
    • /
    • pp.134-137
    • /
    • 2005
  • Unsupervised spectral angle classification (USAC) is an algorithm that extracts ground object information by using the minimum 'Spectral Angle' operation in place of the 'Spectral Euclidean Distance' in the clustering process. In this study, our algorithm uses the unit vector instead of the spectral distance to compute the cluster mean in the unsupervised classification. The proposed algorithm (MUSAC) is applied to Hyperion and ETM+ data, and the results are compared with those of K-Means and the former USAC algorithm (FUSAC). USAC is capable of clearly classifying water and dark forest areas and produces more accurate results than K-Means. Atmospheric correction was applied to the Hyperion data (Hyperion-FLAASH) in pursuit of more accurate results, but it had no effect on the accuracy. We therefore anticipate that the 'Spectral Angle' can be one of the most accurate classifiers for hyperspectral as well as multispectral images. Furthermore, the cluster unit vector can be an efficient technique for determining each cluster mean in USAC.
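The difference between the spectral angle and the Euclidean distance, and the use of unit vectors for a cluster mean, can be illustrated with a short sketch. This is not the authors' MUSAC code: the three-band pixel values and the function names `unit`, `spectral_angle`, `unit_vector_mean`, and `assign` are hypothetical, and pixels are assumed to be non-zero vectors.

```python
import math


def unit(v):
    """Scale a non-zero spectrum to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)


def spectral_angle(a, b):
    """Angle in radians between two spectra; insensitive to brightness scaling."""
    cos = sum(x * y for x, y in zip(unit(a), unit(b)))
    return math.acos(max(-1.0, min(1.0, cos)))  # clamp against rounding error


def unit_vector_mean(pixels):
    """Cluster mean as the normalized sum of the members' unit vectors."""
    summed = [sum(col) for col in zip(*(unit(p) for p in pixels))]
    return unit(summed)


def assign(pixel, centers):
    """Index of the center with the minimum spectral angle to the pixel."""
    return min(range(len(centers)),
               key=lambda j: spectral_angle(pixel, centers[j]))


# Hypothetical 3-band pixels: 'bright' has the same spectral shape as 'dark'
# at ten times the brightness; 'other' has a different shape.
dark = (10.0, 20.0, 30.0)
bright = (100.0, 200.0, 300.0)
other = (30.0, 20.0, 10.0)

angle_same_shape = spectral_angle(dark, bright)
angle_other_shape = spectral_angle(dark, other)

centers = [unit_vector_mean([dark, bright]), unit(other)]
label = assign((5.0, 10.0, 15.0), centers)
```

In this toy case the Euclidean distance places `dark` closer to `other` than to `bright`, while the spectral angle groups `dark` with `bright`, which is the behavior the abstract attributes to angle-based clustering in dark areas.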

Identification of failure mechanisms for CFRP-confined circular concrete-filled steel tubular columns through acoustic emission signals

  • Li, Dongsheng;Du, Fangzhu;Chen, Zhi;Wang, Yanlei
    • Smart Structures and Systems
    • /
    • Vol. 18, No. 3
    • /
    • pp.525-540
    • /
    • 2016
  • The CFRP-confined circular concrete-filled steel tubular column is composed of concrete, steel, and CFRP, and its failure mechanics are complex. The main difficulty is the lack of an available method for relating a specific damage mechanism to its acoustic emission (AE) characteristic parameters. In this study, the AE technique was used to monitor the evolution of damage in CFRP-confined circular concrete-filled steel tubular columns. A fuzzy c-means method was developed to determine the relationship between the AE signals and the failure mechanisms. The cluster analysis results indicate that the main AE sources are of five types: matrix cracking, debonding, fiber fracture, steel buckling, and concrete crushing. The technique not only separates the five types of damage sources completely but also makes the damage evolution process easier to judge. Furthermore, typical damage waveforms were analyzed by wavelet analysis based on the cluster results, and the damage modes were determined from the frequency distribution of the AE signals.
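For readers unfamiliar with fuzzy c-means, the standard algorithm can be sketched as below. This is the textbook formulation with fuzzifier m = 2, not the modified method developed in the paper; the two-feature "signals", the choice of two clusters, and the function name `fuzzy_c_means` are hypothetical.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means.

    Returns (centers, u), where u[i][j] is the membership of point i in
    cluster j. Memberships are computed from the centers of the last
    completed iteration, which match the returned centers to within tol
    once the loop has converged.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, c)
    n, dims = len(points), len(points[0])
    u = [[0.0] * c for _ in range(n)]
    for _ in range(iters):
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1)).
        for i, p in enumerate(points):
            d = [max(math.dist(p, ctr), 1e-12) for ctr in centers]
            for j in range(c):
                u[i][j] = 1.0 / sum(
                    (d[j] / d[k]) ** (2.0 / (m - 1.0)) for k in range(c)
                )
        # Center update: mean of the points weighted by u_ij^m.
        new_centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            new_centers.append(tuple(
                sum(wi * p[dim] for wi, p in zip(w, points)) / total
                for dim in range(dims)
            ))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


# Hypothetical two-feature signal descriptors drawn from two sources.
rng = random.Random(2)
signals = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(30)]
signals += [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(30)]

centers, u = fuzzy_c_means(signals, c=2)
# Hard labels, taken as the cluster with the largest membership.
labels = [max(range(2), key=lambda j: row[j]) for row in u]
```

Unlike k-means, each signal receives a graded membership in every cluster, so signals lying between two damage types are not forced into one of them until a hard label is taken.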