• Title/Summary/Keyword: 군집 자료 (cluster data)


A Study of the Fuzzy Clustering Algorithm using a Growth Curve Model (성장곡선을 이용한 퍼지군집분석 기법의 연구)

  • 김응환;이석훈
    • The Korean Journal of Applied Statistics
    • /
    • v.14 no.2
    • /
    • pp.439-448
    • /
    • 2001
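  • This study proposes an algorithm that extends fuzzy k-means clustering to the analysis of longitudinal data. The core idea is to combine the fuzzy k-means algorithm with a growth curve fitted to each individual. The results show that the resulting clusters can be expressed as growth curve models and that the estimated model equations can be used to classify new individuals. The method can also be applied to classify a young individual that has not yet fully grown into the cluster it will belong to in the future, and to predict its subsequent growth. The proposed algorithm is illustrated with an application to maxillary sinus data from macaques.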
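A minimal sketch of the general idea, not the authors' algorithm: it assumes a straight-line growth curve per individual, runs a standard fuzzy k-means on the fitted (intercept, slope) pairs, and then classifies a partially observed individual. The function names, the toy measurements, and the two-step simplification are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_line(times, values):
    """Least-squares fit of value = a + b * time; returns (a, b)."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    b = sxy / sxx
    return (v_mean - b * t_mean, b)

def memberships(x, centers, m=2.0):
    """Fuzzy membership of point x in each cluster (fuzzifier m > 1)."""
    # Clamp distances so a point sitting exactly on a center does not divide by zero.
    d2 = [max(sq_dist(x, c), 1e-12) for c in centers]
    return [1.0 / sum((dj / dk) ** (1.0 / (m - 1.0)) for dk in d2) for dj in d2]

def fuzzy_kmeans(points, centers, m=2.0, iters=100):
    """Alternate membership and center updates for a fixed number of iterations."""
    dims = range(len(points[0]))
    for _ in range(iters):
        u = [memberships(p, centers, m) for p in points]
        new_centers = []
        for j in range(len(centers)):
            w = [row[j] ** m for row in u]
            total = sum(w)
            new_centers.append(
                tuple(sum(wi * p[d] for wi, p in zip(w, points)) / total for d in dims)
            )
        centers = new_centers
    return centers, [memberships(p, centers, m) for p in points]

# Hypothetical toy data: four individuals measured at the same ages.
ages = [0, 1, 2, 3]
sizes = [
    [1.0, 2.0, 3.0, 4.0],
    [1.2, 2.1, 3.2, 4.1],
    [1.0, 4.0, 7.0, 10.0],
    [0.8, 4.1, 6.9, 10.2],
]
curves = [fit_line(ages, s) for s in sizes]
# Assumption: start from one slow grower and one fast grower as initial centers.
centers, u = fuzzy_kmeans(curves, centers=[curves[0], curves[2]])

# A young individual observed only at ages 0 and 1.
young = fit_line([0, 1], [1.1, 3.9])
young_u = memberships(young, centers)
label = max(range(len(centers)), key=lambda j: young_u[j])
a, b = centers[label]
predicted_size_at_3 = a + b * 3  # extrapolate along the cluster's center curve
```

Because each cluster center is itself a pair of curve coefficients, the center can be read as a growth curve and used to extrapolate, which mirrors the classification and prediction use described in the abstract.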


A Study On Predicting Stock Prices Of Hallyu Content Companies Using Two-Stage k-Means Clustering (2단계 k-평균 군집화를 활용한 한류컨텐츠 기업 주가 예측 연구)

  • Kim, Jeong-Woo
    • Journal of the Korea Convergence Society
    • /
    • v.12 no.7
    • /
    • pp.169-179
    • /
    • 2021
  • This study shows that a two-stage k-means clustering method can improve stock price prediction. To this end, it introduces the two-stage k-means clustering algorithm and compares its prediction performance with that of various machine learning techniques. The method selects the cluster obtained from k-means clustering that is closest to the prediction target, then reapplies k-means to that cluster to find a sub-cluster closer to the actual value. The predicted values of this method are shown to be closer to the actual stock prices than those of the other machine learning techniques. Furthermore, the predictions remain relatively stable even though a relatively small cluster is used. The method can therefore improve both the accuracy and the stability of prediction, and may be a useful clustering method for small data sets. Extending two-stage k-means clustering to large-scale data is left for future work.
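The two-stage selection described above can be sketched as follows. This is one possible reading of the abstract rather than the paper's implementation: the function names, the use of the sub-cluster's mean target as the prediction, and the toy numbers are assumptions.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        labels = assign(points, centroids)
        new_centroids = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assign(points, centroids)

def nearest_cluster(features, targets, query, k, seed=0):
    """Cluster the data and keep only the non-empty cluster closest to the query."""
    centroids, labels = kmeans(features, k, seed=seed)
    best = min(sorted(set(labels)), key=lambda j: sq_dist(query, centroids[j]))
    idx = [i for i, l in enumerate(labels) if l == best]
    return [features[i] for i in idx], [targets[i] for i in idx]

def two_stage_predict(features, targets, query, k1=2, k2=2, seed=0):
    """Stage 1 narrows to one cluster; stage 2 re-clusters inside it."""
    f1, t1 = nearest_cluster(features, targets, query, k1, seed)
    if len(f1) > k2:  # only re-cluster when enough points remain
        f1, t1 = nearest_cluster(f1, t1, query, k2, seed)
    return sum(t1) / len(t1)  # assumption: predict with the sub-cluster's mean target

# Hypothetical toy data: one feature per observation and the value to predict.
features = [(0.0,), (0.1,), (1.0,), (1.1,), (10.0,), (10.1,), (11.0,), (11.1,)]
targets = [1.0, 1.2, 5.0, 5.2, 50.0, 51.0, 60.0, 61.0]
prediction = two_stage_predict(features, targets, query=(0.05,))
```

In this toy run the first stage keeps the four observations with small feature values and the second stage narrows them to the two nearest the query, so the prediction is the mean of their targets.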

Study on Application of Neural Network for Unsupervised Training of Remote Sensing Data (신경망을 이용한 원격탐사자료의 군집화 기법 연구)

  • 김광은;이태섭;채효석
    • Spatial Information Research
    • /
    • v.2 no.2
    • /
    • pp.175-188
    • /
    • 1994
  • A competitive learning network is proposed as an unsupervised training method for remote sensing data. Its performance and computational requirements were compared with conventional clustering techniques such as Sequential and K-Means clustering. An airborne remote sensing data set was used to study the performance of these classifiers. The proposed algorithm required slightly more computation time than the conventional techniques, but its performance was found to be slightly better than that of the Sequential and K-Means clustering techniques.
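For readers unfamiliar with competitive learning, the basic winner-take-all update can be sketched as below. The abstract does not give the network's details, so the learning rate, the number of epochs, the initial weights, and the two-band toy "pixels" are all assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def winner(x, weights):
    """Index of the unit whose weight vector is closest to sample x."""
    return min(range(len(weights)), key=lambda j: sq_dist(x, weights[j]))

def train_competitive(samples, weights, lr=0.1, epochs=50):
    """Winner-take-all learning: only the closest unit moves toward each sample."""
    weights = [list(w) for w in weights]
    for _ in range(epochs):
        for x in samples:
            j = winner(x, weights)
            weights[j] = [w + lr * (xi - w) for w, xi in zip(weights[j], x)]
    return weights

# Hypothetical two-band pixel values forming two spectral groups.
pixels = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (5.0, 5.0), (5.2, 5.0), (5.0, 5.2)]
weights = train_competitive(pixels, weights=[(1.0, 1.0), (4.0, 4.0)])
labels = [winner(p, weights) for p in pixels]
```

After training, each unit's weight vector sits near one group of pixels, and assigning every pixel to its winning unit yields the cluster labels.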


Comparison of the Cluster Validation Techniques using Gene Expression Data (유전자 발현 자료를 이용한 군집 타당성분석 기법 비교)

  • Jeong, Yun-Kyoung;Baek, Jang-Sun
    • 한국데이터정보과학회:학술대회논문집
    • /
    • 2006.04a
    • /
    • pp.63-76
    • /
    • 2006
  • Several clustering algorithms for analyzing gene expression data, and cluster validation techniques that assess the quality of their outcomes, have been suggested, but these validation techniques have seldom been evaluated. In this paper we compare various cluster validity indices on simulated data and real genomic data, and find that Dunn's index is more effective and robust in small simulations and on real gene expression data.
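As a reminder of what Dunn's index measures, here is a small sketch using its common definition: the smallest distance between points in different clusters divided by the largest within-cluster diameter. The paper may use a variant, and the function name and toy points are assumptions.

```python
import math
from itertools import combinations

def dunn_index(points, labels):
    """Smallest between-cluster distance divided by largest cluster diameter."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        raise ValueError("Dunn's index needs at least two clusters")
    # Largest distance between two points of the same cluster.
    max_diameter = max(
        (math.dist(a, b) for members in clusters.values()
         for a, b in combinations(members, 2)),
        default=0.0,
    )
    if max_diameter == 0.0:
        raise ValueError("every cluster is a single point; the index is undefined")
    # Smallest distance between two points of different clusters.
    min_separation = min(
        math.dist(a, b)
        for c1, c2 in combinations(clusters.values(), 2)
        for a in c1 for b in c2
    )
    return min_separation / max_diameter

# Hypothetical toy data: two tight pairs that are far apart.
points = [(0, 0), (0, 1), (5, 0), (5, 1)]
good = dunn_index(points, [0, 0, 1, 1])  # compact, well-separated clusters
bad = dunn_index(points, [0, 1, 0, 1])   # each cluster spans both pairs
```

Higher values indicate compact, well-separated clusters, so the natural grouping scores higher than the mixed one.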


A Fusion of the Period Characterized and Hierarchical Bayesian Techniques for Efficient Cluster Analysis of Time Series Data (시계열자료의 효율적 군집분석을 위한 구간특징화와 계층적 베이지안 기법의 융합)

  • Jung, Young-Ae;Jeon, Jin-Ho
    • Journal of Digital Convergence
    • /
    • v.13 no.7
    • /
    • pp.169-175
    • /
    • 2015
  • An effective way to understand a dynamic system that evolves over time, such as valuations, is to build a model that analyzes the system's behavior. For time series collected from a population, it is more efficient to cluster the data into a given number of subgroups and determine a model for each subgroup than to examine all the series at once. This study proposes a two-step process in which the population is first divided into subgroups and a model is then determined for each cluster, combining period characterization of the subgroups with a heuristic Bayesian clustering technique. Experiments on actual valuation data confirmed that the process reduces computation time and cost.

Strategy for Visual Clustering (시각적 군집분석에 대한 전략)

  • 허문열
    • The Korean Journal of Applied Statistics
    • /
    • v.14 no.1
    • /
    • pp.177-190
    • /
    • 2001
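  • Traditional clustering methods classify objects based on the distances between them, so different distance measures lead to different clustering methods. Whichever method is applied, the result is expressed as fixed numerical values. Reducing the structure of multidimensional data to a few numbers inevitably loses information. To compensate, some studies have used visual media to explore the structure of multidimensional data, an approach known as visual clustering. This paper reviews the basic concepts of visual clustering, the use of statistical graphics for it, and how it can be implemented.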


Tree Based Cluster Analysis Using Reference Data (배경자료를 이용한 나무구조의 군집분석)

  • 최대우;구자용;최용석
    • The Korean Journal of Applied Statistics
    • /
    • v.17 no.3
    • /
    • pp.535-545
    • /
    • 2004
  • The clustering method suggested in this paper produces clusters based on the 'rules of variables' by merging the 'training' and the identically structured reference data and then by filtering it to obtain the clusters of the 'training data' through the use of the 'tree classification model'. The reference dataset is generated by spatially contrasting it to the 'training data' through the 'reverse arcing' algorithm to effectively identify the clusters. The strength of this method is that it can be applied even to the mixture of continuous and discrete types of 'training data' and the performance of this algorithm is illustrated by applying it to the simulated data as well as to the actual data.

Plant Community Structure Analysis in Jujeongol Valley of Soraksan National Park (설악산 국립공원 주전골계곡 식물군집구조분석)

  • 이경재;민성환;한봉호
    • Korean Journal of Environment and Ecology
    • /
    • v.10 no.2
    • /
    • pp.283-296
    • /
    • 1997
  • To investigate the plant community structure of the valley and suggest management measures for the National Park, fifty plots were set up and surveyed in Jujeongol Valley, Soraksan National Park. TWINSPAN classification and DCA ordination were applied to classify the study area into several groups based on woody plants. The resulting groups were a Quercus mongolica - Q. variabilis - Pinus densiflora community, a P. densiflora community, a Carpinus laxiflora community, and a Q. serrata community. The ecological trends indicated by DCA ordination and DBH class distribution analysis suggest that the Q. mongolica - Q. variabilis - P. densiflora community and the P. densiflora community are shifting from P. densiflora toward Q. mongolica, and that the Q. serrata community is shifting toward C. laxiflora, while C. laxiflora is expected to remain in a stable state.


Categorical time series clustering: Case study of Korean pro-baseball data (범주형 시계열 자료의 군집화: 프로야구 자료의 사례 연구)

  • Pak, Ro Jin
    • Journal of the Korean Data and Information Science Society
    • /
    • v.27 no.3
    • /
    • pp.621-627
    • /
    • 2016
  • Some professional baseball teams tend to be very weak against one particular opponent. For example, team S, the strongest team in Korea, is relatively weak against team H. In this paper, we cluster the Korean baseball teams based on their records against team S to investigate whether the pattern of team H's record differs from those of the other teams. The technique we employ is 'time series clustering', or more specifically 'categorical time series clustering'. Three methods are considered: (i) a distance-based method, (ii) a genetic sequencing method, and (iii) a periodogram method. Each method has its own advantages and disadvantages in handling categorical time series, so we recommend drawing conclusions by considering the results of all three methods together.

Functional clustering for clubfoot data: A case study (클럽발 자료를 위한 함수적 군집 분석: 사례연구)

  • Lee, Miae;Lim, Johan;Park, Chungun;Lee, Kyeong Eun
    • Journal of the Korean Data and Information Science Society
    • /
    • v.25 no.5
    • /
    • pp.1069-1077
    • /
    • 2014
  • A clubfoot is a congenital deformity in which the foot is rotated inward at the ankle. In this paper, we cluster the curves of relative differences between regular and operated feet. Because these curves are irregularly and sparsely sampled, general clustering models cannot be applied. We therefore apply the clustering model for sparsely sampled functional data of James and Sugar (2003), estimating the parameters with the EM algorithm. The number of clusters is determined by the distortion function (Sugar and James, 2003), and two clusters of curves are found.