• Title/Summary/Keyword: number of clusters (군집의 수)

Multi-hierarchical Density-based Clustering Method (다계층 밀도기반 군집화 기법)

  • Shin, Dong Mun;Jung, Suk Ho;Yi, Gyeong Min;Lee, Dong Gyu;Sohn, GyoYong;Ryu, Keun Ho
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2009.11a
    • /
    • pp.797-798
    • /
    • 2009
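  • Clustering is one of the data mining techniques suited to extracting useful information from large volumes of data. Because clustering can discover meaningful knowledge within a given data set without prior information, it can be applied to real-world applications without much difficulty. When handling large data sets, it can also reduce the number of accesses to individual data items and the size of the data structures an algorithm must handle. This paper proposes a new clustering method based on density-based clustering. Through an iterative clustering process, the proposed method can remove peripheral noise within clusters and subdivide groups more finely. It also represents clusters as a hierarchy, which helps in understanding the relationships among clusters. The proposed method is expected to classify clusters of varying densities effectively.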
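
The abstract above does not give the algorithm itself, so the following is only a rough sketch of the general idea: run a density-based clustering pass, then re-cluster each resulting cluster with a tighter radius to build a hierarchy. Plain DBSCAN is swapped in as the base step, and the names `dbscan`, `refine`, `shrink`, and `depth`, as well as the sample points, are assumptions made for illustration rather than details from the paper.

```python
from math import dist

def dbscan(points, eps, min_pts):
    """Plain DBSCAN: returns one label per point, -1 meaning noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Neighborhood includes the point itself.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1              # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster     # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:    # only core points keep expanding
                queue.extend(nbrs)
    return labels

def refine(points, eps, min_pts, shrink=0.5, depth=2):
    """Hypothetical refinement: re-cluster each cluster with a smaller eps."""
    labels = dbscan(points, eps, min_pts)
    tree = []
    for c in sorted(set(labels) - {-1}):    # noise (-1) is dropped at every level
        members = [p for p, l in zip(points, labels) if l == c]
        children = (refine(members, eps * shrink, min_pts, shrink, depth - 1)
                    if depth > 1 else [])
        tree.append({"points": members, "children": children})
    return tree

# Made-up sample: two tight groups that merge at a coarse radius, plus one outlier.
points = [(0.0, 0), (0.1, 0), (0.2, 0), (0.3, 0),
          (1.0, 0), (1.1, 0), (1.2, 0), (1.3, 0),
          (10.0, 10.0)]
tree = refine(points, eps=1.0, min_pts=3)
```

With this sample, the coarse pass groups the eight nearby points into one cluster and discards the outlier as noise, and the tighter pass splits that cluster into two sub-clusters, which is the kind of nested structure the abstract describes.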

Validation-based Clustering Algorithm (유효성 기반 군집화 알고리즘)

  • ;R.S. Ramakrishna
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2003.10a
    • /
    • pp.19-21
    • /
    • 2003
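  • This paper presents new solutions to the two most important problems in clustering. The first problem is the similarity decision, which determines whether two objects can belong to the same cluster; to solve it, we propose a similarity decision technique based on a cluster validity index. This technique turns a qualitative cognitive process into a quantitative comparison-based decision process. It forms the core of the validation-based clustering algorithm proposed in this paper, which consists of two parts, random clustering and full clustering, and it removes the need to determine the complex parameters required by many existing clustering algorithms. The second problem is finding the optimal number of clusters, which can also be found during full clustering by the proposed technique. Finally, experimental results demonstrating the effectiveness and efficiency of the proposed technique and clustering algorithm are presented.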

Enhancing Document Clustering using Important Term of Cluster and Wikipedia (군집의 중요 용어와 위키피디아를 이용한 문서군집 향상)

  • Park, Sun;Lee, Yeon-Woo;Jeong, Min-A;Lee, Seong-Ro
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.49 no.2
    • /
    • pp.45-52
    • /
    • 2012
  • This paper proposes a new method for enhancing document clustering that uses the important terms of clusters and Wikipedia. The proposed method represents the concepts of cluster topics well by selecting the important terms in each cluster using the semantic features of NMF. By expanding the important terms of each cluster with synonyms from Wikipedia, it addresses the "bag of words" problem, in which meaningful relationships between documents and clusters are not considered. It also improves the quality of document clustering by using the expanded important terms to refine the initial clusters through re-clustering. The experimental results demonstrate that the proposed method achieves better performance than other document clustering methods.

Document Clustering Method using Coherence of Cluster and Non-negative Matrix Factorization (비음수 행렬 분해와 군집의 응집도를 이용한 문서군집)

  • Kim, Chul-Won;Park, Sun
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.13 no.12
    • /
    • pp.2603-2608
    • /
    • 2009
  • Document clustering is an important method for document analysis and is used in many information retrieval applications. This paper proposes a new document clustering model that combines clustering based on NMF (non-negative matrix factorization) with refinement of the documents in each cluster using cluster coherence. The proposed method can improve the quality of document clustering because documents are re-assigned to clusters using cluster coherence based on the similarity between documents, and because the semantic feature matrix and the semantic variable matrix used in document clustering can better represent the inherent structure of the document set. The experimental results demonstrate that applying the proposed method to document clustering methods achieves better performance than those methods alone.

A Methodology to Establish Operational Strategies for Truck Platoonings on Freeway On-ramp Areas (고속도로 유입연결로 구간 화물차 군집운영전략 수립 방안 연구)

  • LEE, Seolyoung;OH, Cheol
    • Journal of Korean Society of Transportation
    • /
    • v.36 no.2
    • /
    • pp.67-85
    • /
    • 2018
  • Vehicle platooning has become feasible through wireless communication and automated driving technology. Platooning is a technique in which several vehicles travel at regular intervals while maintaining a minimum safety distance. Truck platooning is of keen interest because it contributes to preventing truck crashes and reducing vehicle emissions, in addition to increasing truck flow capacity. However, interactions between vehicle platoons and adjacent manually-driven vehicles (MV) significantly affect traffic flow performance. In particular, when vehicles entering from an on-ramp attempt to merge into the freeway mainline, platoon size and inter-platoon spacing must be adjusted properly to maximize traffic performance. This study developed a methodology for establishing operational strategies for truck platooning in freeway on-ramp areas. Average speed and conflict rate were used as measures of effectiveness (MOE) to evaluate operational efficiency and safety. Microscopic traffic simulation experiments using VISSIM were conducted to evaluate the effectiveness of various platooning scenarios. A decision-making process for selecting platoon operations that satisfy operational and safety requirements was proposed. The scenario with 50 m inter-platoon spacing and platoons of 6 vehicles outperformed the other scenarios. The proposed methodology would effectively support the realization of novel traffic management concepts in the era of automated driving.

Bayesian Methodology for Machine Learning: Focusing on Cluster Analysis (머신러닝을 위한 베이지안 방법론: 군집분석을 중심으로)

  • Kim, Yong-Dae;Jeong, Gu-Hwan
    • Information and Communications Magazine
    • /
    • v.33 no.10
    • /
    • pp.60-64
    • /
    • 2016
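  • This article briefly surveys Bayesian machine learning methodology. In particular, it explains how Bayesian methods are used in cluster analysis, a branch of unsupervised learning whose goal is to identify relationships among complex data. It briefly describes parametric Bayesian methods, which are used when the number of clusters is known in advance, and covers in detail nonparametric Bayesian methods, which can also infer the number of clusters.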
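
The abstract does not say which nonparametric model the article covers. As one common illustration of how the number of clusters can be left open, the sketch below samples a partition from a Chinese restaurant process, the prior that underlies Dirichlet process mixture models. The function name, the concentration parameter `alpha`, and the seed are assumptions for this example only.

```python
import random

def chinese_restaurant_process(n, alpha, seed=0):
    """Sample a partition of n items; returns the list of cluster sizes."""
    rng = random.Random(seed)
    sizes = []
    for i in range(n):                  # i items are already assigned
        # A new cluster is opened with probability alpha / (i + alpha);
        # existing cluster k is chosen with probability sizes[k] / (i + alpha).
        r = rng.uniform(0, i + alpha)
        acc = 0.0
        for k, s in enumerate(sizes):
            acc += s
            if r < acc:
                sizes[k] += 1
                break
        else:
            sizes.append(1)             # no existing cluster chosen: open a new one
    return sizes

sizes = chinese_restaurant_process(n=100, alpha=1.0)
num_clusters = len(sizes)               # not fixed in advance; it emerges from the draw
```

Larger values of `alpha` tend to produce more clusters. In a full mixture model this prior would be combined with a likelihood for the data, which is beyond what this sketch shows.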

Plant Community Structure Analysis in Noinbong area of Odaesan National Park (오대산 국립공원 노인봉지역 식물군집구조분석)

  • 최송현;권전오;민성환
    • Korean Journal of Environment and Ecology
    • /
    • v.9 no.2
    • /
    • pp.156-165
    • /
    • 1996
  • To investigate the forest structure and to suggest management of the vegetation landscape in the Noinbong area, Odaesan National Park, twelve plots were set up and surveyed. According to the classification analysis by TWINSPAN, the community was divided into two groups: a Carpinus laxiflora - Quercus mongolica community and a Betula costata - B. schmidtii - C. laxiflora community. The analysis of species structure, similarity index, and species diversity indicated that the successional stage of the Noinbong forests was climax and introduced-climax. There were about 120 to 130 individuals and 17 species per 100 m$^{2}$. Through the analysis of basal area and DBH class distribution, it was estimated that C. laxiflora, B. costata, and B. schmidtii will be the climax species instead of Q. mongolica in the tree layer, and that Acer pseudo-sieboldianum will be the dominant species in the subtree layer.

A Use of Expectation Maximization Clustering for Constructing a Markov Chain of Human Mobility Model (기대치 최대화 기반의 군집화를 통한 인간 이동 패턴의 마르코프 연쇄모델 도출)

  • Kim, Hyunuk;Song, Ha Yoon
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2012.04a
    • /
    • pp.864-867
    • /
    • 2012
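  • As people use portable positioning devices and smartphones, it has become possible to collect location data that records human movement. This paper uses such location data to build a model of human mobility. Movement data can be divided into a stay state and a moving state; of these, stay points can be clustered, and each cluster can appear as a single state in a Markov chain model. The transition probabilities between the states of the chain can also be computed from the movement data. Through this sequence of steps, this paper expresses human mobility as a continuous-time Markov chain model using expectation-maximization-based clustering. The model can also represent representative (macro) clusters and their subordinate (micro) clusters, which appear as small clusters nested inside large representative clusters.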
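
To make the transition-probability step concrete, here is a minimal sketch that estimates transition probabilities by counting moves between consecutive stay clusters. It is a simplification: it produces a discrete-time chain rather than the continuous-time model the paper describes, it assumes the expectation-maximization clustering has already assigned each stay to a cluster, and the cluster labels and visit sequence are made up.

```python
from collections import Counter, defaultdict

def transition_matrix(states):
    """Estimate P(next | current) from one observed sequence of states."""
    counts = defaultdict(Counter)
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    # Normalise each row of counts into probabilities.
    return {a: {b: n / sum(row.values()) for b, n in row.items()}
            for a, row in counts.items()}

# Hypothetical sequence of stay clusters visited by one person.
visits = ["home", "office", "cafe", "office", "home", "office", "home"]
P = transition_matrix(visits)
```

Each row of `P` sums to 1. A continuous-time version would additionally need the holding time spent in each state, which this sketch ignores.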

Analysis of the Plant Community Structure in Gayasan National Park by the Ordination and Classification Technique (Ordination 및 Classification 방법에 의한 가야산지구의 식물군집구조분석)

  • 이경재;조재창;우종서
    • Korean Journal of Environment and Ecology
    • /
    • v.3 no.1
    • /
    • pp.28-41
    • /
    • 1989
  • A survey of the Hongryu-Dong and Chi-in districts, Gayasan National Park, was conducted using 40 sample sites of 500 m$^2$ each. TWINSPAN classification confirmed a complex pattern of both local and geographical variation in the vegetation, with dry and wet community types. Within the dry community types, two floristic associations of Pinus densiflora were defined according to local variation. Within the wet community types, two floristic associations were defined according to altitude. These associations can be further subdivided floristically into eight subassociations. The vegetation pattern presented by DCA ordination corresponds to that of TWINSPAN at the first two divisions. The DCA ordination was successful in separating Pinus densiflora from broad-leaved forest. Ordination of samples produced arrangements reflecting the environmental gradient of the soil. The correlations between the first axis of DCA and soil moisture, soil acidity, altitude, maximum species diversity, and species diversity were significantly negative. The similarity index between communities was very low.

A Comparative Study on Statistical Clustering Methods and Kohonen Self-Organizing Maps for Highway Characteristic Classification of National Highway (일반국도 도로특성분류를 위한 통계적 군집분석과 Kohonen Self-Organizing Maps의 비교연구)

  • Cho, Jun Han;Kim, Seong Ho
    • KSCE Journal of Civil and Environmental Engineering Research
    • /
    • v.29 no.3D
    • /
    • pp.347-356
    • /
    • 2009
  • This paper describes a cluster analysis for classifying highways by traffic characteristics, as a departure from existing highway functional classification methodologies. The research focuses on comparing the performance of clustering techniques based on the total within-group errors and on deriving the optimal number of clusters. It analyzes statistical clustering methods (the hierarchical Ward's minimum-variance method and the nonhierarchical K-means method) and the Kohonen self-organizing maps clustering method for highway characteristic classification. The outcomes of the clustering techniques were compared in terms of the number of samples and the traffic characteristics of the subsets derived with the optimal number of clusters. Overall, the K-means method gives better results than the other methods for fewer than 12 clusters. For more than 20 clusters, Kohonen self-organizing maps give the best results among the clustering methods. The main contribution of this research is expected to be the highway characteristic classification it produces, which can serve as important basic road attribute information.
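
The abstract does not define "total within-group errors" precisely. Assuming it means the sum of squared distances from each sample to its cluster center, the sketch below shows how that error can be compared across candidate numbers of clusters with a small one-dimensional K-means. The traffic values, the quantile-based initialisation, and the function name are assumptions for illustration, not taken from the paper.

```python
def kmeans_1d(values, k, iters=50):
    """Lloyd's algorithm on 1-D data; returns (centers, total within-group squared error)."""
    data = sorted(values)
    # Deterministic initialisation: evenly spaced quantiles of the sorted data.
    centers = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # An empty group keeps its previous center.
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    error = sum(min((v - c) ** 2 for c in centers) for v in data)
    return centers, error

# Made-up traffic volumes (arbitrary units) with three visible groups.
volumes = [10, 11, 12, 50, 51, 52, 90, 91, 92]
errors = {k: kmeans_1d(volumes, k)[1] for k in (1, 2, 3)}
```

The error always shrinks as the number of clusters grows, so the usual practice is to look for the point where adding another cluster stops reducing it substantially, rather than simply minimising it.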