• Title/Abstract/Keywords: Hierarchical Clustering Analysis

Load Forecasting using Hierarchical Clustering Method for Building

  • 황혜미;이성희;박종배;박용기;손성용
    • The Transactions of the Korean Institute of Electrical Engineers / Vol. 64, No. 1 / pp.41-47 / 2015
  • In recent years, growing interest in energy efficiency has led to more energy supply cases that take advantage of an EMS (Energy Management System). An important factor for effective and economical EMS operation is the supply and demand plan. This study forecasts the hourly power demand of a building load using the hierarchical clustering method, one of a variety of statistical techniques, applied to real historical data of the target load. The reliability of the estimated results is confirmed through separate validity tests.
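
The abstract does not say how the clusters feed the forecast, so the following is only a minimal sketch under stated assumptions: daily load profiles are grouped by agglomerative clustering with average linkage, and the mean profile of a cluster serves as the hourly forecast for days of that type. The sample values and the function names `agglomerate` and `mean_profile` are hypothetical, not taken from the paper.

```python
import math

def average_linkage(a, b, profiles):
    """Mean Euclidean distance between all cross-cluster pairs of profiles."""
    total = sum(math.dist(profiles[i], profiles[j]) for i in a for j in b)
    return total / (len(a) * len(b))

def agglomerate(profiles, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = average_linkage(clusters[x], clusters[y], profiles)
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters

def mean_profile(cluster, profiles):
    """Hour-by-hour mean load of the days in one cluster (assumed forecast)."""
    n = len(cluster)
    hours = len(profiles[0])
    return [sum(profiles[i][h] for i in cluster) / n for h in range(hours)]

# Hypothetical data: four days, three hourly readings each (kW).
profiles = [[10, 20, 15], [11, 21, 16], [30, 50, 40], [31, 49, 41]]
clusters = agglomerate(profiles, n_clusters=2)
forecasts = [mean_profile(c, profiles) for c in clusters]
```

The pairwise search is quadratic in the number of clusters at every merge, which is acceptable for a few hundred daily profiles but not for large data sets.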

Learning Single-Issue Negotiation Strategies Using Hierarchical Clustering Method

  • 전진;김창욱;박세진;김성식
    • Journal of the Korean Institute of Industrial Engineers / Vol. 27, No. 2 / pp.214-225 / 2001
  • This research deals with an off-line learning method for systematically constructing negotiation strategies in automated electronic commerce. Single-issue negotiation is assumed. Variants of competitive learning and the hierarchical clustering method are devised and applied to extract negotiation strategies from a historical negotiation data set and tactics. Our research is motivated by the following fact: evidence from both theoretical analysis and observations of human interaction shows that if decision makers have prior knowledge of their opponents' negotiation behavior, the overall payoff increases. Simulation-based experiments indicate that the proposed method is more effective than human negotiation in terms of the negotiation settlement ratio and the resulting payoff.

A New Approach for Hierarchical Dividing to Passenger Nodes in Passenger Dedicated Line

  • Zhao, Chanchan;Liu, Feng;Hai, Xiaowei
    • Journal of Information Processing Systems / Vol. 14, No. 3 / pp.694-708 / 2018
  • China has a large-scale passenger dedicated line system, with unevenly distributed passenger flow intensity and passenger nodes with complicated relations. Consequently, the significance of passenger nodes should be considered and their dissimilarity analyzed when compiling passenger train operation plans and conducting transportation allocation. For this purpose, the passenger nodes need to be divided hierarchically. To address problems in current research, such as a hierarchical dividing process that is vulnerable to subjective factors and local optima, we propose a clustering approach based on the self-organizing map (SOM) and k-means, and use it to divide the passenger nodes of passenger dedicated lines hierarchically. Specifically, objective passenger node parameters are selected and SOM is first used to give a preliminary clustering of the passenger nodes; second, the Davies-Bouldin index is used to determine the number of clusters; and third, k-means is used to perform accurate clustering, yielding the hierarchical division of passenger nodes. An example analysis demonstrates the feasibility and rationality of the algorithm.
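
The second step, choosing the number of clusters with the Davies-Bouldin index, can be illustrated with a short sketch. It covers only the index itself, using the standard definition (mean over clusters of the worst ratio of within-cluster scatter to between-centroid distance); the SOM and k-means stages are not reproduced, and the sample points and labelings are hypothetical. Lower values indicate more compact, better separated clusters, so the candidate partition with the smallest index would be preferred.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def davies_bouldin(points, labels):
    """Davies-Bouldin index of a labeling (needs >= 2 clusters with distinct centroids)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    ids = sorted(clusters)
    cents = {i: centroid(clusters[i]) for i in ids}
    # Scatter: mean distance from each member to its cluster centroid.
    scatter = {
        i: sum(math.dist(p, cents[i]) for p in clusters[i]) / len(clusters[i])
        for i in ids
    }
    total = 0.0
    for i in ids:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in ids if j != i
        )
    return total / len(ids)

# Hypothetical node parameters in 2-D and two candidate partitions.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
db_good = davies_bouldin(points, [0, 0, 1, 1])  # groups nearby points
db_bad = davies_bouldin(points, [0, 1, 0, 1])   # splits nearby points
```

In practice the index would be evaluated on the k-means result for each candidate number of clusters, and the minimum taken.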

CLUSTERING DNA MICROARRAY DATA BY STOCHASTIC ALGORITHM

  • Shon, Ho-Sun;Kim, Sun-Shin;Wang, Ling;Ryu, Keun-Ho
    • Korean Society of Remote Sensing: Conference Proceedings / Korean Society of Remote Sensing 2007, Proceedings of ISRS 2007 / pp.438-441 / 2007
  • Thanks to recent advances in molecular biology and engineering, DNA microarrays make it possible to observe thousands of genes and their state of variation in tissue samples from living organisms. With DNA microarrays, genes with similar expression patterns can be grouped and the progression and variation of genes can be tracked. This paper applies cluster analysis, which aims to discover biological subgroups or classes from gene expression information. To predict new, previously unknown classes, open leukaemia data are used for the experiment and the MCL (Markov CLustering) algorithm is applied as the analysis method. The MCL algorithm is based on probability and graph flow theory: it simulates random walks on a graph, using Markov matrices to determine the transition probabilities among the nodes. In detail, the MCL algorithm is applied after distances are computed with the Euclidean distance; next, the inflation and diagonal factors, which are tuning parameters, are tuned; finally, a threshold based on the average of each column is computed to distinguish one class from another. Using this threshold, the average of each column, improved the accuracy of our method. Our experiments show an average accuracy of about 70% against the previously known classes. For comparative evaluation, the proposed method was compared with the SOM (Self-Organizing Map) clustering algorithm, a neural network approach, and with hierarchical clustering. The method shows a better result than hierarchical clustering. Further study should examine whether similar results are obtained when the inflation parameter found in our experiment is applied to other gene expression data. We are also working on a systematic method to improve the accuracy by regulating the factors mentioned above.
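
The random-walk idea behind MCL can be sketched in a few lines. This is the textbook form of the algorithm (self-loops, column normalization, then alternating expansion and inflation), not the paper's variant: the Euclidean-distance graph construction and the column-average threshold are not reproduced, the fixed cutoff `1e-6` is an assumption, and the example graph is hypothetical.

```python
def normalize_columns(m):
    """Scale every column to sum to 1, giving a column-stochastic matrix."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]

def mcl(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Markov clustering on a symmetric adjacency matrix (list of lists)."""
    n = len(adjacency)
    # Self-loops keep every column non-empty and damp oscillation.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        # Expansion: one more random-walk step (matrix squaring).
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
                    for i in range(n)]
        # Inflation: element-wise power, then renormalize, to favor strong flows.
        inflated = normalize_columns([[v ** inflation for v in row]
                                      for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row that still carries flow lists the members of one cluster.
    clusters = []
    for row in m:
        members = [j for j, v in enumerate(row) if v > 1e-6]
        if members and members not in clusters:
            clusters.append(members)
    return clusters

if __name__ == "__main__":
    # Hypothetical graph: two triangles joined by the single edge 2-3.
    graph = [
        [0, 1, 1, 0, 0, 0],
        [1, 0, 1, 0, 0, 0],
        [1, 1, 0, 1, 0, 0],
        [0, 0, 1, 0, 1, 1],
        [0, 0, 0, 1, 0, 1],
        [0, 0, 0, 1, 1, 0],
    ]
    print(mcl(graph))
```

A larger `inflation` value generally yields more, smaller clusters, which is why the abstract treats it as a tuning parameter.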

On the Categorical Variable Clustering

  • Kim, Dae-Hak
    • Journal of the Korean Data and Information Science Society / Vol. 7, No. 2 / pp.219-226 / 1996
  • The basic objective of cluster analysis is to discover natural groupings of items or variables. In general, variable clustering has been based on similarity measures between variables with binary characteristics. We propose a variable clustering method for variables that have more than two categories, ordered in some sense. We also consider some measures of association as similarities between variables. A numerical example is included.

Application of Clustering Methods for Interpretation of Petroleum Spectra from Negative-Mode ESI FT-ICR MS

  • Yeo, In-Joon;Lee, Jae-Won;Kim, Sung-Hwan
    • Bulletin of the Korean Chemical Society / Vol. 31, No. 11 / pp.3151-3155 / 2010
  • This study was performed to develop analytical methods to better understand the properties and reactivity of petroleum, which is a highly complex organic mixture, using high-resolution mass spectrometry and statistical analysis. Ten crude oil samples were analyzed using negative-mode electrospray ionization Fourier transform ion cyclotron resonance mass spectrometry (ESI FT-ICR MS). Clustering methods, including principal component analysis (PCA), hierarchical clustering analysis (HCA), and k-means clustering, were used to comparatively interpret the spectra. All the methods were consistent and showed that oxygen- and sulfur-containing heteroatom species played important roles in clustering samples or peaks. The oxygen-containing samples had higher acidity than the other samples, and the clustering results were linked to properties of the crude oils. This study demonstrated that clustering methods provide a simple and effective way to interpret complex petroleomic data.

Similarity Analysis of Hospitalization using Crowding Distance

  • Jung, Yong Gyu;Choi, Young Jin;Cha, Byeong Heon
    • International journal of advanced smart convergence / Vol. 5, No. 2 / pp.53-58 / 2016
  • With the growing use of big data and data mining, it is useful to understand how such techniques can reveal relationships in the healthcare field. This study uses hierarchical methods of data analysis to explore similarities in hospitalization across several New York State counties. It measures the crowding distance of age-specific hospitalization periods, where crowding distance is defined as the longest distance, or least similarity, between cities. Clinton is expected to have the greatest distance, while Albany and the other cities are closer because they are connected by the shortest distance at each step. Similarities were stronger across hospital stays categorized by age. Hierarchical clustering with the crowding distance measure can be applied to predict the similarity of hospitalization data across the 10 cities. To enhance the performance of hierarchical clustering, text is first converted to an attribute vector and crowding distance is applied, after which congestion distances can be compared. The similarity between two objects depends on the measurement method used in clustering, but it is distinct from distance: the smaller the distance, the more similar the two objects are. With this technique, the crowding distance decreases consistently as the similarity between the data increases, which improves the performance of the experiments. Furthermore, the similarity of hospitalization periods by city is expected to be useful when constructing hospital wards: by referring to the experimental results, the required size of hospital facilities can be estimated and patients in similar areas can be managed efficiently.

An Analysis of the Hierarchical Agglomerative Clustering based on various Compound Noun Indexing Method

  • 양명석;최성필
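    • Korean Institute of Information Scientists and Engineers: Conference Proceedings / Korean Institute of Information Scientists and Engineers 2002 Fall Conference Proceedings Vol.29 No.2 (2) / pp.697-699 / 2002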
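  • This paper analyzes the results of a hierarchical agglomerative document clustering system under several indexing methods for compound nouns. We first describe the Korean indexing engine and the HAC (Hierarchical Agglomerative Clustering) engine, and then the three compound noun analysis modes provided by the Korean indexing engine. We also describe the features of the implemented clustering engine and the techniques used to improve its speed. The experiments present an analysis of document sets clustered under various factors. The analysis of the experimental results shows that the indexing method used for compound nouns directly affects the document clustering results.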

A Study of HME Model in Time-Course Microarray Data

  • Myoung, Sung-Min;Kim, Dong-Geon;Jo, Jin-Nam
    • The Korean Journal of Applied Statistics / Vol. 25, No. 3 / pp.415-422 / 2012
  • For statistical microarray data analysis, clustering is a useful exploratory technique that offers the promise of simultaneously studying the variation of many genes. However, most proposed clustering methods do not rigorously handle the clustering of time-course microarray data or the fitting of a time covariate; a statistical method is therefore needed to form clusters and represent the linear trend of each cluster. In this research, we developed a modified hierarchical mixture of experts (HME) model to cluster the data and characterize each cluster using a linear mixed effects model. The feasibility of the proposed method is illustrated by an application to the human fibroblast data of Iyer et al. (1999).

RAG-based Hierarchical Classification (2)

  • 이상훈
    • Korean Journal of Remote Sensing / Vol. 22, No. 6 / pp.613-619 / 2006
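  • This study proposes unsupervised image classification through the dendrogram of agglomerative hierarchical clustering, as a higher level of image segmentation in remote sensing image processing. The proposed algorithm is a hierarchical clustering method that performs merging while searching for the set of MCSNPs (Mutual Closest Spectral Neighbor Pairs), using a RAG (Regional Adjacency Graph) defined in the spectral domain and a min-heap data structure. To improve efficiency in computation time and memory use, multiple windows in the spectral space are used to define spectral adjacency, and the RAG, which changes with each merge, is updated using the RNV (Region Neighbor Vector). Given an appropriate number of steps, the proposed algorithm produces a dendrogram from which the hierarchical relations of the cluster merges are easily interpreted. The algorithm was evaluated extensively with simulation data, and the results demonstrated its efficiency. In addition, its application to a very large QuickBird image acquired over the Korean Peninsula shows that the proposed algorithm is a powerful tool for unsupervised image classification.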
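
The search for mutual closest neighbor pairs on an adjacency graph can be illustrated with a small sketch. It is a simplified, single-pass illustration and not the paper's algorithm: regions carry one scalar spectral mean (an assumption), the multi-window spectral space and the RNV update are omitted, and the names `closest_neighbor` and `mutual_closest_pairs` are hypothetical. A min-heap orders the mutual pairs so that the most similar pair would be merged first.

```python
import heapq

def closest_neighbor(region, means, adjacency):
    """Adjacent region with the nearest spectral mean (ties go to the lower id)."""
    return min(adjacency[region],
               key=lambda nb: (abs(means[region] - means[nb]), nb))

def mutual_closest_pairs(means, adjacency):
    """Return (distance, a, b) for pairs that are each other's closest neighbor."""
    heap = []
    for r in adjacency:
        if not adjacency[r]:
            continue  # isolated region, nothing to pair with
        nb = closest_neighbor(r, means, adjacency)
        # r < nb records each mutual pair only once.
        if r < nb and closest_neighbor(nb, means, adjacency) == r:
            heapq.heappush(heap, (abs(means[r] - means[nb]), r, nb))
    # Pop in ascending order of spectral distance.
    return [heapq.heappop(heap) for _ in range(len(heap))]

# Hypothetical regions on a chain 0-1-2-3-4 with scalar spectral means.
means = {0: 1.0, 1: 1.2, 2: 5.0, 3: 5.1, 4: 9.0}
adjacency = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
pairs = mutual_closest_pairs(means, adjacency)
merge_order = [(a, b) for _, a, b in pairs]
```

A full clustering loop would merge each returned pair, update the means and adjacency of the merged region, and repeat until the desired number of clusters remains, recording each merge to build the dendrogram.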