• Title/Abstract/Keywords: K-mean cluster analysis

Search results: 304 items (processing time: 0.023 seconds)

Known-Item Retrieval Performance of a PICO-based Medical Question Answering Engine

  • Vong, Wan-Tze;Then, Patrick Hang Hui
    • Asia pacific journal of information systems / Vol. 25, No. 4 / pp.686-711 / 2015
  • The performance of a novel medical question-answering engine called CliniCluster and of existing search engines (CQA-1.0, Google, and Google Scholar) was evaluated using known-item searching. A known item is a document that has been critically appraised to be highly relevant to a therapy question. Results show that, using CliniCluster, known items were retrieved on average at rank 2 ($MRR@10{\approx}0.50$), and most of the known items could be identified from the top-10 document lists. In response to ill-defined questions, the known items were ranked lower by CliniCluster and CQA-1.0, whereas for Google and Google Scholar no significant difference in ranking was found between well- and ill-defined questions. Less than 40% of the known items could be identified from the top-10 documents retrieved by CQA-1.0, Google, and Google Scholar. An analysis of the top-ranked documents by strength of evidence revealed that CliniCluster outperformed the other search engines by providing a higher number of recent publications with the highest level of study design. In conclusion, the overall results support the use of CliniCluster in answering therapy questions by ranking highly relevant documents in the top positions of the search results.
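
The MRR@10 figure quoted in this abstract can be illustrated with a minimal sketch. The function names and the toy document IDs below are hypothetical and are not taken from the paper; the sketch only shows the standard definition of mean reciprocal rank with a cutoff at 10, under which an average reciprocal rank of 0.50 corresponds to rank 2.

```python
def reciprocal_rank(ranked_ids, known_item, k=10):
    """Return 1/rank of known_item within the top k results, or 0.0 if absent."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id == known_item:
            return 1.0 / rank
    return 0.0


def mrr_at_k(runs, k=10):
    """Mean reciprocal rank over (ranked_ids, known_item) pairs."""
    if not runs:
        raise ValueError("runs must not be empty")
    return sum(reciprocal_rank(ranked, item, k) for ranked, item in runs) / len(runs)


# Hypothetical toy data: three questions, each with one known item.
runs = [
    (["d3", "d7", "d1"], "d7"),  # known item at rank 2 -> 0.5
    (["d2", "d9", "d4"], "d2"),  # known item at rank 1 -> 1.0
    (["d5", "d6", "d8"], "d0"),  # known item not retrieved -> 0.0
]
print(mrr_at_k(runs))  # 0.5
```

A known item that falls outside the top 10 contributes 0, so a low MRR@10 can reflect either poor ranks or outright misses.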

물 인프라 지속가능성 지수 분석: 가중치 분석과 군집분석을 활용하여 (Analysis of Water Infrastructure Sustainability Index: Using Weighting and Cluster Analysis)

  • 류재나;강대운
    • 대한토목학회논문집 / Vol. 38, No. 3 / pp.417-428 / 2018
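  • The purpose of this study is to evaluate sustainability in economic, social, and environmental terms using indices that assess the sustainability of water infrastructure, centered on water supply and sewerage facilities, and to raise awareness of the need to secure that sustainability. We identified the sub-indices that deserve the most weight among the economic, social, and environmental indices, and grouped local governments nationwide into types to compare the characteristics of the groups. Principal component analysis was used to derive the sub-index weights, and K-means cluster analysis was used to classify the local governments. The weighting analysis showed that, of the 12 indices in total, financial independence, capital revenue ratio, subsidy ratio, service coverage rate, aging rate, aquatic ecosystem health, and river water quality were the main variables for assessing sustainability, with the economic indices having the greatest influence. The cluster analysis then divided local governments into two broad types, and the characteristics of each type were examined. The group with strong economic sustainability needed improvement in the environmental sector but was generally in a good state of sustainability. The group with strong environmental performance included many local governments in a poor state of sustainability and appears to need focused effort to improve its economic standing in particular.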

남성복(男性服) 브랜드이미지 인식(認識)에 관(關)한 연구(硏究) (A Study on the Perception of Men's Wear Brands)

  • 구인숙
    • 패션비즈니스 / Vol. 9, No. 5 / pp.1-14 / 2005
  • The purpose of this study was to analyze the perception of men's wear brands (Intermezzo and Rogatis) in order to develop niche-market opportunities and strategies in the men's wear market for apparel marketers and manufacturers. Data obtained from 312 respondents were analyzed using descriptive statistics and ANOVA. The results were as follows. In terms of perception of the two brand images, Intermezzo accounted for 79.8% of the frequencies and Rogatis for 99%. In the overall evaluation across 9 features, rated on 5-point Likert-type scales, Intermezzo had a mean of 3.86 and Rogatis a mean of 3.28. There were significant differences between the two Rogatis clusters: the purchasing cluster had a mean of 3.46 and the perceiving cluster a mean of 3.07. The brand images of Intermezzo and Rogatis were rated on 5-point Likert-type scales using 17 adjective pairs. As a result, the image of Intermezzo was considered more dynamic and trendy than that of Rogatis. The image of Intermezzo was characterized as urban, lively, chic, modern, and sophisticated, whereas the image of Rogatis was evaluated as mannish, urban, sophisticated, luxurious, and static.

앙상블 방법에 따른 WRF/CMAQ 수치 모의 결과 비교 연구 - 2013년 부산지역 고농도 PM10 사례 (A Comparison Study of Ensemble Approach Using WRF/CMAQ Model - The High PM10 Episode in Busan)

  • 김태희;김유근;손장호;정주희
    • 한국대기환경학회지 / Vol. 32, No. 5 / pp.513-525 / 2016
  • To propose an effective ensemble method for predicting $PM_{10}$ concentration, six experiments were designed using different ensemble averaging methods (e.g., non-weighted, single weighted, and cluster weighted methods). The single weighted method calculated the weights using both multiple regression analysis and singular value decomposition, and the cluster weighted method estimated the weights based on temperature, relative humidity, and wind component using multiple regression analysis. Weighted averaging performed significantly better than non-weighted averaging. The results of the weighted ensemble experiments differed according to the method used to calculate the weights. The single weighted average method using multiple regression analysis showed the highest accuracy for hourly $PM_{10}$ concentration, and the cluster weighted average method based on relative humidity showed the highest accuracy for daily mean $PM_{10}$ concentration. However, the ensemble spread analysis showed better reliability for the single weighted average method than for the cluster weighted average method based on relative humidity. Thus, the single weighted average method was the most effective method in this case study.

평균회귀 심박변이도의 K-평균 군집화 학습을 통한 심실조기수축 부정맥 신호의 특성분석 (Characterization of Premature Ventricular Contraction by K-Means Clustering Learning Algorithm with Mean-Reverting Heart Rate Variability Analysis)

  • 김정환;김동준;이정환;김경섭
    • 전기학회논문지 / Vol. 66, No. 7 / pp.1072-1077 / 2017
  • Mean-reverting analysis refers to a way of estimating the underlying tendency after new data have caused a variation in the equilibrium state. In this paper, we propose a new method to interpret the specular portraits of Premature Ventricular Contraction (PVC) arrhythmia by applying the K-means unsupervised learning algorithm to electrocardiogram (ECG) data. To this end, we applied a mean-reverting model to analyze Heart Rate Variability (HRV) in terms of a modified Poincaré plot, treating the PVC rhythm as the component that disrupts the homeostatic state. Based on experimental tests on the MIT-BIH ECG database, we find that the specular patterns portrayed by K-means clustering on mean-reverting HRV data become more clearly visible, and that the Euclidean metric can be used to identify the discrepancy between normal sinus rhythm and PVC beats through the relative distance among cluster centroids.

New classification of lingual arch form in normal occlusion using three dimensional virtual models

  • Park, Kyung Hee;Bayome, Mohamed;Park, Jae Hyun;Lee, Jeong Woo;Baek, Seung-Hak;Kook, Yoon-Ah
    • 대한치과교정학회지 / Vol. 45, No. 2 / pp.74-81 / 2015
  • Objective: The purposes of this study were 1) to classify lingual dental arch form types based on the lingual bracket points and 2) to provide a new lingual arch form template based on this classification for clinical application through the analysis of three-dimensional virtual models of a normal occlusion sample. Methods: Maxillary and mandibular casts of 115 young adults with normal occlusion were scanned in their occluded positions, and lingual bracket points were digitized on the virtual models by using Rapidform 2006 software. Sixty-eight cases (dataset 1) were used in K-means cluster analysis to classify arch forms with intercanine, interpremolar, and intermolar widths and width/depth ratios as determinants. The best-fit curves of the mean arch forms were generated. The remaining cases (dataset 2) were mapped into the obtained clusters, and a multivariate test was performed to assess the differences between the clusters. Results: The four-cluster classification demonstrated the maximum inter-cluster distance. Wide, narrow, tapering, and ovoid types were described according to the intercanine and intermolar widths, and their best-fit curves were depicted. No significant differences in arch depths existed among the clusters. Strong to moderate correlations were found between maxillary and mandibular arch widths. Conclusions: Lingual arch forms have been classified into 4 types based on their anterior and posterior dimensions. A template of the 4 arch forms has been depicted. Three-dimensional analysis of the lingual bracket points provides more accurate identification of arch form and, consequently, archwire selection.

국부 확률을 이용한 데이터 분류에 관한 연구 (A Study on Data Clustering Method Using Local Probability)

  • 손창호;최원호;이재국
    • 제어로봇시스템학회논문지 / Vol. 13, No. 1 / pp.46-51 / 2007
  • In this paper, we propose a new data clustering method using local probability and hypothesis theory. To cluster the test data set, we analyze its local area using the local probability distribution and decide the candidate class of the data using the mean, standard deviation, variance, etc. To decide the class of each test datum, statistical hypothesis theory is applied to the candidate class. For evaluation, the proposed classification method is compared with the conventional fuzzy c-means method, the k-means algorithm, and the Discriminator analysis algorithm. The simulation results show higher accuracy than those of the fuzzy c-means method, the k-means algorithm, and the Discriminator analysis algorithm.

패션 위험(危險) 지각(知覺)에 의한 패션 상품(商品) 분류(分類) (A Classify Fashion Goods by 'Fashion Risk Perception')

  • 김영란;유태순
    • 패션비즈니스 / Vol. 2, No. 2 / pp.37-45 / 1998
  • The purpose of this study is to survey and classify differences in perceived fashion risk according to the apparel and accessories that consumers purchased. A total of 243 undergraduates were separated into three groups and asked to rate 15 fashion risk concerns for each item on a 5-point scale. The three groups rated 103 items in total. Data were analyzed using mean, SD, ANOVA, factor analysis, cluster analysis, and Cronbach's $\alpha$ with the SAS program. The results showed high perceived risk for leather jackets, suits, long coats, and sunglasses. The most important factor in the perceived risk structure of fashion goods concerned the perception of others. Apparel and accessories that complete an outfit were classified into the same cluster. Consumers do not perceive fashion goods independently; rather, they attach importance to the combination with other items.

군집분석을 이용한 우리나라 가뭄특성의 공간적 분석 (Spatial Analysis of Drought Characteristics in Korea Using Cluster Analysis)

  • 유지영;최민하;김태웅
    • 한국수자원학회논문집 / Vol. 43, No. 1 / pp.15-24 / 2010
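  • Regional frequency analysis, which compensates for the shortcomings of at-site frequency analysis, is now often applied in practice when estimating probable rainfall, but most drought studies still analyze drought using at-site data. In this study, we delineated regions with homogeneous drought characteristics, as required for analyzing the regional characteristics of drought. Using 58 Korea Meteorological Administration rainfall stations with at least 30 years of rainfall records, we computed the Standardized Precipitation Index (SPI) and derived drought characteristic factors such as severity, duration, intensity, and frequency of occurrence. These factors provide important information for delineating hydrologically homogeneous regions. Making efficient use of the various drought characteristic factors, we performed cluster analysis with the K-means method and divided the country into six regions with homogeneous drought characteristics. This regionalization enables spatial interpretation of drought characteristics and also makes possible regional frequency analysis, which compensates for the shortcomings of at-site frequency analysis.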
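
The grouping of stations by drought characteristic factors described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the station names and factor values are invented, the z-score standardization step and the seeding with the first k points are simplifications not stated in the abstract, and the toy uses 6 stations and k = 2 rather than the 58 stations and 6 regions of the study.

```python
import math
import statistics


def standardize(rows):
    """Z-score each column so that no single factor dominates the distance."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) for c in cols]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in rows]


def kmeans(points, k, iters=100):
    """Lloyd's algorithm, seeded with the first k points (a simplification)."""
    centroids = list(points[:k])
    labels = None
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centroids[i] = tuple(statistics.fmean(c) for c in zip(*members))
    return labels, centroids


# Hypothetical stations: (severity, duration in months, number of events).
stations = {
    "A": (4.1, 5.0, 12),
    "B": (9.8, 11.0, 5),
    "C": (4.3, 5.5, 11),
    "D": (9.5, 10.5, 6),
    "E": (4.0, 4.8, 13),
    "F": (10.1, 11.5, 4),
}
labels, _ = kmeans(standardize(list(stations.values())), k=2)
print(dict(zip(stations, labels)))
# {'A': 0, 'B': 1, 'C': 0, 'D': 1, 'E': 0, 'F': 1}
```

On real data the result depends on the initial centroids, so repeated runs with different seeds are commonly compared before fixing the number of regions.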

러프 엔트로피를 이용한 범주형 데이터의 클러스터링 (Clustering of Categorical Data using Rough Entropy)

  • 박인규
    • 한국인터넷방송통신학회논문지 / Vol. 13, No. 5 / pp.183-188 / 2013
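  • Cluster analysis, which groups objects on the basis of similar features, is essential in data mining. However, for the categorical data contained in many databases, existing partitioning approaches are limited in handling the uncertainty between objects. In the partitioning of categorical data, the treatment of the uncertainty of equivalence classes arising from indiscernibility has been confined to the algebraic logic of rough sets, which reduces the stability and efficiency of the algorithms. In this paper, to take into account the dependency of attributes present in categorical data, we define rough entropy based on an information-theoretic measure and propose an algorithm called MMMR to extract the partitioning attribute. To analyze and compare the performance of the proposed method, we examine its comparative advantage over K-means, a fuzzy-based method, and an existing method using standard deviation, restricted to the ZOO data. Using the ZOO data, we also examine the comparative advantage over existing categorical algorithms and verify the efficiency of the proposed algorithm.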