• Title/Summary/Keyword: hierarchical clustering (계층 군집화)

Search Result 141, Processing Time 0.025 seconds

A change of the public's emotion depending on Temperature & Humidity index (온습도에 따른 대중의 감성(감정+감각) 활동 변화)

  • Yang, Junggi;Kim, Geunyoung;Lee, Youngho;Kang, Un-Gu
    • Journal of Digital Convergence / v.12 no.10 / pp.243-252 / 2014
  • Much research is in progress on how social media affects politics, economics, and sociocultural phenomena. The authors used NAVER Trend, from Korea's best-known web search service, the NAVER Blog social media service, the NAVER Cafe service, and Open Data (API), together with temperature and humidity index data from the Korea Meteorological Administration. This study analyzed changes in public emotion in Korea through cluster analysis of taste-related vocabulary, drawn from the vocabulary of feelings and senses. The number of groups was determined using a chi-square goodness-of-fit test and Ward's method, and K-means clustering was then applied. Eight groups of sensibility vocabulary were formed. Discriminant analysis showed that the eight groups obtained by cluster analysis were classified with 98.9% accuracy. Changes in public emotion can be used to predict people's activity, so that people can share sensibilities and develop a bond of sympathy.

Active Learning based on Hierarchical Clustering (계층적 군집화를 이용한 능동적 학습)

  • Woo, Hoyoung;Park, Cheong Hee
    • KIPS Transactions on Software and Data Engineering / v.2 no.10 / pp.705-712 / 2013
  • Active learning aims to improve the performance of a classification model by repeatedly selecting the most helpful unlabeled data, having an expert label it, and adding it to the training set. In this paper, we propose a method for active learning based on hierarchical agglomerative clustering using Ward's linkage. The proposed method constructs a training set actively so that it includes at least one sample from each cluster, and it expands the existing training set so that it reflects the overall data distribution. While most existing active learning methods assume that an initial training set is given, the proposed method is applicable whether or not initial training data is given. Experimental results show the superiority of the proposed method.
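
    As a rough illustration of the idea in this abstract, the sketch below clusters points with Ward's linkage and picks one sample per cluster to send for labeling. It is not the authors' implementation: the function names (`ward_clusters`, `pick_queries`), the rule of choosing the sample nearest each cluster centroid, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def centroid(points):
    """Mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist(centroid(a), centroid(b))


def ward_clusters(points, k):
    """Naive agglomerative clustering with Ward's criterion, down to k clusters."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Merge the pair whose union increases the variance the least.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters


def pick_queries(points, k):
    """Hypothetical selection rule: one sample per cluster, nearest its centroid."""
    return [
        min(c, key=lambda p: sq_dist(p, centroid(c)))
        for c in ward_clusters(points, k)
    ]


# Toy unlabeled data (assumed for illustration only).
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (9.0, 0.0)]
clusters = ward_clusters(points, 3)
queries = pick_queries(points, 3)
```

    Because every cluster contributes a sample, the labeled set covers each region of the data even when no initial training set exists, which is the property the abstract emphasizes.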

A Study on Fuzzy Logic based Clustering Method for Radar Data Analysis (레이더 데이터 분석을 위한 Fuzzy Logic 기반 클러스터링 기법에 관한 연구)

  • Lee, Hansoo;Kim, Eun Kyeong;Kim, Sungshin
    • Journal of the Korean Institute of Intelligent Systems / v.25 no.3 / pp.217-222 / 2015
  • Clustering is an important data mining technique for exploratory data analysis and is applied in various engineering and scientific fields such as pattern recognition and remote sensing. It organizes data by abstracting the underlying structure either as a grouping of individuals or as a hierarchy of groups. Weather radar observes atmospheric objects using reflected signals and stores the observed data at the corresponding coordinates. To analyze radar data, precipitation and non-precipitation echoes need to be organized separately based on their similarities. This paper therefore studies the application of a clustering method to radar data. In addition, to handle cases where a precipitation echo is located close to a non-precipitation echo, the paper suggests a fuzzy logic based clustering method that considers both distance and other properties such as reflectivity and Doppler velocity. On actual cases, the suggested method gives better results than the previous method when precipitation and non-precipitation echoes are located near each other.

Development of a Method for Analyzing and Visualizing Concept Hierarchies based on Relational Attributes and its Application on Public Open Datasets

  • Hwang, Suk-Hyung
    • Journal of the Korea Society of Computer and Information / v.26 no.9 / pp.13-25 / 2021
  • In the age of digital innovation based on the Internet, information and communication, and artificial intelligence technologies, huge amounts of data are being generated, collected, accumulated, and opened on the web by public institutions that provide useful public information. To analyze data and gain useful insights from it, Formal Concept Analysis (FCA) has been used successfully for analyzing, classifying, clustering, and visualizing data based on the binary relation between objects and attributes in a dataset. In this paper, we present an approach for enhancing the analysis of relational attributes of data within the extended framework of FCA, which is designed to classify, conceptualize, and visualize sets of objects described not only by attributes but also by relations between these objects. Several experiments carried out on public open datasets with the proposed tool, RCA wizard, demonstrate the validity and usability of our approach for generating and visualizing conceptual hierarchies that extract more useful knowledge from datasets. The proposed approach can serve as a useful tool for effective data analysis, classification, clustering, visualization, and exploration.

The Comparison of Neural Network and k-NN Algorithm for News Article Classification (신경망 또는 k-NN에 의한 신문 기사 분류와 그의 성능 비교)

  • 조태호
    • Proceedings of the Korean Information Science Society Conference / 1998.10c / pp.363-365 / 1998
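  • Text mining is the process of extracting patterns or relationships from documents in text form in order to produce new information that the user wants or to transform existing information. The functions of text mining include document categorization, document clustering, and document summarization. Document categorization assigns predefined categories to documents, document clustering organizes documents into a hierarchical structure, and document summarization extracts only the part of a document that can represent its overall content. This paper deals only with document categorization, and its target is newspaper articles. Four categories are defined: politics, economy, sports, and information and communications. Document categorization, also called document classification, aims to assign categories to documents automatically and thereby reduce the time and cost of assigning them manually. k-NN (k-Nearest Neighbor) and a neural network were applied to document categorization, and the neural network outperformed k-NN.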

Proposal of Cluster Head Election Method in K-means Clustering based WSN (K-평균 군집화 기반 WSN에서 클러스터 헤드 선택 방법 제안)

  • Yun, Dai Yeol;Park, SeaYoung;Hwang, Chi-Gon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.447-449 / 2021
  • Various wireless sensor network protocols have been proposed to keep the network alive for a long time by minimizing energy consumption. The K-means clustering algorithm takes longer to cluster than traditional hierarchical algorithms because the center point must be moved repeatedly until the final clusters are established. In K-means clustering-based protocols, only the residual energy of nodes, or only the nodes near the center point of the cluster, is considered when the cluster head is elected. In this paper, we propose a new wireless sensor network protocol based on K-means clustering that improves energy efficiency while addressing the aforementioned problems.
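
    The repeated center movement mentioned in this abstract can be sketched as a plain K-means loop, followed by the baseline election rule the abstract describes (the node nearest the cluster center becomes the head). This is a minimal sketch under assumptions, not the proposed protocol: the function names, the node coordinates, and the choice of initial centers are hypothetical, and residual energy is not modeled.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D positions."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(nodes, centers, max_iter=100):
    """Plain K-means: reassign nodes and move centers until nothing changes."""
    for iteration in range(1, max_iter + 1):
        # Assignment step: each node joins its nearest center.
        labels = [
            min(range(len(centers)), key=lambda c: sq_dist(n, centers[c]))
            for n in nodes
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(len(centers)):
            members = [n for n, lab in zip(nodes, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(2))
                )
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            return centers, labels, iteration
        centers = new_centers
    return centers, labels, max_iter


def elect_heads(nodes, centers, labels):
    """Baseline rule from the abstract: the node nearest each center is the head."""
    heads = []
    for c, center in enumerate(centers):
        members = [n for n, lab in zip(nodes, labels) if lab == c]
        heads.append(min(members, key=lambda n: sq_dist(n, center)))
    return heads


# Hypothetical sensor node positions and initial centers.
nodes = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centers, labels, iterations = kmeans(nodes, [nodes[0], nodes[1]])
heads = elect_heads(nodes, centers, labels)
```

    Even on this tiny layout the loop needs more than one pass before the centers stop moving, which is the clustering overhead the abstract refers to.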

A Fusion of the Period Characterized and Hierarchical Bayesian Techniques for Efficient Cluster Analysis of Time Series Data (시계열자료의 효율적 군집분석을 위한 구간특징화와 계층적 베이지안 기법의 융합)

  • Jung, Young-Ae;Jeon, Jin-Ho
    • Journal of Digital Convergence / v.13 no.7 / pp.169-175 / 2015
  • An effective way to understand and evaluate dynamic time series data that evolve over time is to establish a model that analyzes the behavior of the system. For model determination, rather than examining all of the collected time series data of the population at once, it is more efficient to cluster the data into a certain number of sub-groups and to understand the overall data by determining a model specific to each cluster. This study proposes a two-stage process, first sub-grouping the population and then determining a model for each cluster, in which period characterization of the sub-populations is fused with a heuristic Bayesian clustering technique to reduce computation time and cost. Its effectiveness was confirmed by experiments.

A Movie Recommendation System based on Fuzzy-AHP with User Preference and Partition Algorithm (사용자 선호도와 군집 알고리즘을 이용한 퍼지-계층적 분석 기법 기반 영화 추천 시스템)

  • Oh, Jae-Taek;Lee, Sang-Yong
    • Journal of Digital Convergence / v.15 no.11 / pp.425-432 / 2017
  • Current recommendation systems have several problems: it is difficult to tell whether they recommend items that users actually prefer or items in which users merely have a passing interest; data for recommending suitable items is scarce when the number of users is extremely small; and under the cold-start problem, the system's ability to recommend items that satisfy users drops as new users arrive. To address these problems, this study implemented a movie recommendation system that aims to ensure user satisfaction by using the Fuzzy-Analytic Hierarchy Process (Fuzzy-AHP), which can reflect uncertain situations and problems, and a data partition algorithm that groups similar items among the given ones. Data from a survey on movie preference with 61 users was applied to the system, and the results show that it solved the data scarcity problem through the Fuzzy-AHP and, with the data partition algorithm, recommended items that fit a user even as new users arrived. Further research on density-based clustering is thought to be needed to filter out noise and outlier data.

Proximate Word Filtering by Hierarchical Clustering (계층적 군집화를 이용한 근사 단어 필터링 기법)

  • Kim, Sung-Hwan;Cho, Hwan-Gue
    • Proceedings of the Korea Information Processing Society Conference / 2012.04a / pp.1101-1104 / 2012
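  • Word filtering is a basic function for blocking harmful information. However, malicious users deliberately alter prohibited words to bypass the filtering system. Proximate word filtering counters this by performing filtering while tolerating a certain amount of error. String indexing methods for searching proximate words mainly use a mapping into Euclidean space based on pivot words, but this has a fundamental structural limitation when applied to word filtering. In this paper, we perform clustering within the set of words to be filtered to build a hierarchical data structure, define a filtering query for word filtering, and describe how a search suited to that query is applied. Experiments show a 16.9% to 26.6% improvement in search speed over the existing pivot-based indexing method.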

Word Sense Disambiguation Using Korean Word Definition Vectors (한국어 단어 정의 벡터를 이용한 단어 의미 모호성 해소)

  • Park, Jeong Yeon;Lee, Jae Sung
    • Annual Conference on Human and Language Technology / 2021.10a / pp.195-198 / 2021
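  • According to previous research, word sense disambiguation performance improves when sense vocabulary tags compressed on the basis of the hierarchical relations of a thesaurus are used. In this paper, we propose a method that, without using a thesaurus, creates compressed sense vocabulary tags by clustering the sense definitions of words contained in a Korean dictionary. We also propose a BERT-based deep learning model that uses these tags to disambiguate word senses efficiently. In experiments on the Korean Sejong sense-tagged corpus, the proposed method achieved an F1 of 97.21%, an improvement of 1.63 percentage points over the 95.58% F1 of the existing method.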