• Title/Abstract/Keywords: Objective clustering

Search results: 224 (processing time: 0.025 s)

Combining Distributed Word Representation and Document Distance for Short Text Document Clustering

  • Kongwudhikunakorn, Supavit;Waiyamai, Kitsana
    • Journal of Information Processing Systems / Vol. 16, No. 2 / pp.277-300 / 2020
  • This paper presents a method for clustering short text documents, such as news headlines, social media statuses, or instant messages. Due to the characteristics of these documents, which are usually short and sparse, an appropriate technique is required to discover hidden knowledge. The objective of this paper is to identify the combination of document representation, document distance, and document clustering that yields the best clustering quality. Document representations are expanded by external knowledge sources represented by a Distributed Representation. To cluster documents, a K-means partitioning-based clustering technique is applied, where the similarities of documents are measured by word mover's distance. To validate the effectiveness of the proposed method, experiments were conducted to compare the clustering quality against several leading methods. The proposed method produced clusters of documents that resulted in higher precision, recall, F1-score, and adjusted Rand index for both real-world and standard data sets. Furthermore, manual inspection of the clustering results was conducted to observe the efficacy of the proposed method. The topics of each document cluster are undoubtedly reflected by members in the cluster.
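The abstract above reports the adjusted Rand index (ARI) as one of its clustering quality measures. As a minimal sketch of how that index is computed from two labelings, the following uses the standard pair-counting formula; the function name and the toy labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # zero or one item: the labelings agree trivially

    # Contingency counts: items that share a (true, predicted) label pair.
    cell_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of item pairs placed together in both, in true, and in pred.
    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / total_pairs  # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings use a single cluster
    return (sum_cells - expected) / (max_index - expected)
```

The index is invariant to how the clusters are named, so `adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])` is 1.0, while labelings that agree no better than chance score near 0.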

Torque Control of Brushless DC Motor Using a Clustering Adaptive Fuzzy Logic Controller

  • 권정진;한우용;이창구;김성중
    • Institute of Control, Robotics and Systems: Conference Proceedings / Proceedings of the 15th Conference of the Institute of Control, Robotics and Systems, 2000 / pp.349-349 / 2000
  • A Clustering Adaptive Fuzzy Logic Controller (CAFLC) is applied to the torque control of a brushless DC motor drive. The objectives of this system include eliminating the torque ripple caused by cogging at low speeds under load. The implemented CAFLC has the advantages of computational simplicity and self-tuning characteristics. Simulation results showed that the torque ripple and dynamic response of the system using the CAFLC were superior to those of the model reference adaptive controlled system.

A k-means++ Algorithm for Internet Shopping Search Engine

  • Jian-Ji Ren;Jae-kee Lee
    • Korea Information Processing Society: Conference Proceedings / Korea Information Processing Society 2008 Fall Conference / pp.75-77 / 2008
  • Nowadays, as the indices of the major search engines grow to tremendous proportions, vertical search services can help customers find what they need. The search engine is one of the reasons for the success of Internet shopping in today's world. An important part of a search engine is data clustering. The objective of this paper is to explore a k-means++ algorithm for clustering data in the Internet shopping environment. The experimental results show that the k-means++ algorithm is a faster way to achieve a good clustering.
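What distinguishes k-means++ from plain k-means is its seeding step: each new initial center is drawn with probability proportional to its squared distance from the nearest center already chosen. A minimal sketch of that seeding follows, assuming points are tuples of floats; the function names and the toy data in the usage note are hypothetical and not taken from the paper.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_seeds(points, k, rng=None):
    """Pick k initial centers from points using k-means++ seeding."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = rng or random.Random(0)

    # First center: chosen uniformly at random.
    centers = [rng.choice(points)]
    # nearest_d2[i]: squared distance from points[i] to its closest center.
    nearest_d2 = [squared_distance(p, centers[0]) for p in points]

    while len(centers) < k:
        if sum(nearest_d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            new_center = rng.choice(points)
        else:
            # Far-away points are proportionally more likely to be picked.
            new_center = rng.choices(points, weights=nearest_d2, k=1)[0]
        centers.append(new_center)
        nearest_d2 = [
            min(d2, squared_distance(p, new_center))
            for d2, p in zip(nearest_d2, points)
        ]
    return centers
```

For example, `kmeanspp_seeds([(0.0, 0.0), (0.1, 0.0), (10.0, 10.0), (10.1, 10.0)], 2)` returns two of the input points, most likely one from each well-separated group. The seeds would then be passed to ordinary k-means iterations.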

Interference-free Clustering Protocol for Large-Scale and Dense Wireless Sensor Networks

  • Chen, Zhihong;Lin, Hai;Wang, Lusheng;Zhao, Bo
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 13, No. 3 / pp.1238-1259 / 2019
  • Saving energy is a major challenge for Wireless Sensor Networks (WSNs), and it becomes even more critical in large-scale WSNs. Most energy waste is communication related, such as collisions, overhearing, and idle listening, so schedule-based access, which can avoid this waste, is preferred for WSNs. On the other hand, clustering is considered the most promising solution for topology management in WSNs. Hence, providing interference-free clustering is vital for WSNs, especially large-scale ones. However, schedule management in cluster-based networks is never trivial, since it requires inter-cluster cooperation. In this paper, we propose a clustering method, called Interference-Free Clustering Protocol (IFCP), to partition a WSN into interference-free clusters, making timeslot management much easier to achieve. Moreover, we model the clustering problem as a multi-objective optimization problem and use the non-dominated sorting genetic algorithm II (NSGA-II) to solve it. Our proposal is finally compared with two adaptive clustering methods, HEED-CSMA and HEED-BMA, demonstrating that it achieves good performance in terms of delay, packet delivery ratio, and energy consumption.
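The core idea behind non-dominated sorting, which NSGA-II uses to rank candidate solutions, is to peel a population into successive Pareto fronts. The sketch below is a simple quadratic peeling version and not the faster bookkeeping variant used in NSGA-II itself. It assumes every objective is to be minimized, and the objective tuples are hypothetical rather than the paper's actual objectives.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in one.

    All objectives are assumed to be minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(
        x < y for x, y in zip(a, b)
    )


def non_dominated_sort(objectives):
    """Split solutions (by index) into successive Pareto fronts."""
    remaining = set(range(len(objectives)))
    fronts = []
    while remaining:
        # A solution joins the current front if nothing left dominates it.
        front = sorted(
            i
            for i in remaining
            if not any(
                dominates(objectives[j], objectives[i])
                for j in remaining
                if j != i
            )
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts
```

With the hypothetical two-objective values `[(1, 5), (2, 2), (5, 1), (3, 3), (6, 6)]`, the first three solutions trade off against each other and form the first front, `(3, 3)` forms the second, and `(6, 6)` the third.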

A Clustering-based Semi-Supervised Learning through Initial Prediction of Unlabeled Data

  • 김응구;전치혁
    • Journal of the Korean Operations Research and Management Science Society / Vol. 33, No. 3 / pp.93-105 / 2008
  • Semi-supervised learning uses a small amount of labeled data to predict the labels of unlabeled data as well as to improve clustering performance, whereas unsupervised learning analyzes only unlabeled data for clustering. We propose a new clustering-based semi-supervised learning method that reflects the initial predicted labels of unlabeled data in the objective function. The initial prediction should be expressed as a discrete probability distribution obtained by a classification method trained on the labeled data. As a result, clusters are formed and the labels of unlabeled data are predicted according to the information of the labeled data in the same cluster. We evaluate and compare the performance of the proposed method in terms of classification errors through numerical experiments with blinded labeled data.

Intelligent Clustering in Vehicular ad hoc Networks

  • Aadil, Farhan;Khan, Salabat;Bajwa, Khalid Bashir;Khan, Muhammad Fahad;Ali, Asad
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 10, No. 8 / pp.3512-3528 / 2016
  • A vehicular ad hoc network (VANET) is a network of highly mobile nodes, or vehicles. Many techniques have been proposed to improve the communication efficiency of VANETs; one of these is vehicular node clustering. Cluster nodes (CNs) and cluster heads (CHs) are elected or selected in the process of clustering. Longer cluster lifetimes and fewer CHs contribute to efficient networking in VANETs. In this paper, a novel clustering algorithm for VANETs based on Ant Colony Optimization (ACO), named ACONET, is proposed. This algorithm forms optimized clusters to offer robust communication for VANETs. For optimized clustering, the transmission range, direction, and speed of the nodes and the load balance factor (LBF) are considered. ACONET is compared empirically with state-of-the-art methods, including clustering techniques based on Multi-Objective Particle Swarm Optimization (MOPSO) and Comprehensive Learning Particle Swarm Optimization (CLPSO). An extensive set of experiments is performed by varying the grid size of the network, the transmission range of the nodes, and the total number of nodes in the network to evaluate the effectiveness of the algorithms under comparison. The results indicate that ACONET significantly outperforms its competitors.

Black-Litterman Portfolio with K-shape Clustering

  • 김예지;조풍진
    • Journal of Society of Korea Industrial and Systems Engineering / Vol. 46, No. 4 / pp.63-73 / 2023
  • This study explores modern portfolio theory by integrating the Black-Litterman portfolio with time-series clustering, specifically emphasizing the K-shape clustering methodology. K-shape clustering groups time-series data effectively, enhancing the ability to plan and manage investments in stock markets when combined with the Black-Litterman portfolio. Based on the patterns of stock markets, the objective is to understand, through backtesting, the relationship between past market data and the planning of future investment strategies. Additionally, by examining diverse learning and investment periods, optimal strategies are identified that boost portfolio returns while efficiently managing the associated risks. For comparative analysis, the traditional Markowitz portfolio is also assessed in conjunction with clustering techniques using K-Means and K-Means with Dynamic Time Warping. The results suggest that the combination of K-shape and the Black-Litterman model significantly enhances portfolio optimization in the stock market, providing valuable insights for making stable portfolio investment decisions. The achieved Sharpe ratio of 0.722 indicates significantly higher performance than the other benchmarks, underlining the effectiveness of the K-shape and Black-Litterman integration in portfolio optimization.
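For readers unfamiliar with the Sharpe ratio quoted above, it is the mean excess return divided by the standard deviation of excess returns. The sketch below computes a per-period ratio under stated assumptions: it uses the sample standard deviation, a constant per-period risk-free rate, and no annualization. The abstract does not state which conventions the authors used, so this is not a reproduction of the reported 0.722.

```python
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free_rate=0.0):
    """Per-period Sharpe ratio of a series of periodic returns.

    risk_free_rate is the per-period risk-free return (assumed constant).
    """
    if len(returns) < 2:
        raise ValueError("need at least two returns")
    excess = [r - risk_free_rate for r in returns]
    volatility = stdev(excess)  # sample standard deviation
    if volatility == 0:
        raise ValueError("returns have zero variance")
    return mean(excess) / volatility
```

A higher value means more excess return per unit of volatility, which is why the abstract uses it to compare the clustering-based portfolios against benchmarks.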

A Study on the Modified FCM Algorithm using Intracluster

  • 안강식;조석제
    • The KIPS Transactions: Part B / Vol. 9B, No. 2 / pp.202-214 / 2002
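  • This paper proposes a modified FCM algorithm to address the problems of the FCM algorithm and of the fuzzy clustering algorithm that uses the average intracluster distance. By using intraclusters, the modified FCM algorithm can assign a consistent degree of membership to small clusters even when cluster sizes differ. Because an objective function suited to this is designed and verified before being used for data classification, the convergence problem of the objective function can be overcome. The algorithm therefore resolves both the problem that arises in the FCM algorithm when cluster sizes differ and the convergence problem of the objective function in the fuzzy clustering algorithm that uses the average intracluster distance. To verify the proposed algorithm, its data classification results were compared with those of the FCM algorithm and of the fuzzy clustering algorithm that uses the average intracluster distance. The experiments showed that, in terms of classification entropy, the proposed algorithm gives better results than the existing algorithms.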
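As background for the abstract above, the two update rules of standard fuzzy c-means (FCM) can be sketched as follows. This is the textbook baseline only, not the modified algorithm the paper proposes; the function names, the fuzzifier default `m=2.0`, and the toy data are assumptions made for illustration.

```python
from math import dist


def fcm_memberships(points, centers, m=2.0):
    """Standard FCM membership update: one row of memberships per point."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        distances = [dist(p, c) for c in centers]
        if any(d == 0 for d in distances):
            # Point sits on a center: share full membership among those.
            hits = [d == 0 for d in distances]
            row = [1.0 / sum(hits) if h else 0.0 for h in hits]
        else:
            row = [
                1.0 / sum((d_i / d_j) ** exponent for d_j in distances)
                for d_i in distances
            ]
        memberships.append(row)
    return memberships


def fcm_centers(points, memberships, m=2.0):
    """Standard FCM center update: membership-weighted means of the points."""
    centers = []
    for j in range(len(memberships[0])):
        weights = [row[j] ** m for row in memberships]
        total = sum(weights)
        if total == 0:
            raise ValueError("cluster has no membership mass")
        centers.append(
            tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(len(points[0]))
            )
        )
    return centers
```

Alternating the two updates until the centers stop moving gives the usual FCM loop. Each row of memberships sums to 1, so a point equidistant from two centers receives 0.5 for each.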

A practical application of cluster analysis using SPSS

  • Kim, Dae-Hak
    • Journal of the Korean Data and Information Science Society / Vol. 20, No. 6 / pp.1207-1212 / 2009
  • The basic objective of cluster analysis is to discover natural groupings of items or variables. In general, clustering is conducted based on some similarity (or dissimilarity) matrix or on the original input text data. Various measures of similarity (or dissimilarity) between objects (or variables) have been developed. We introduce a real application of the clustering procedure in SPSS when only the distance matrix of the objects (or variables) is given as input data. This is particularly helpful for the cluster analysis of huge data sets, where the size of the proximity matrix exceeds 1000. The syntax commands for matrix input data in SPSS clustering are given with numerical examples.

Discovering Community Interests Approach to Topic Model with Time Factor and Clustering Methods

  • Ho, Thanh;Thanh, Tran Duy
    • Journal of Information Processing Systems / Vol. 17, No. 1 / pp.163-177 / 2021
  • Many methods of discovering social networking communities or clustering features are based on the network structure or the content network. This paper proposes a community discovery method based on topic models using a time factor and an unsupervised clustering method. Online community discovery enables organizations and businesses to thoroughly understand the trends in users' interests in their products and services. In addition, insight into customer experience on social networks is a tremendous competitive advantage in this era of e-commerce and Internet development. The objective of this work is to find clusters (communities) such that each cluster's nodes contain topics and individuals with similarities in the attribute space. In terms of social media analytics, the method seeks communities whose members have similar features. The method is tested and evaluated using a Vietnamese corpus of comments and messages collected on social networks and e-commerce sites in various sectors from 2016 to 2019. The experimental results demonstrate the effectiveness of the proposed method over other methods.