• Title/Summary/Keyword: Post Clustering

A Fusion of Data Mining Techniques for Predicting Movement of Mobile Users

  • Duong, Thuy Van T.;Tran, Dinh Que
    • Journal of Communications and Networks / v.17 no.6 / pp.568-581 / 2015
  • Predicting the locations of users with portable devices such as IP phones, smartphones, iPads and iPods in public wireless local area networks (WLANs) plays a crucial role in location management and network resource allocation. Many machine learning and data mining techniques, such as sequential pattern mining and clustering, have been widely used for this task. However, these approaches have two deficiencies. First, because a sequential pattern technique is based on profiles of individual mobility behavior, it may fail to predict the movement of new users or of users moving along novel paths. Second, using similar mobility behaviors in a cluster to predict the movement of users may significantly degrade accuracy because regular movement and random movement are indistinguishable. In this paper, we propose a novel fusion technique that utilizes mobility rules discovered from multiple similar users by combining clustering and sequential pattern mining. The proposed technique, with two algorithms named clustering-based-sequential-pattern-mining (CSPM) and sequential-pattern-mining-based-clustering (SPMC), can deal with the lack of information in a personal profile and avoid some of the noise caused by users' random movements. Experimental results show that our approach outperforms existing approaches in terms of efficiency and prediction accuracy.
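
The idea of deriving mobility rules from several similar users can be illustrated with a minimal sketch. It is not the CSPM or SPMC algorithm from the paper: it only counts location-to-next-location transitions across a group of trajectories that are assumed to belong to one cluster, keeps those that meet a minimum support, and predicts the most frequent successor. The access-point names, the `min_support` parameter, and both function names are hypothetical.

```python
from collections import Counter, defaultdict


def mine_transition_rules(trajectories, min_support=2):
    """Count current -> next location transitions over all trajectories and
    keep only those observed at least `min_support` times."""
    counts = Counter()
    for path in trajectories:
        for current, nxt in zip(path, path[1:]):
            counts[(current, nxt)] += 1
    rules = defaultdict(dict)
    for (current, nxt), n in counts.items():
        if n >= min_support:
            rules[current][nxt] = n
    return dict(rules)


def predict_next(rules, current):
    """Return the most frequently observed next location, or None if no rule applies."""
    candidates = rules.get(current)
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Hypothetical access-point sequences for users assumed to share a cluster.
cluster_trajectories = [
    ["AP1", "AP2", "AP3"],
    ["AP1", "AP2", "AP4"],
    ["AP5", "AP2", "AP3"],
]
rules = mine_transition_rules(cluster_trajectories, min_support=2)
print(predict_next(rules, "AP2"))
```

Because the rules come from the whole group, a user with no personal history can still receive a prediction, and one-off transitions that fall below the support threshold are ignored as noise.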

Post Clustering Method using Tag Hierarchy for Blog Search (블로그 검색에서의 태그 계층구조를 이용한 포스트 군집화)

  • Lee, Ki-Jun;Kim, Kyung-Min;Lee, Myung-Jin;Kim, Woo-Ju;Hong, June-S.
    • The Journal of Society for e-Business Studies / v.16 no.4 / pp.301-319 / 2011
  • Blogs play an important role as a new type of knowledge base, distinct from traditional web resources. While the information resources of conventional websites cover a wide range of topics, those of a blog are concentrated in specific units of information that depend on the user's interests, and tagging provides classification criteria for the published resources. In this research, we build a tag hierarchy using the title keywords and tags of blog posts, and propose a post clustering methodology that applies the tag hierarchy. We generate a tag hierarchy that reflects the relationships between tags and develop a tag clustering methodology based on tag similarity. We analyze the applicability of the proposed methodology with real-world examples and evaluate its performance through a prototype system.

Lung Function Trajectory Types in Never-Smoking Adults With Asthma: Clinical Features and Inflammatory Patterns

  • Kim, Joo-Hee;Chang, Hun Soo;Shin, Seung Woo;Baek, Dong Gyu;Son, Ji-Hye;Park, Choon-Sik;Park, Jong-Sook
    • Allergy, Asthma & Immunology Research / v.10 no.6 / pp.614-627 / 2018
  • Purpose: Asthma is a heterogeneous disease that responds to medications to varying degrees. Cluster analyses have identified several phenotypes and variables related to fixed airway obstruction; however, few longitudinal studies of lung function have been performed on adult asthmatics. We investigated clinical, demographic, and inflammatory factors related to persistent airflow limitation based on lung function trajectories over 1 year. Methods: Serial post-bronchodilator forced expiratory volume in 1 second (FEV1) % values were obtained from 1,679 asthmatics who were followed up every 3 months for 1 year. First, a hierarchical cluster analysis was performed using Ward's method to generate a dendrogram for the optimum number of clusters using the complete post-FEV1 sets from 448 subjects. Then, a trajectory cluster analysis of serial post-FEV1 sets was performed using the k-means clustering for the longitudinal data trajectory method. Next, trajectory clustering for the serial post-FEV1 sets of a total of 1,679 asthmatics was performed after imputation of missing post-FEV1 values using regression methods. Results: Trajectories 1 and 2 were associated with normal lung function during the study period, and trajectory 3 was associated with a reversal to normal of the moderately decreased baseline FEV1 within 3 months. Trajectories 4 and 5 were associated with severe asthma with a marked reduction in baseline FEV1. However, the FEV1 associated with trajectory 4 was increased at 3 months, whereas the FEV1 associated with trajectory 5 was persistently disturbed over 1 year. Compared with trajectory 4, trajectory 5 was associated with older asthmatics with less atopy, a lower immunoglobulin E (IgE) level, sputum neutrophilia and higher dosages of oral steroids. In contrast, trajectory 4 was associated with higher sputum and blood eosinophil counts and more frequent exacerbations.
Conclusions: Trajectory clustering analysis of FEV1 identified 5 distinct types, representing well-preserved to severely decreased FEV1. Persistent airflow obstruction may be related to non-atopy, a low IgE level, and older age accompanied by neutrophilic inflammation and low baseline FEV1 levels.
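
The trajectory clustering step can be pictured with a small sketch. It is a plain k-means on equal-length series with Euclidean distance and fixed starting centroids, not the study's procedure: the Ward's-method step for choosing the number of clusters and the imputation of missing values are left out, and all FEV1 values below are invented for illustration.

```python
def kmeans_trajectories(trajectories, centroids, iterations=10):
    """Plain k-means (squared Euclidean distance) on equal-length trajectories,
    starting from the given centroids so the result is deterministic."""
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each trajectory goes to its nearest centroid.
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: sum((x - c) ** 2 for x, c in zip(t, centroids[k])),
            )
            for t in trajectories
        ]
        # Update step: each centroid becomes the pointwise mean of its members.
        for k in range(len(centroids)):
            members = [t for t, lab in zip(trajectories, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical FEV1 % values at baseline and at 3, 6, 9 and 12 months.
fev1 = [
    [95, 96, 94, 95, 96],  # preserved throughout
    [92, 93, 95, 94, 93],
    [60, 85, 88, 90, 89],  # low baseline, recovers by 3 months
    [58, 82, 86, 87, 88],
    [55, 57, 56, 58, 55],  # persistently low
    [52, 54, 55, 53, 54],
]
labels, centers = kmeans_trajectories(fev1, [fev1[0], fev1[2], fev1[4]])
print(labels)
```

Treating each subject's whole series as one point is what lets the clustering separate a low baseline that recovers from one that stays low, which a single baseline measurement cannot do.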

A performance improvement methodology of web document clustering using FDC-TCT (FDC-TCT를 이용한 웹 문서 클러스터링 성능 개선 기법)

  • Ko, Suc-Bum;Youn, Sung-Dae
    • The KIPS Transactions:PartD / v.12D no.4 s.100 / pp.637-646 / 2005
  • Applying classification or clustering algorithms to post-process the documents returned by a keyword web search raises various problems, two of which are severe. The first is that categorizing the documents requires the help of an expert. The second is the long processing time that document classification takes. We therefore propose a new web document clustering method that uses the Transitive Closure Tree (TCT) to dramatically decrease the number of document similarity calculations and to speed up processing without losing precision. We also compare the effectiveness of the proposed method with that of existing algorithms and present the experimental results.
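
The abstract does not describe how the TCT is built, so the following sketch uses a different, well-known structure (a disjoint-set, or union-find) purely to illustrate how transitivity can save similarity calculations: once two documents are known to be in the same group, their pairwise similarity is never computed. The Jaccard measure, the threshold, and the sample documents are all assumptions, and this is not the FDC-TCT method.

```python
def jaccard(a, b):
    """Jaccard similarity between two term sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_with_transitivity(docs, threshold=0.5):
    """Single-link style grouping that skips pairs already linked transitively.
    Returns the group label of each document and the number of similarity
    computations actually performed."""
    parent = list(range(len(docs)))

    def find(i):
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    comparisons = 0
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if find(i) == find(j):
                continue  # already in the same group: no similarity needed
            comparisons += 1
            if jaccard(docs[i], docs[j]) >= threshold:
                parent[find(j)] = find(i)
    return [find(i) for i in range(len(docs))], comparisons


# Hypothetical term sets extracted from four search results.
docs = [
    {"web", "search", "cluster"},
    {"web", "search", "cluster", "tree"},
    {"web", "search", "cluster", "tree", "closure"},
    {"stock", "market"},
]
groups, comparisons = cluster_with_transitivity(docs)
print(groups, comparisons)
```

Here five of the six possible pairs are compared; the saving grows with the size of the groups, since every pair inside an already-formed group is skipped.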

Regime-dependent Characteristics of KOSPI Return

  • Kim, Woohwan;Bang, Seungbeom
    • Communications for Statistical Applications and Methods / v.21 no.6 / pp.501-512 / 2014
  • The stylized facts of asset returns include fat tails, asymmetry, volatility clustering and structural changes. This paper captures these characteristics simultaneously by introducing multi-regime models: a finite mixture distribution and a regime-switching GARCH model. Analyzing the daily KOSPI return from 4 January 2000 to 30 June 2014, we find that a two-component mixture of t distributions is a good candidate for describing the shape of the KOSPI return from both unconditional and conditional perspectives. Empirical results suggest that assuming equal shape parameters for the t distributions yields better discrimination of the heterogeneous components in the return data. By employing a regime-switching GJR-GARCH model with t innovations, we report strong regime-dependent characteristics in volatility dynamics, with high persistence and asymmetry. Comparing two sub-samples, Pre-Crisis (January 2003 to December 2007) and Post-Crisis (January 2010 to June 2014), we find that the degree of persistence is higher in the Pre-Crisis period than in the Post-Crisis period, along with a strong asymmetry in the low-volatility (high-volatility) regime during the Pre-Crisis (Post-Crisis) period.
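
The asymmetry that a GJR-GARCH model captures is easiest to see in its variance recursion. The sketch below implements the standard single-regime GJR-GARCH(1,1) update, in which a negative shock adds an extra `gamma` term to the next period's variance. It leaves out the regime switching and the t-distributed innovations used in the paper, and the parameter values are illustrative rather than estimates for the KOSPI.

```python
def gjr_garch_variance(shocks, omega, alpha, gamma, beta, initial_variance):
    """Conditional variances from the GJR-GARCH(1,1) recursion
        var[t] = omega + (alpha + gamma * 1[shock[t-1] < 0]) * shock[t-1]**2
                 + beta * var[t-1]
    `shocks` are zero-mean return innovations; var[0] is `initial_variance`."""
    variances = [initial_variance]
    for shock in shocks[:-1]:
        leverage = gamma if shock < 0 else 0.0  # extra weight on bad news
        variances.append(
            omega + (alpha + leverage) * shock * shock + beta * variances[-1]
        )
    return variances


# Illustrative parameters (assumed, not estimated from data).
params = dict(omega=0.1, alpha=0.1, gamma=0.2, beta=0.5, initial_variance=1.0)
variances = gjr_garch_variance([1.0, -1.0, 0.0], **params)
print(variances)
```

With these numbers a positive shock of size 1 contributes 0.1 to the next variance while a negative shock of the same size contributes 0.3, which is the asymmetric response the abstract refers to.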

User Oriented clustering of news articles using Tweets Heterogeneous Information Network (트위트 이형 정보 망을 이용한 뉴스 기사의 사용자 지향적 클러스터링)

  • Shoaib, Muhammad;Song, Wang-Cheol
    • Journal of Internet Computing and Services / v.14 no.6 / pp.85-94 / 2013
  • With the emergence of the World Wide Web, and Web 2.0 in particular, the rapidly growing number of news articles has made it difficult for users to select articles that meet their requirements. To overcome this problem, various clustering mechanisms have been proposed to broadly categorize news articles. However, these techniques are entirely machine-oriented and lack user participation in deciding cluster membership. To address this lack of participation, we propose a framework for clustering news articles that combines the judgments users post on Twitter with the news articles themselves. We employ Twitter hashtags for this purpose. Furthermore, we compute the credibility of users based on how frequently their tweets are retweeted in order to enhance the accuracy of the clustering membership function. To test the performance of the proposed methodology, we performed experiments on tweets posted during the 2013 general election in Pakistan. Our results support our claim that using users' judgments achieves better outcomes than ordinary clustering algorithms.
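
One way to picture credibility-weighted user judgments is as weighted voting over hashtags. The sketch below is a simplification under stated assumptions, not the paper's membership function: credibility is taken to be a user's retweet count scaled by the largest count in the group, and each article is hard-assigned to the hashtag with the highest total credibility. User names, article ids, and hashtags are hypothetical.

```python
from collections import defaultdict


def user_credibility(retweet_counts):
    """Scale each user's retweet count to [0, 1] relative to the most-retweeted user."""
    top = max(retweet_counts.values(), default=0)
    return {user: (n / top if top else 0.0) for user, n in retweet_counts.items()}


def assign_clusters(tweets, credibility):
    """tweets: iterable of (user, article_id, hashtag) tuples.
    Each article is assigned to the hashtag with the highest
    credibility-weighted vote; unknown users contribute nothing."""
    votes = defaultdict(lambda: defaultdict(float))
    for user, article, tag in tweets:
        votes[article][tag] += credibility.get(user, 0.0)
    return {article: max(tags, key=tags.get) for article, tags in votes.items()}


# Hypothetical data: retweet totals per user and tweets linking articles to hashtags.
credibility = user_credibility({"alice": 40, "bob": 10, "carol": 5})
tweets = [
    ("alice", "article-1", "#election"),
    ("bob", "article-1", "#sports"),
    ("carol", "article-1", "#sports"),
    ("bob", "article-2", "#sports"),
]
clusters = assign_clusters(tweets, credibility)
print(clusters)
```

In this example the single vote from the most-retweeted user outweighs two votes from less-retweeted users, so `article-1` lands under `#election` rather than `#sports`.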

Subspace Projection-Based Clustering and Temporal ACRs Mining on MapReduce for Direct Marketing Service

  • Lee, Heon Gyu;Choi, Yong Hoon;Jung, Hoon;Shin, Yong Ho
    • ETRI Journal / v.37 no.2 / pp.317-327 / 2015
  • A reliable analysis of consumer preference from a large amount of purchase data acquired in real time and an accurate customer characterization technique are essential for successful direct marketing campaigns. In this study, an optimal segmentation of post office customers in Korea is performed using a subspace projection-based clustering method to generate an accurate customer characterization from a high-dimensional census dataset. Moreover, a traditional temporal mining method is extended to an algorithm using the MapReduce framework for a consumer preference analysis. The experimental results show that parallel mining is possible with a MapReduce-based algorithm and that its execution time is shorter than that of a traditional method.

Automatic Clustering on Trained Self-organizing Feature Maps via Graph Cuts (그래프 컷을 이용한 학습된 자기 조직화 맵의 자동 군집화)

  • Park, An-Jin;Jung, Kee-Chul
    • Journal of KIISE:Software and Applications / v.35 no.9 / pp.572-587 / 2008
  • The Self-organizing Feature Map (SOFM), an unsupervised neural network, is a very powerful tool for data clustering and visualization in high-dimensional data sets. Although the SOFM has been applied to many engineering problems, similar weights on the trained SOFM must be clustered into one class as a post-processing step, which in many cases is performed manually. Traditional clustering algorithms, such as k-means, do not yield satisfactory results on the trained SOFM, especially when clusters have arbitrary shapes. This paper proposes automatic clustering on the trained SOFM that can deal with arbitrary cluster shapes and be globally optimized by graph cuts. When graph cuts are used, the graph must have two additional vertices, called terminals, and the weights between the terminals and the vertices of the graph are generally set based on data obtained manually by users. The proposed method sets these weights automatically based on mode-seeking on a distance matrix. Experimental results demonstrated the effectiveness of the proposed method in texture segmentation. The proposed method improved precision rates compared with previous traditional clustering algorithms, as it can deal with arbitrary cluster shapes based on graph-theoretic clustering.

A Study of Post-processing Methods of Clustering Algorithm and Classification of the Segmented Regions (클러스터링 알고리즘의 후처리 방안과 분할된 영역들의 분류에 대한 연구)

  • Oh, Jun-Taek;Kim, Bo-Ram;Kim, Wook-Hyun
    • The KIPS Transactions:PartB / v.16B no.1 / pp.7-16 / 2009
  • Some clustering algorithms over-segment an image because the spatial information between segmented regions is not considered and the number of clusters is defined in advance. They are therefore difficult to apply in practice. This paper proposes new post-processing methods, a reclassification of inhomogeneous clusters and a region merging step using a Bayesian algorithm, that improve the segmentation results of clustering algorithms. An inhomogeneous cluster is first selected based on variance and between-class distance, and is then reclassified into the other clusters in the reclassification step. This reclassification is repeated until the optimal number of clusters, determined by the minimum average within-class distance, is reached. Similar regions are then merged using the Bayesian algorithm based on the Kullback-Leibler distance between adjacent regions. This effectively solves the over-segmentation problem, and the result can be used in practical applications. Finally, we design a classification system for the segmented regions to validate the proposed method. The segmented regions are classified by a support vector machine (SVM) using their principal colors and texture information. In experiments, the proposed method proved valid on various real images and was effectively applied to the designed classification system.
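
The role of the Kullback-Leibler distance in region merging can be sketched as follows. The example compares two adjacent regions by the symmetrized KL divergence of their normalized intensity histograms and merges them when it falls below a threshold. The Bayesian decision rule from the paper is replaced here by that simple threshold, and the histograms, the threshold value, and the function names are assumptions.

```python
import math


def normalize(hist):
    """Turn raw bin counts into a probability distribution."""
    total = sum(hist)
    return [h / total for h in hist]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions; eps guards against empty bins in q."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0
    )


def should_merge(hist_a, hist_b, threshold=0.1):
    """Merge two adjacent regions when their symmetrized KL divergence is small."""
    p, q = normalize(hist_a), normalize(hist_b)
    distance = 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
    return distance < threshold


# Hypothetical 3-bin intensity histograms of adjacent regions.
print(should_merge([10, 20, 30], [11, 19, 30]))  # nearly identical distributions
print(should_merge([50, 5, 5], [5, 5, 50]))      # very different distributions
```

KL divergence is not symmetric, so averaging both directions gives a measure that does not depend on which of the two regions is listed first.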

Recognition of damage pattern and evolution in CFRP cable with a novel bonding anchorage by acoustic emission

  • Wu, Jingyu;Lan, Chengming;Xian, Guijun;Li, Hui
    • Smart Structures and Systems / v.21 no.4 / pp.421-433 / 2018
  • Carbon fiber reinforced polymer (CFRP) cable has good mechanical properties and corrosion resistance. However, anchoring CFRP cable is a major issue because of the anisotropic properties of the CFRP material. In this article, a high-efficiency bonding anchorage with a novel configuration is developed for CFRP cables. The acoustic emission (AE) technique is employed to evaluate the performance of the anchorage in a fatigue test and a post-fatigue ultimate bearing capacity test. The obtained AE signals are analyzed using a combination of unsupervised K-means clustering and supervised K-nearest neighbor (K-NN) classification to quantify the performance of the anchorage and the damage evolution. An AE feature vector (including both frequency and energy characteristics of the AE signal) is proposed for the clustering analysis, and under-sampling approaches are employed to reduce the influence of the imbalanced class distribution in the AE dataset and thereby improve clustering quality. The results indicate that four classes exist in the AE dataset, corresponding to shear deformation of the potting compound, matrix cracking, fiber-matrix debonding and fiber fracture in the CFRP bars. The AE intensity released by the deformation of the potting compound is very slight throughout the loading process, and no obvious premature damage caused by the anchorage effect is observed in the CFRP bars at relatively low stress levels, indicating that the anchorage configuration in this study is reliable.
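
The combination of under-sampling and K-NN classification can be sketched in a few lines. The example assumes that class labels have already been produced by a clustering step (not shown), randomly under-samples every class to the size of the smallest one, and classifies a new signal by majority vote among its nearest neighbors. The two-dimensional feature vectors (a frequency and an energy value), the class names, and the function names are invented for illustration and do not come from the paper.

```python
import random
from collections import Counter


def undersample(samples, labels, seed=0):
    """Randomly reduce every class to the size of the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    smallest = min(len(group) for group in by_class.values())
    balanced = []
    for label, group in by_class.items():
        balanced.extend((sample, label) for sample in rng.sample(group, smallest))
    return balanced


def knn_predict(train, query, k=3):
    """Majority vote among the k training points closest to `query`
    (squared Euclidean distance; features are used unscaled here)."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical (frequency, energy) features with an imbalanced class distribution.
samples = [(100, 1.0), (105, 1.2), (95, 0.9), (110, 1.1), (98, 1.0), (102, 0.8),
           (300, 5.0), (310, 5.5), (295, 4.8)]
labels = ["matrix"] * 6 + ["fiber"] * 3
balanced = undersample(samples, labels)
print(knn_predict(balanced, (305, 5.2)))
```

In practice the features would normally be scaled before computing distances, since a frequency axis in the hundreds would otherwise dominate an energy axis in single digits.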