• Title/Summary/Keyword: clustering patterns

Web Mining for Discovering Interesting Information using Effective Clustering (효율적인 클러스터링을 이용한 관심 정보 추출을 위한 웹 마이닝)

  • Kim, Sung-Hark;Ahn, Byeong-Tae
    • Journal of Digital Contents Society / v.9 no.2 / pp.251-260 / 2008
  • The internet is a repository of massive amounts of information, so desired information is often hard to find; the same problem exists in rapidly growing e-commerce. Most e-commerce sites provide information through statistical analysis or simple category-oriented processing, but these methods cannot represent the diverse correlations among product information and hardly reflect users' purchasing patterns precisely. In this thesis, we propose more efficient web mining methods for discovering interesting information in e-commerce using effective clustering. The methods derive more suitable relationships among product information by applying both sequential patterns and association rules in a category-independent manner, and experiments show their efficiency. We also propose a fast search method based on effective clustering.

An Improved Cat Swarm Optimization Algorithm Based on Opposition-Based Learning and Cauchy Operator for Clustering

  • Kumar, Yugal;Sahoo, Gadadhar
    • Journal of Information Processing Systems / v.13 no.4 / pp.1000-1013 / 2017
  • Clustering is an NP-hard problem that is used to find the relationship between patterns in a given set of patterns. It is an unsupervised technique that is applied to obtain the optimal cluster centers, especially in partition-based clustering algorithms. Cat swarm optimization (CSO) is a new meta-heuristic algorithm that has been applied to various optimization problems and provides better results than other algorithms of a similar type. However, it suffers from diversity and local optima problems. To overcome these problems, we propose an improved version of the CSO algorithm that uses opposition-based learning and the Cauchy mutation operator. We apply opposition-based learning to enhance the diversity of the CSO algorithm and use the Cauchy mutation operator to prevent it from becoming trapped in local optima. The performance of the proposed algorithm was tested on several artificial and real datasets and compared with existing methods such as K-means, particle swarm optimization, and CSO. The experimental results show the applicability of the proposed method.
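
The two operators named in this abstract can be sketched in isolation. The code below is not the authors' improved CSO algorithm. It is a minimal illustration that assumes one-dimensional data, box bounds `lower`/`upper` on each cluster center, and a sum-of-squared-errors objective; all function names are hypothetical.

```python
import math
import random

def opposite_point(x, lower, upper):
    """Opposition-based learning: reflect each coordinate inside its bounds."""
    return [lo + hi - xi for xi, lo, hi in zip(x, lower, upper)]

def cauchy_mutation(x, lower, upper, scale=1.0, rng=random):
    """Add a heavy-tailed Cauchy step to each coordinate, then clip to the bounds."""
    mutated = []
    for xi, lo, hi in zip(x, lower, upper):
        # Inverse-CDF sampling of a standard Cauchy variate.
        step = scale * math.tan(math.pi * (rng.random() - 0.5))
        mutated.append(min(max(xi + step, lo), hi))
    return mutated

def sse(centers, points):
    """Assumed objective: squared distance of each 1-D point to its nearest center."""
    return sum(min((p - c) ** 2 for c in centers) for p in points)

def keep_better(candidate, lower, upper, points):
    """Evaluate a candidate and its opposite, and keep whichever has lower SSE."""
    opposite = opposite_point(candidate, lower, upper)
    return min(candidate, opposite, key=lambda c: sse(c, points))
```

Evaluating the opposite point widens the search at the cost of one extra objective evaluation, and the heavy tails of the Cauchy distribution occasionally produce long jumps that can carry a candidate out of a local optimum.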

Machine Learning Approach to the Effects of the Superstore Mandatory Closing Regulation

  • AN, Jiyoung;PARK, Heedae
    • Journal of Distribution Science / v.18 no.2 / pp.69-77 / 2020
  • Purpose - This paper aims to analyze the effects of the mandatory closing regulation targeting large retailers, which has been implemented since 2012 to protect small retailers. We examine the changes in consumers' choice of retailers and their purchasing patterns of agri-food following the implementation of the regulation. Research design, data, and methodology - Household spending patterns were identified through historical household food purchase data from the consumer panel provided by the Rural Development Administration. Clustering was employed to determine the household spending patterns, and the patterns before and after the regulation were compared. The patterns of consumers' choice of retail stores and shopping baskets by type of retailer, derived from the respective datasets before and after the regulation, were compared to analyze the effects of the regulation. Results - After the regulation, some consumers who used to shop at large retailers changed their shopping places to small retailers. However, the product categories that consumers had mainly purchased before the regulation rarely changed afterwards. Conclusions - Although the regulation helped migrate some consumers to small retailers, it seems to have failed to stimulate consumers to purchase the goods normally bought at large retailers from traditional markets. In other words, traditional markets are not effective substitutes for regulation-affected retailers.

A Survey on the Power System Modeling using a Clustering Algorithm (클러스터링 기법을 적용한 전력시스템 모델링에 관한 사례 조사)

  • Park, Young-Soo;Kim, Jin-Ho
    • Proceedings of the KIEE Conference / 2006.07a / pp.410-411 / 2006
  • This paper surveys power system modeling using clustering algorithms. In electricity markets, clustering is an efficient tool for modeling the power system. Electricity markets can be classified into several groups that show similar patterns, and the fundamental characteristics of power systems are widely applicable to other technical problems in power systems, such as generation scheduling, power flow analysis, and short-term load forecasting. Several studies have addressed power system modeling using a clustering algorithm. We specifically surveyed the clustering methods they use to model the power system.

Multi-scale Cluster Hierarchy for Non-stationary Functional Signals of Mutual Fund Returns (Mutual Fund 수익률의 비정상 함수형 시그널을 위한 다해상도 클러스터 계층구조)

  • Kim, Dae-Lyong;Jung, Uk
    • Korean Management Science Review / v.24 no.2 / pp.57-72 / 2007
  • Many applications in scientific research have been coupled with techniques for clustering functional data signals to discover novel characteristics that can be used to diagnose various issues. In this article, we present an interpretable multi-scale cluster hierarchy framework for clustering functional data using its multi-aspect frequency information. The suggested method focuses on how to effectively select transformed features/variables in an unsupervised manner so as to reduce the data dimension and achieve multi-purpose clustering. Specifically, we apply the method to mutual fund returns and form groups of superior-performing funds based on different aspects such as global patterns, seasonal variations, levels of noise, and their combinations. To show that the method produces a quality cluster hierarchy, we give empirical results from a simulation study and a set of real-life data. This research will contribute to financial market analysis and can be flexibly adapted to other research fields with clustering purposes.

Design and Implementation of the Ensemble-based Classification Model by Using k-means Clustering

  • Song, Sung-Yeol;Khil, A-Ra
    • Journal of the Korea Society of Computer and Information / v.20 no.10 / pp.31-38 / 2015
  • In this paper, we propose an ensemble-based classification model that uses clustering to extract only new data patterns from streaming data and generates new classification models to add to the ensemble, reducing the amount of data labeling while maintaining the accuracy of the existing system. The proposed technique clusters similarly patterned data from the stream and labels each cluster once a certain amount of data has been gathered. It applies the K-NN technique to each classification model unit in order to maintain the accuracy of the existing system while using a small amount of data. Simulation results on benchmarks show that, by using clustering, the proposed technique is efficient, using about 3% less data than the existing technique.

Classification of Volatile Chemicals using Fuzzy Clustering Algorithm (퍼지 Clustering 알고리즘을 이용한 휘발성 화학물질의 분류)

  • Byun, Hyung-Gi;Kim, Kab-Il
    • Proceedings of the KIEE Conference / 1996.07b / pp.1042-1044 / 1996
  • Fuzzy theory may be applied to pattern recognition tasks such as the classification and recognition of gases and odours. This paper reports results obtained by applying fuzzy c-means algorithms to patterns generated for volatile chemicals by an odour sensing system that uses an array of conducting polymer sensors. Three unsupervised fuzzy c-means algorithms were applied to the volatile chemicals clustering problem. Among these pattern clustering methods, the FCMAW algorithm, which updates the cluster centres more frequently, consistently outperformed the others. Experimental trials confirmed it as an outstanding clustering algorithm.
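
For readers unfamiliar with fuzzy c-means, the textbook batch algorithm can be sketched as follows. This is the standard formulation, not the FCMAW variant named in the abstract, whose update schedule is not described here. The fuzzifier `m`, the iteration count, and the random initialisation are assumptions chosen for illustration.

```python
import random

def fuzzy_c_means(points, c, m=2.0, iters=100, seed=0):
    """Standard batch fuzzy c-means on lists of floats; returns (centres, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centres = []
    for _ in range(iters):
        # Centre update: mean of the points weighted by membership**m.
        centres = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centres.append([sum(w[i] * points[i][d] for i in range(n)) / total
                            for d in range(dim)])
        # Membership update: proportional to squared distance ** (-1 / (m - 1)).
        for i in range(n):
            d2 = [max(sum((points[i][d] - centres[j][d]) ** 2 for d in range(dim)), 1e-12)
                  for j in range(c)]
            inv = [v ** (-1.0 / (m - 1.0)) for v in d2]
            total = sum(inv)
            u[i] = [v / total for v in inv]
    return centres, u
```

Unlike K-means, each point receives a graded membership in every cluster, which suits sensor-array patterns whose classes overlap.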

FCAnalyzer: A Functional Clustering Analysis Tool for Predicted Transcription Regulatory Elements and Gene Ontology Terms

  • Kim, Sang-Bae;Ryu, Gil-Mi;Kim, Young-Jin;Heo, Jee-Yeon;Park, Chan;Oh, Berm-Seok;Kim, Hyung-Lae;Kimm, Ku-Chan;Kim, Kyu-Won;Kim, Young-Youl
    • Genomics & Informatics / v.5 no.1 / pp.10-18 / 2007
  • Numerous studies have reported that genes with similar expression patterns are co-regulated. From gene expression data, we have assumed that genes with similar expression patterns share similar transcription factor binding sites (TFBSs). These function as the binding regions for transcription factors (TFs) and thereby regulate gene expression. Various analysis tools have been developed in this context. However, they have shortcomings in the combined analysis of expression patterns and significant TFBSs, and in the functional analysis of target genes of significantly overrepresented putative regulators. In this study, we present FCAnalyzer, a web-based functional clustering analysis tool for predicted transcription regulatory elements and gene ontology (GO) terms. The system integrates microarray clustering data for genes with similar expression patterns with the TFBS data in each cluster. FCAnalyzer is designed to perform two independent clustering procedures. The first clusters gene expression profiles using the K-means clustering method, and the second clusters predicted TFBSs in the upstream region of the previously clustered genes using the hierarchical biclustering method for simultaneous grouping of genes and samples. The system retrieves information on predicted TFBSs in each cluster using Match™ in the TRANSFAC database. We used GO term analysis for functional annotation of genes in the same cluster. We also provide the user with a combinatorial TFBS analysis of TFBS pairs. The enrichment in the TFBS analysis and the GO term analysis is assessed statistically by calculating P values based on Fisher's exact test, the hypergeometric distribution, and Bonferroni correction. FCAnalyzer is a web-based, user-friendly functional clustering analysis system that facilitates the transcriptional regulatory analysis of co-expressed genes. The system presents analyses of clustered genes, significant TFBSs, significantly enriched TFBS combinations, their target genes, and TFBS-TF pairs.
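
The enrichment statistics mentioned in this abstract can be illustrated with a small calculation. The sketch below is not FCAnalyzer's code. It assumes a one-sided over-representation test, where the hypergeometric upper tail equals the one-sided Fisher's exact test P value, and it uses hypothetical function and parameter names.

```python
from math import comb

def enrichment_p_value(population, annotated, cluster_size, hits):
    """P(X >= hits) when cluster_size genes are drawn without replacement from
    `population` genes, of which `annotated` carry the term or TFBS of interest."""
    upper = min(annotated, cluster_size)
    tail = sum(comb(annotated, i) * comb(population - annotated, cluster_size - i)
               for i in range(hits, upper + 1))
    return tail / comb(population, cluster_size)

def bonferroni(p_values):
    """Bonferroni correction: multiply each P value by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

For example, `enrichment_p_value(10, 3, 3, 3)` is the probability that a cluster of 3 genes drawn from 10 contains all 3 annotated genes, which is 1/120.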

Black-Litterman Portfolio with K-shape Clustering (K-shape 군집화 기반 블랙-리터만 포트폴리오 구성)

  • Yeji Kim;Poongjin Cho
    • Journal of Korean Society of Industrial and Systems Engineering / v.46 no.4 / pp.63-73 / 2023
  • This study explores modern portfolio theory by integrating the Black-Litterman portfolio with time-series clustering, specifically the K-shape clustering methodology. K-shape clustering groups time-series data effectively and, combined with the Black-Litterman portfolio, enhances the ability to plan and manage investments in stock markets. Based on stock market patterns, the objective is to understand, through backtesting, the relationship between past market data and the planning of future investment strategies. In addition, by examining diverse learning and investment periods, optimal strategies are identified that boost portfolio returns while efficiently managing the associated risks. For comparative analysis, the traditional Markowitz portfolio is also assessed in conjunction with clustering techniques using K-Means and K-Means with Dynamic Time Warping. The results suggest that combining K-shape with the Black-Litterman model significantly enhances portfolio optimization in the stock market, providing valuable insights for making stable portfolio investment decisions. The achieved Sharpe ratio of 0.722 is significantly higher than that of the other benchmarks, underlining the effectiveness of the K-shape and Black-Litterman integration in portfolio optimization.
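
K-shape groups time series with a shape-based distance (SBD) derived from normalised cross-correlation. The sketch below illustrates that distance only, not the study's pipeline. It assumes equal-length series that have already been z-normalised, as K-shape usually requires, and the function name is hypothetical.

```python
import math

def shape_based_distance(x, y):
    """SBD = 1 - max over shifts of cross-correlation / (||x|| * ||y||)."""
    n = len(x)
    if n != len(y):
        raise ValueError("series must have equal length")
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if norm == 0.0:
        raise ValueError("series must not be all zeros")
    best = max(
        # Cross-correlation at shift s, treating values outside the series as zero.
        sum(x[i] * y[i - s] for i in range(max(0, s), min(n, n + s)))
        for s in range(-(n - 1), n)
    )
    return 1.0 - best / norm
```

Because the maximum is taken over all shifts, two series with the same shape but a time lag have a distance near 0, whereas a pointwise Euclidean distance would treat them as different.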

Classification of Daily Precipitation Patterns in South Korea using Multivariate Statistical Methods

  • Mika, Janos;Kim, Baek-Jo;Park, Jong-Kil
    • Journal of Environmental Science International / v.15 no.12 / pp.1125-1139 / 2006
  • A cluster analysis of diurnal precipitation patterns is performed using daily precipitation at 59 stations in South Korea from 1973 to 1996, separately for the four seasons of each year. The four seasons are shifted forward by 15 days compared to the usual ones. The number of clusters is 15 in winter, 16 in spring and autumn, and 26 in summer. In each season, one of the classes is the totally dry day, on which precipitation is not observed at any station; it is treated separately in this study. The distribution of days among the clusters is rather uneven, with rather low area-mean precipitation occurring most frequently. These 4 (seasons)$\times$2 (wet and dry days) classes represent more than half (59%) of all days of the year. On the other hand, even the smallest seasonal clusters have at least $5\sim9$ members in the 24-year (1973-1996) classification period. To account for spatial correlation, the cluster analysis is performed directly on the major $5\sim8$ non-correlated coefficients of the diurnal precipitation patterns obtained by factor analysis. More specifically, hierarchical clustering based on Euclidean distance and Ward's method of agglomeration is applied. The relative variance explained by the clustering is 63% on average, with better capability in spring (66%) and winter (69%) and lower than average in autumn (60%) and summer (59%). By applying weighted relative variances, i.e., dividing the squared deviations by the cluster averages, we obtain even better values, i.e., 78% on average, compared to the same index without clustering. This means that the highest variance remains in the clusters with more precipitation. Besides all statistics necessary for validating the final classification, 4 cluster centers are mapped for each season to illustrate the range of typical extremities, paired according to their area-mean precipitation or negative pattern correlation. Possible alternatives to the performed classification and the reasons for their rejection are also discussed, along with a wide spectrum of recommended applications.
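
Ward's agglomeration on Euclidean distance, as named in this abstract, merges at each step the pair of clusters whose union least increases the within-cluster sum of squares. The naive sketch below is for small inputs only and is not the authors' procedure. It assumes that "relative variance explained" means 1 minus the ratio of within-cluster to total sum of squares, and all function names are hypothetical.

```python
def centroid(cluster):
    """Coordinate-wise mean of a list of points."""
    dim = len(cluster[0])
    return [sum(p[d] for p in cluster) / len(cluster) for d in range(dim)]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    dist2 = sum((u - v) ** 2 for u, v in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * dist2

def ward_clustering(points, k):
    """Start from singletons and repeatedly merge the cheapest pair until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

def sum_of_squares(group):
    """Sum of squared Euclidean distances from each point to the group centroid."""
    c = centroid(group)
    return sum(sum((x - m) ** 2 for x, m in zip(p, c)) for p in group)

def explained_variance(clusters):
    """Assumed definition: 1 - within-cluster SS / total SS."""
    all_points = [p for cluster in clusters for p in cluster]
    within = sum(sum_of_squares(cluster) for cluster in clusters)
    return 1.0 - within / sum_of_squares(all_points)
```

In this sketch each point would be the vector of factor coefficients for one day, and `explained_variance` would be computed per season to obtain figures comparable to the percentages quoted above.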