• Title/Summary/Keyword: Time Series Clustering

Nonparametric clustering of functional time series electricity consumption data (전기 사용량 시계열 함수 데이터에 대한 비모수적 군집화)

  • Kim, Jaehee
    • The Korean Journal of Applied Statistics / v.32 no.1 / pp.149-160 / 2019
  • The electricity consumption time series of 'A' University from July 2016 to June 2017 are analyzed via nonparametric functional data clustering, since the time series can be regarded as realizations of continuous functions with a dependency structure. We use the method of Bouveyron and Jacques (Advances in Data Analysis and Classification, 5, 4, 281-300, 2011), a model-based functional clustering approach with an FEM algorithm that assumes a Gaussian distribution on the functional principal components. A clusterwise analysis is provided with cluster mean functions, densities and cluster profiles.

Clustering non-stationary advanced metering infrastructure data

  • Kang, Donghyun;Lim, Yaeji
    • Communications for Statistical Applications and Methods / v.29 no.2 / pp.225-238 / 2022
  • In this paper, we propose a clustering method for advanced metering infrastructure (AMI) data in Korea. Because AMI data exhibit non-stationarity, we consider time-dependent frequency domain principal components analysis, which is appropriate for locally stationary time series data. We develop a new clustering method based on time-varying eigenvectors, and it yields a meaningful result that differs from the results of conventional methods such as K-means and K-centres functional clustering. A simulation study demonstrates the superiority of the proposed approach. We further apply the clustering results to the evaluation of the electricity price system in South Korea, and validate the reform of the progressive electricity tariff system.

Nonlinear damage detection using higher statistical moments of structural responses

  • Yu, Ling;Zhu, Jun-Hua
    • Structural Engineering and Mechanics / v.54 no.2 / pp.221-237 / 2015
  • This study proposes an integrated method for structural nonlinear damage detection based on time series analysis and the higher statistical moments of structural responses. It combines time series analysis, the higher statistical moments of AR model residual errors, and the fuzzy c-means (FCM) clustering technique. Several comprehensive damage indexes are developed from the arithmetic and geometric means of the higher statistical moments, and are classified with the FCM clustering method to achieve nonlinear damage detection. Measured response data from a three-storey building structure, downloaded from the web site of the Los Alamos National Laboratory (LANL), USA, and covering environmental variability as well as different nonlinear damage cases, are analyzed to assess the performance of the new method. The effectiveness and robustness of the proposed method are then analyzed.
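
The first two stages named in this abstract (an AR fit, then the higher moments of its residual errors) can be sketched roughly as below. This is only an illustration under assumptions: the least-squares fit, the function names `ar_residuals` and `higher_moments`, and the use of standardized skewness and kurtosis are choices made here, not the authors' damage indexes, and the FCM step is left out.

```python
import numpy as np

def ar_residuals(x, order):
    """Fit AR(order) by least squares and return the one-step residual errors."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Column k holds the series lagged by k samples, aligned with y = x[order:].
    lagged = np.column_stack([x[order - k : n - k] for k in range(1, order + 1)])
    y = x[order:]
    coef, *_ = np.linalg.lstsq(lagged, y, rcond=None)
    return y - lagged @ coef

def higher_moments(residuals):
    """Standardized third and fourth moments (skewness, kurtosis) of the residuals."""
    r = np.asarray(residuals, dtype=float)
    z = (r - r.mean()) / r.std()
    return float(np.mean(z**3)), float(np.mean(z**4))
```

For Gaussian residuals the two values should sit near 0 and 3. A feature vector of such moments per sensor could then be passed to any fuzzy c-means implementation.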

EXTENDED ONLINE DIVISIVE AGGLOMERATIVE CLUSTERING

  • Musa, Ibrahim Musa Ishag;Lee, Dong-Gyu;Ryu, Keun-Ho
    • Proceedings of the KSRS Conference / 2008.10a / pp.406-409 / 2008
  • Clustering data streams is important in many applications such as sensor networks. Existing hierarchical methods follow a semi-fuzzy clustering that yields duplicate clusters. To solve this problem, we propose an extended online divisive agglomerative clustering for data streams. It builds a tree-like, top-down hierarchy of clusters that evolves with the data stream, using a geometric time frame for snapshots. It enhances Online Divisive Agglomerative Clustering (ODAC) with a pruning strategy that avoids duplicate clusters. Its main feature is that update time and memory use are independent of the number of examples in the data stream. It can be used for clustering sensor data, for network monitoring, and for web click streams.

Grouping stocks using dynamic linear models

  • Sihyeon, Kim;Byeongchan, Seong
    • Communications for Statistical Applications and Methods / v.29 no.6 / pp.695-708 / 2022
  • Recently, several studies have been conducted using state space models. In this study, a dynamic linear model in state space form is applied to stock data. The monthly returns of 135 Korean stocks are fitted to a dynamic linear model to obtain an estimated time series of the time-varying β coefficient. The model formula used for the return is the capital asset pricing model formula from economics. In particular, the transition equation of the state space form is modified so that the assumptions on the error term are satisfied. k-shape clustering is then performed to classify the 135 estimated β time series into several groups. The clustering yields four clusters, each consisting of approximately 30 stocks. The distribution differs across groups, indicating that each group has its own characteristics. In addition, a common pattern that can be interpreted appropriately is observed within each group.
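
A minimal sketch of the two ingredients named in this abstract, under stated assumptions: the time-varying β is modeled here as a plain random walk filtered with a scalar Kalman filter (the abstract says the transition equation was modified but does not say how), and the variances `obs_var` and `state_var` are hypothetical tuning values. The second function is the shape-based distance that k-shape clustering uses, based on normalized cross-correlation. The function names are illustrative.

```python
import numpy as np

def kalman_beta(stock, market, obs_var, state_var, beta0=1.0, p0=1.0):
    """Filter r_t = beta_t * m_t + eps_t with beta_t = beta_{t-1} + eta_t."""
    stock = np.asarray(stock, dtype=float)
    market = np.asarray(market, dtype=float)
    betas = np.empty(len(stock))
    beta, p = beta0, p0
    for t, (r, m) in enumerate(zip(stock, market)):
        p = p + state_var                 # predict: random-walk state
        s = m * m * p + obs_var           # innovation variance
        gain = p * m / s                  # Kalman gain
        beta = beta + gain * (r - m * beta)
        p = (1.0 - gain * m) * p          # posterior state variance
        betas[t] = beta
    return betas

def shape_based_distance(x, y):
    """1 - max normalized cross-correlation; assumes non-constant inputs."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cross = np.correlate(x, y, mode="full")   # cross-correlation at every shift
    return float(1.0 - cross.max() / (np.linalg.norm(x) * np.linalg.norm(y)))
```

In k-shape the series are z-normalized before the distance is computed, so each filtered β path would be standardized first.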

Design of Fuzzy System with Hierarchical Classifying Structures and its Application to Time Series Prediction (계층적 분류구조의 퍼지시스템 설계 및 시계열 예측 응용)

  • Bang, Young-Keun;Lee, Chul-Heui
    • Journal of the Korean Institute of Intelligent Systems / v.19 no.5 / pp.595-602 / 2009
  • Fuzzy rules, which represent the behavior of a fuzzy system, are sensitive to the fuzzy clustering technique used to build them. If the classification ability of the clustering technique is improved, the system can serve its purpose more accurately, because the fuzzy rules and parameters are improved as well. This paper therefore proposes a new hierarchically structured clustering algorithm that enhances classification ability. The proposed technique consists of two clustering stages, based on the correlation and the statistical characteristics of the data, which allows more accurate classification. In addition, this paper uses differenced data sets to reflect the patterns and regularities of the original data clearly, and constructs multiple fuzzy systems to account for the various characteristics of the differences. To verify the effectiveness of the proposed techniques, the constructed fuzzy systems are applied to time series prediction and used to predict nonlinear time series examples.

A Determination of an Optimal Clustering Method Based on Data Characteristics

  • Kim, Jeong-Hun;Yoo, Kwan-Hee;Nasridinov, Aziz
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.8 / pp.305-314 / 2017
  • Clustering is a method that collects data objects into groups based on their similarity. The performance of state-of-the-art clustering methods differs according to the characteristics of the data. Numerous studies have run experiments comparing the accuracy of state-of-the-art clustering methods on various kinds of datasets. A common problem of these studies is that they only consider which clustering algorithm yields the most accurate results for a particular dataset. They do not consider which factors affect the execution time of each clustering method, or how. Nevertheless, execution time is an important factor in clustering performance when there is no significant difference in accuracy. To address these problems, we compare the accuracy of four representative clustering methods through a series of experiments on various types of datasets. In addition, we carry out practical performance comparisons by deriving the time complexity of each method and identifying the factors that influence its performance.

Design of HCBKA-Based IT2TSK Fuzzy Prediction System (HCBKA 기반 IT2TSK 퍼지 예측시스템 설계)

  • Bang, Young-Keun;Lee, Chul-Heui
    • The Transactions of The Korean Institute of Electrical Engineers / v.60 no.7 / pp.1396-1403 / 2011
  • It is not easy to analyze a strongly nonlinear time series and design an effective prediction system, especially because of the difficulty of handling the potential uncertainty in the data and the prediction method. To solve this problem, this paper suggests a new design method for a fuzzy prediction system. The proposed method has three major parts: first-order difference detection to extract stable information from the nonlinear characteristics of the time series; fuzzy rule generation based on a hierarchically classifying clustering technique to reduce errors in system parameter identification; and an IT2TSK fuzzy logic system to handle the potential uncertainty of the series reasonably. In addition, multiple predictors are designed to sufficiently reflect the diverse characteristics concealed in the series. Finally, computer simulations are performed to verify the performance and effectiveness of the proposed prediction system.

Classification of Time-Series Data Based on Several Lag Windows

  • Kim, Hee-Young;Park, Man-Sik
    • Communications for Statistical Applications and Methods / v.17 no.3 / pp.377-390 / 2010
  • In time-series analysis, it is often more convenient to work in the frequency domain than in the time domain. The spectral density is the core of frequency-domain analysis and describes the autocorrelation structure of a time-series process. It can be estimated by computing a periodogram or by averaging the periodogram over some frequencies with equal or unequal weights. Such estimates are an attractive tool for measuring the similarity between time-series processes. We employ the metrics based on a smoothed periodogram proposed by Park and Kim (2008) to classify different classes of time-series processes. We consider several lag windows with unequal weights instead of the modified Daniell window used in Park and Kim (2008). We evaluate the performance under various simulation scenarios. Simulation results reveal that the metrics used in this study split the time series into the preassigned clusters better than the raw-periodogram based metrics proposed by Caiado et al. (2006). Our metrics are applied to an economic time-series dataset.
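
As a rough illustration of a lag-window spectral estimate used as a similarity measure, the sketch below weights the sample autocovariances with a Bartlett (triangular) lag window and compares two series by the root-mean-square difference of their log spectral estimates. The choice of window, the distance, the defaults `max_lag=20` and `n_freq=64`, and all function names are assumptions made here; the abstract does not state the exact metrics of Park and Kim (2008).

```python
import numpy as np

def autocovariance(x, max_lag):
    """Biased sample autocovariances for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    return np.array([np.dot(x[: n - h], x[h:]) / n for h in range(max_lag + 1)])

def lag_window_spectrum(x, max_lag, freqs):
    """Spectral density estimate with a Bartlett (triangular) lag window."""
    gamma = autocovariance(x, max_lag)
    lags = np.arange(1, max_lag + 1)
    weights = 1.0 - lags / (max_lag + 1)      # unequal weights, decaying with lag
    cosines = np.cos(np.outer(freqs, lags))
    return (gamma[0] + 2.0 * cosines @ (weights * gamma[1:])) / (2.0 * np.pi)

def log_spectral_distance(x, y, max_lag=20, n_freq=64):
    """RMS difference between the log spectral estimates of two series."""
    freqs = np.pi * np.arange(1, n_freq + 1) / n_freq
    fx = lag_window_spectrum(x, max_lag, freqs)
    fy = lag_window_spectrum(y, max_lag, freqs)
    return float(np.sqrt(np.mean((np.log(fx) - np.log(fy)) ** 2)))

def simulate_ar1(phi, n, rng):
    """Demo helper: Gaussian AR(1) path x_t = phi * x_{t-1} + e_t."""
    e = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = e[0]
    for t in range(1, n):
        x[t] = phi * x[t - 1] + e[t]
    return x
```

Two realizations of the same AR(1) process should come out closer to each other than to a realization with the opposite coefficient sign, and a matrix of such distances can feed any standard hierarchical clustering routine.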

An Adaption of Pattern Sequence-based Electricity Load Forecasting with Match Filtering

  • Chu, Fazheng;Jung, Sung-Hwan
    • Journal of Korea Multimedia Society / v.20 no.5 / pp.800-807 / 2017
  • Pattern Sequence-based Forecasting (PSF) is an approach that forecasts the behavior of a time series from similar pattern sequences. The innovation of the PSF method is to convert the load time series into a label sequence by clustering, which lightens the computational burden. However, this introduces a new problem of determining the number of clusters, and the method occasionally suffers from too few similar days. In this paper we propose an adaptation of the PSF method that introduces a new clustering index to determine the number of clusters and imposes a threshold to handle the problem of insufficient similar days. Our experiments show that the proposed method reduces the mean absolute percentage error (MAPE) by about 15% compared to the PSF method.
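
The basic PSF idea described above can be sketched as follows, assuming the daily load profiles have already been labeled by some clustering step (for example k-means). The function names and the `window` parameter are illustrative; the new clustering index and the threshold proposed in the paper are not reproduced here, and the sketch simply returns `None` when no similar pattern is found.

```python
import numpy as np

def psf_forecast(days, labels, window):
    """Average the days that followed earlier occurrences of the latest label pattern.

    days:   array of shape (n_days, samples_per_day) with daily load profiles
    labels: cluster label of each day (assumed to come from a prior clustering step)
    """
    days = np.asarray(days, dtype=float)
    labels = np.asarray(labels)
    pattern = labels[-window:]
    # Index of the day that follows each earlier match of the pattern.
    followers = [
        i + window
        for i in range(len(labels) - window)
        if np.array_equal(labels[i : i + window], pattern)
    ]
    if not followers:
        return None                       # no similar pattern sequence found
    return days[followers].mean(axis=0)

def mape(actual, forecast):
    """Mean absolute percentage error, in percent; assumes no zero actual values."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(100.0 * np.mean(np.abs(actual - forecast) / np.abs(actual)))
```

Matching on labels rather than on raw profiles is what keeps the search cheap: each comparison is over `window` integers instead of full daily curves.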