• Title/Summary/Keyword: time series & cluster analysis


Evolutionary Computation-based Hybrid Clustering Technique for Manufacturing Time Series Data (제조 시계열 데이터를 위한 진화 연산 기반의 하이브리드 클러스터링 기법)

  • Oh, Sanghoun;Ahn, Chang Wook
    • Smart Media Journal / v.10 no.3 / pp.23-30 / 2021
  • Clustering of manufacturing time series data is an important grouping technique for detecting and correcting equipment and process defects from large-scale manufacturing data, but applying existing clustering techniques designed for static data to time series data yields low accuracy. This paper presents an evolutionary computation-based time series cluster analysis approach to improve the coherence of existing clustering techniques. First, the image shape resulting from the manufacturing process is converted into one-dimensional time series data using linear scanning, and optimal sub-clusters are derived from the transformed data by hierarchical cluster analysis and split cluster analysis, both based on the Pearson distance metric. Finally, a genetic algorithm derives an optimal cluster combination with minimal similarity from the two cluster analysis results. The superiority of the proposed clustering is verified by comparing its performance with existing clustering techniques on actual manufacturing process images.
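
The two preprocessing ideas named in this abstract, linear scanning of an image into a one-dimensional series and a Pearson-based distance, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the row-major scanning order, the `1 - r` form of the Pearson distance, the function names `linear_scan` and `pearson_distance`, and the toy data are all assumptions.

```python
import math

def linear_scan(image):
    """Flatten a 2-D image (list of rows) into a 1-D series, row by row.
    Row-major order is an assumption; the abstract only says "linear scanning"."""
    return [pixel for row in image for pixel in row]

def pearson_distance(x, y):
    """Return 1 - r, where r is the Pearson correlation of x and y.
    0 means perfectly correlated, 2 means perfectly anti-correlated."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("Pearson correlation is undefined for a constant series")
    return 1.0 - cov / (sx * sy)

# Two toy 2x3 "images" (hypothetical data) with the same shape up to scaling.
img_a = [[0, 1, 2], [3, 4, 5]]
img_b = [[0, 2, 4], [6, 8, 10]]
series_a, series_b = linear_scan(img_a), linear_scan(img_b)
d_ab = pearson_distance(series_a, series_b)  # close to 0: same shape
```

Because the distance depends only on correlation, two series that differ by a positive scale factor are treated as identical in shape, which is one reason a correlation-based metric may suit shape comparison.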

A Study on the Response Plan by Station Area Cluster through Time Series Analysis of Urban Rail Riders Before and After COVID-19 (COVID-19 전후 도시철도 승차인원 시계열 군집분석을 통한 역세권 군집별 대응방안 고찰)

  • Li, Cheng Xi;Jung, Hun Young
    • KSCE Journal of Civil and Environmental Engineering Research / v.43 no.3 / pp.363-370 / 2023
  • Due to the spread of COVID-19, the use of public transportation such as urban railroads has changed significantly since the beginning of 2020. In this study, daily time series data for each urban railway station were collected for three years before and after the spread of COVID-19. The similarity between the time series was evaluated with the DTW (Dynamic Time Warping) distance to derive regression centers for each cluster, and the effect of external events such as COVID-19 on changes in the number of users was diagnosed with a time series impact detection function. In addition, the usage characteristics of each cluster of urban railway stations were analyzed, and the change in passenger volume due to external shocks was identified. The purpose was to review measures for maintaining and recovering ridership in the event of a resurgence of COVID-19.
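
For readers unfamiliar with the DTW distance mentioned in this abstract, a minimal sketch of the textbook dynamic-programming formulation follows. It uses an absolute-difference local cost and no warping-window constraint; both choices, the function name `dtw_distance`, and the ridership numbers are assumptions, since the abstract does not state the variant used.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with absolute-difference cost."""
    if not a or not b:
        raise ValueError("both series must be non-empty")
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]

# Hypothetical daily ridership at two stations; the second repeats the first day once,
# so its pattern is the same but delayed.
station_a = [10, 12, 15, 12, 10]
station_b = [10, 10, 12, 15, 12, 10]
shifted = dtw_distance(station_a, station_b)
```

Here the delayed pattern aligns perfectly, so the DTW distance is zero even though the two series have different lengths and a pointwise comparison is not defined.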

Evaluation of Combustion Mechanism of Droplet Cluster in Premixed Spray Flame by Simultaneous Time-Series Measurement (동시 시계열 계측에 의한 예혼합 분무화염 내 유적군 연소기구의 평가)

  • Hwang, Seung-Min
    • Journal of Korean Society of Environmental Engineers / v.31 no.6 / pp.442-448 / 2009
  • To evaluate the combustion mechanism of each droplet cluster downstream of the premixed spray flame, simultaneous time-series measurements were conducted using an optical measurement system consisting of laser tomography, multi-color integrated Cassegrain receiving optics (MICRO) and a phase Doppler anemometer (PDA). Furthermore, the group combustion number of each droplet cluster was estimated experimentally, and the combustion mechanism of the droplet clusters was examined by applying theoretical analysis. The group combustion number, $G_c$, was estimated experimentally for all droplet clusters verified by planar images, and each cluster was classified into the internal group combustion mode or the external group combustion mode according to the theoretical analysis. The experimentally estimated group combustion number agrees with the classification by theoretical analysis in some cases and disagrees in others. The disagreement is attributed to the fact that the group combustion number was estimated only from the geometrical arrangement of droplets in the cluster, and that the actual phenomenon is three-dimensional whereas the measurement system is two-dimensional.

Cluster Analysis of Daily Electricity Demand with t-SNE

  • Min, Yunhong
    • Journal of the Korea Society of Computer and Information / v.23 no.5 / pp.9-14 / 2018
  • For efficient management of the electricity market and power systems, accurate forecasts of electricity demand are essential. Since many factors, known and unknown, determine the realized loads, it is difficult to forecast demand from the past time series alone. In this paper we perform a cluster analysis on electricity demand data collected from Jan. 2000 to Dec. 2017. The purpose of clustering the electricity demand data is that each cluster is expected to consist of data whose latent variables have the same or similar values. If the data are properly clustered, it is then possible to develop an accurate forecasting model for each cluster separately. To validate the feasibility of this approach for building better forecasting models, we clustered the data with t-SNE. To apply t-SNE to time series data effectively, we adopt dynamic time warping as a similarity measure. The experiments show that several clusters are clearly observed and that each cluster can be interpreted as a mix of well-known factors such as trends, seasonality and holiday effects, along with other unknown factors. These findings can motivate approaches that build a forecasting model for each cluster independently.

Classifying Alley Markets through Cluster Analysis Using Dynamic Time Warping and Analyzing Possibility of Opening New Stores

  • Kang, Hyun Mo;Lee, Sang-Kyeong
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.35 no.5 / pp.329-338 / 2017
  • This study attempts to classify 1008 alley markets in Seoul through cluster analysis using Dynamic Time Warping, one of the methods used to analyze the similarity of time series, and to evaluate the possibility of opening new stores. The sequence of the gross sales of an alley market and that of gross sales per store stand for the growth potential and the profitability of the market, respectively, and are used as variables for cluster analysis. Five clusters are obtained for the gross sales and four clusters for the gross sales per store. Each of these two types of clusters is further classified as having a rising or a falling trend, and the combination of these trends produces four categories. These categories are used to evaluate the possibility of opening new stores in alley markets. The results show that the southeast, which is relatively wealthy, is inferior to other regions for opening new stores. Alley markets in the northeast and the southwest are better than those in other regions, such that opening a new store is justified. In the northwest, there are many markets in which the trend of gross sales and that of gross sales per store move in opposite directions, and new store openings in these markets should be postponed.
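
The four-category scheme described in this abstract (rising or falling gross sales combined with rising or falling gross sales per store) can be illustrated with a small sketch. How the study actually labels a cluster's trend is not stated here, so the use of a least-squares slope sign, the function names `slope` and `trend_category`, and the sales figures are assumptions made only for illustration.

```python
def slope(series):
    """Least-squares slope of a series against its time index 0..n-1."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

def trend_category(gross_sales, sales_per_store):
    """Combine the two trend directions into one of four categories.
    A slope of exactly zero is treated as falling (an arbitrary choice)."""
    total = "rising" if slope(gross_sales) > 0 else "falling"
    per_store = "rising" if slope(sales_per_store) > 0 else "falling"
    return (total, per_store)

# Hypothetical market: total sales grow while sales per store shrink,
# i.e. the two trends move in opposite directions.
category = trend_category([100, 110, 125, 140], [10, 9, 8.5, 8])
```

A market in the `("rising", "falling")` category corresponds to the opposite-direction case the abstract describes for the northwest.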

Comparison of time series clustering methods and application to power consumption pattern clustering

  • Kim, Jaehwi;Kim, Jaehee
    • Communications for Statistical Applications and Methods / v.27 no.6 / pp.589-602 / 2020
  • The development of smart grids has enabled the easy collection of a large amount of power data. Power consumption data share some common patterns, which makes clustering of consumption patterns useful when analyzing big power data. In this paper, clustering analysis based on distance functions for time series and on clustering algorithms is used to discover patterns in power consumption data. We use 10 distance measures to find clusters that reflect the characteristics of time series data. A simulation study is done to compare the distance measures for clustering. Cluster validity measures such as the error rate, similarity index, Dunn index and silhouette values are also calculated and compared. Real power consumption data are then clustered with the five distance measures that performed better than the others in the simulation.
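
One of the validity measures listed in this abstract, the Dunn index, can be computed directly from a pairwise distance matrix, so it works with any time series distance. The sketch below uses the common definition (smallest between-cluster distance divided by largest within-cluster diameter); the function name `dunn_index` and the toy points are assumptions, and the paper may use a different variant.

```python
def dunn_index(dist, labels):
    """Dunn index = smallest between-cluster distance / largest within-cluster diameter.
    `dist` is a symmetric matrix of pairwise distances; `labels[i]` is the cluster of item i."""
    n = len(labels)
    min_between = float("inf")
    max_within = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] == labels[j]:
                max_within = max(max_within, dist[i][j])
            else:
                min_between = min(min_between, dist[i][j])
    if min_between == float("inf"):
        raise ValueError("need at least two clusters")
    if max_within == 0.0:
        raise ValueError("all clusters have zero diameter; index is undefined")
    return min_between / max_within

# Hypothetical 1-D items; any time series distance matrix could be used instead.
points = [0.0, 1.0, 10.0, 12.0]
dist = [[abs(p - q) for q in points] for p in points]
score = dunn_index(dist, [0, 0, 1, 1])  # well-separated, compact clusters
```

Higher values indicate compact, well-separated clusters, so the index can be compared across distance measures only when the clusterings are evaluated on a common distance matrix.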

Exploring COVID-19 in mainland China during the lockdown of Wuhan via functional data analysis

  • Li, Xing;Zhang, Panpan;Feng, Qunqiang
    • Communications for Statistical Applications and Methods / v.29 no.1 / pp.103-125 / 2022
  • In this paper, we analyze the time series data of the case and death counts of COVID-19, which broke out in China in December 2019. The study period is the lockdown of Wuhan. We exploit functional data analysis methods to analyze the collected time series data. The analysis is divided into three parts. First, functional principal component analysis is conducted to investigate the modes of variation. Second, we carry out functional canonical correlation analysis to explore the relationship between confirmed and death cases. Finally, we utilize a clustering method based on the Expectation-Maximization (EM) algorithm to run the cluster analysis on the counts of confirmed cases, where the number of clusters is determined via a cross-validation approach. In addition, we compare the clustering results with publicly available migration data.

Nonparametric clustering of functional time series electricity consumption data (전기 사용량 시계열 함수 데이터에 대한 비모수적 군집화)

  • Kim, Jaehee
    • The Korean Journal of Applied Statistics / v.32 no.1 / pp.149-160 / 2019
  • The electricity consumption time series data of 'A' University from July 2016 to June 2017 are analyzed via nonparametric functional data clustering, since the time series data can be regarded as realizations of continuous functions with a dependency structure. We use the method of Bouveyron and Jacques (Advances in Data Analysis and Classification, 5, 4, 281-300, 2011), a model-based functional clustering approach with an FEM algorithm that assumes a Gaussian distribution on the functional principal components. A clusterwise analysis is provided with cluster mean functions, densities and cluster profiles.

Nonlinear damage detection using linear ARMA models with classification algorithms

  • Chen, Liujie;Yu, Ling;Fu, Jiyang;Ng, Ching-Tai
    • Smart Structures and Systems / v.26 no.1 / pp.23-33 / 2020
  • The majority of damage in engineering structures is nonlinear. Damage sensitive features (DSFs) extracted by traditional methods from linear time series models cannot effectively handle the nonlinearity induced by structural damage. A new DSF is proposed based on vector space cosine similarity (VSCS), which is combined with K-means cluster analysis and Bayesian discrimination to detect nonlinear structural damage. A reference autoregressive moving average (ARMA) model is built from measured acceleration data. This study first considers an existing DSF, the residual standard deviation (RSD). The DSF is further advanced using the VSCS, and the advanced VSCS is then classified using K-means cluster analysis and Bayes discriminant analysis, respectively. The performance of the proposed approach is verified using experimental data from a three-story shear building structure and compared with the results of the existing RSD. It is demonstrated that combining the linear ARMA model and the advanced VSCS with cluster analysis and Bayes discriminant analysis, respectively, is an effective approach for detecting nonlinear damage. This approach improves the reliability and accuracy of nonlinear damage detection using the linear model and significantly reduces the computational cost. The results indicate that the proposed approach has the potential to be a promising damage detection technique.
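
The two quantities this abstract builds on, a residual standard deviation and a cosine similarity between vectors, are simple to compute once a model has been fitted. The sketch below shows only those two building blocks; it does not fit an ARMA model, and how the paper constructs its VSCS feature from the model is not given here. The function name `cosine_similarity`, the use of the sample standard deviation, and all numbers are assumptions.

```python
import math
import statistics

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (1 = same direction)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

# Hypothetical feature vectors from a reference state and an observed state.
baseline = [0.8, -0.3, 0.1]
observed = [0.4, -0.15, 0.05]  # same direction, half the magnitude
sim_same = cosine_similarity(baseline, observed)
sim_orth = cosine_similarity([1.0, 0.0], [0.0, 1.0])

# Hypothetical one-step prediction residuals of a reference model on new data.
# The sample standard deviation is used here; the paper's convention is not stated.
rsd = statistics.stdev([0.1, -0.1, 0.2, -0.2])
```

Cosine similarity ignores magnitude and responds only to a change in direction, whereas the residual standard deviation responds to a change in the size of the prediction error, so the two features capture different aspects of a deviation from the reference state.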

Real Estate Price Forecasting by Exploiting the Regional Analysis Based on SOM and LSTM (SOM과 LSTM을 활용한 지역기반의 부동산 가격 예측)

  • Shin, Eun Kyung;Kim, Eun Mi;Hong, Tae Ho
    • The Journal of Information Systems / v.30 no.2 / pp.147-163 / 2021
  • Purpose: The study aims to predict real estate prices by utilizing regional characteristics. Since real estate is immobile, the characteristics of a region have a great influence on its price. In addition, real estate prices are closely related to economic development and are a major concern for policy makers and investors. Accurate house price forecasting is necessary to prepare for the impact of house price fluctuations. To improve the performance of our predictive models, we applied LSTM, a widely used deep learning technique for predicting time series data. Design/methodology/approach: This study used time series data on real estate prices provided by the Ministry of Land, Infrastructure and Transport. For time series preprocessing, HP filters were applied to decompose trends, and SOM was used to cluster regions with similar price directions. To build a real estate price prediction model, SVR and LSTM were applied, and the prices of regions classified into similar clusters by SOM were used as input variables. Findings: The clustering results showed that regions in the same cluster were geographically close, and it was possible to confirm the characteristics of being classified as the same cluster even if there was a price level and a similar industry group. When predicting real estate prices 1, 2, and 3 months ahead, LSTM showed better predictive performance than SVR, and LSTM performed better in long-term forecasting 3 months ahead than in short-term forecasting 1 month ahead.