• Title/Summary/Keyword: piecewise aggregate approximation

Search Result 5, Processing Time 0.021 seconds

Time series representation for clustering using unbalanced Haar wavelet transformation (불균형 Haar 웨이블릿 변환을 이용한 군집화를 위한 시계열 표현)

  • Lee, Sehun;Baek, Changryong
    • The Korean Journal of Applied Statistics
    • /
    • v.31 no.6
    • /
    • pp.707-719
    • /
    • 2018
  • Various time series representation methods have been proposed for efficient time series clustering and classification. Lin et al. (DMKD, 15, 107-144, 2007) proposed a symbolic aggregate approximation (SAX) method based on symbolic representations after approximating the original time series using piecewise local mean. The performance of SAX therefore depends heavily on how well the piecewise local averages approximate original time series features. SAX equally divides the entire series into an arbitrary number of segments; however, it is not sufficient to capture key features from complex, large-scale time series data. Therefore, this paper considers data-adaptive local constant approximation of the time series using the unbalanced Haar wavelet transformation. The proposed method is shown to outperforms SAX in many real-world data applications.

Efficient Time-Series Subsequence Matching Using MBR-Safe Property of Piecewise Aggregation Approximation (부분 집계 근사법의 MBR-안전 성질을 이용한 효율적인 시계열 서브시퀀스 매칭)

  • Moon, Yang-Sae
    • Journal of KIISE:Databases
    • /
    • v.34 no.6
    • /
    • pp.503-517
    • /
    • 2007
  • In this paper we address the MBR-safe property of Piecewise Aggregation Approximation(PAA), and propose an of efficient subsequence matching method based on the MBR-safe PAA. A transformation is said to be MBR-safe if a low-dimensional MBR to which a high- dimensional MBR is transformed by the transformation contains every individual low-dimensional sequence to which a high-dimensional sequence is transformed. Using an MBR-safe transformation we can reduce the number of lower-dimensional transformations required in similar sequence matching, since it transforms a high-dimensional MBR itself to a low-dimensional MBR directly. Furthermore, PAA is known as an excellent lower-dimensional transformation single its computation is very simple, and its performance is superior to other transformations. Thus, to integrate these advantages of PAA and MBR-safeness, we first formally confirm the MBR-safe property of PAA, and then improve subsequence matching performance using the MBR-safe PAA. Contributions of the paper can be summarized as follows. First, we propose a PAA-based MBR-safe transformation, called mbrPAA, and formally prove the MBR-safeness of mbrPAA. Second, we propose an mbrPAA-based subsequence matching method, and formally prove its correctness of the proposed method. Third, we present the notion of entry reuse property, and by using the property, we propose an efficient method of constructing high-dimensional MBRs in subsequence matching. Fourth, we show the superiority of mbrPAA through extensive experiments. Experimental results show that, compared with the previous approach, our mbrPAA is 24.2 times faster in the low-dimensional MBR construction and improves subsequence matching performance by up to 65.9%.

Clustering Algorithm for Time Series with Similar Shapes

  • Ahn, Jungyu;Lee, Ju-Hong
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.7
    • /
    • pp.3112-3127
    • /
    • 2018
  • Since time series clustering is performed without prior information, it is used for exploratory data analysis. In particular, clusters of time series with similar shapes can be used in various fields, such as business, medicine, finance, and communications. However, existing time series clustering algorithms have a problem in that time series with different shapes are included in the clusters. The reason for such a problem is that the existing algorithms do not consider the limitations on the size of the generated clusters, and use a dimension reduction method in which the information loss is large. In this paper, we propose a method to alleviate the disadvantages of existing methods and to find a better quality of cluster containing similarly shaped time series. In the data preprocessing step, we normalize the time series using z-transformation. Then, we use piecewise aggregate approximation (PAA) to reduce the dimension of the time series. In the clustering step, we use density-based spatial clustering of applications with noise (DBSCAN) to create a precluster. We then use a modified K-means algorithm to refine the preclusters containing differently shaped time series into subclusters containing only similarly shaped time series. In our experiments, our method showed better results than the existing method.

A Study on the Air Pollution Monitoring Network Algorithm Using Deep Learning (심층신경망 모델을 이용한 대기오염망 자료확정 알고리즘 연구)

  • Lee, Seon-Woo;Yang, Ho-Jun;Lee, Mun-Hyung;Choi, Jung-Moo;Yun, Se-Hwan;Kwon, Jang-Woo;Park, Ji-Hoon;Jung, Dong-Hee;Shin, Hye-Jung
    • Journal of Convergence for Information Technology
    • /
    • v.11 no.11
    • /
    • pp.57-65
    • /
    • 2021
  • We propose a novel method to detect abnormal data of specific symptoms using deep learning in air pollution measurement system. Existing methods generally detect abnomal data by classifying data showing unusual patterns different from the existing time series data. However, these approaches have limitations in detecting specific symptoms. In this paper, we use DeepLab V3+ model mainly used for foreground segmentation of images, whose structure has been changed to handle one-dimensional data. Instead of images, the model receives time-series data from multiple sensors and can detect data showing specific symptoms. In addition, we improve model's performance by reducing the complexity of noisy form time series data by using 'piecewise aggregation approximation'. Through the experimental results, it can be confirmed that anomaly data detection can be performed successfully.

Noise Averaging Effect on Privacy-Preserving Clustering of Time-Series Data (시계열 데이터의 프라이버시 보호 클러스터링에서 노이즈 평준화 효과)

  • Moon, Yang-Sae;Kim, Hea-Suk
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.16 no.3
    • /
    • pp.356-360
    • /
    • 2010
  • Recently, there have been many research efforts on privacy-preserving data mining. In privacy-preserving data mining, accuracy preservation of mining results is as important as privacy preservation. Random perturbation privacy-preserving data mining technique is known to well preserve privacy. However, it has a problem that it destroys distance orders among time-series. In this paper, we propose a notion of the noise averaging effect of piecewise aggregate approximation(PAA), which can be preserved the clustering accuracy as high as possible in time-series data clustering. Based on the noise averaging effect, we define the PAA distance in computing distance. And, we show that our PAA distance can alleviate the problem of destroying distance orders in random perturbing time series.