• Title/Abstract/Keywords: time series data

Search results: 3,624 items (processing time: 0.031 s)

Fused Fuzzy Logic System for Corrupted Time Series Data Analysis

  • 김동원
    • 사물인터넷융복합논문지 / Vol. 4, No. 1 / pp. 1-5 / 2018
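  • This paper addresses the modeling of time series data corrupted by noise. A non-singleton fuzzy system is used as the modeling technique. The main feature of a non-singleton fuzzy system is that the inputs to an unknown nonlinear system are modeled as fuzzy values. It is therefore very useful when the training data or input data fed to the fuzzy system have been altered by noise or by the external environment. For performance comparison, the well-known Mackey-Glass benchmark data are used. By comparing and analyzing the modeling results on these data, the paper shows that the non-singleton fuzzy system is more robust to noise and more efficient.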
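The benchmark mentioned in this abstract can be reproduced in a few lines. The following is a minimal sketch, not the paper's setup: it assumes the commonly used Mackey-Glass parameters (tau = 17, beta = 0.2, gamma = 0.1, exponent 10), a unit-step Euler discretisation, a constant initial history of 1.2, and additive Gaussian noise as the corruption model. The function names `mackey_glass` and `corrupt` are hypothetical.

```python
import random


def mackey_glass(n, tau=17, beta=0.2, gamma=0.1, power=10, x0=1.2):
    """Return n samples of a unit-step Euler discretisation of the
    Mackey-Glass delay equation dx/dt = beta*x(t-tau)/(1+x(t-tau)^power) - gamma*x(t)."""
    history = [x0] * (tau + 1)  # constant initial history (an assumption)
    for _ in range(n):
        x_now, x_lag = history[-1], history[-1 - tau]
        history.append(x_now + beta * x_lag / (1.0 + x_lag ** power) - gamma * x_now)
    return history[tau + 1:]


def corrupt(series, sigma=0.05, seed=0):
    """Add zero-mean Gaussian noise; sigma is an illustrative choice."""
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, sigma) for x in series]


clean = mackey_glass(500)
noisy = corrupt(clean)
```

A model trained on `noisy` and evaluated against `clean` gives one simple way to compare noise robustness between modeling techniques.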

A Biclustering Method for Time Series Analysis

  • Lee, Jeong-Hwa;Lee, Young-Rok;Jun, Chi-Hyuck
    • Industrial Engineering and Management Systems / Vol. 9, No. 2 / pp. 131-140 / 2010
  • Biclustering is a method of finding meaningful subsets of objects and attributes simultaneously, which may not be detected by traditional clustering methods. It is popularly used for the analysis of microarray data representing the expression levels of genes by conditions. Usually, biclustering algorithms do not consider a sequential relation between attributes. For time series data, however, bicluster solutions should keep the time sequence. This paper proposes a new biclustering algorithm for time series data by modifying the plaid model. The proposed algorithm introduces a parameter controlling an interval between two selected time points. Also, the pruning step preventing an over-fitting problem is modified so as to eliminate only starting or ending points. Results from artificial data sets show that the proposed method is more suitable for the extraction of biclusters from time series data sets. Moreover, by using the proposed method, we find some interesting observations from real-world time-course microarray data sets and apartment price data sets in metropolitan areas.

The use of linear stochastic estimation for the reduction of data in the NIST aerodynamic database

  • Chen, Y.;Kopp, G.A.;Surry, D.
    • Wind and Structures / Vol. 6, No. 2 / pp. 107-126 / 2003
  • This paper describes a simple and practical approach through the application of Linear Stochastic Estimation (LSE) to reconstruct wind-induced pressure time series from the covariance matrix for structural load analyses on a low building roof. The main application of this work would be the reduction of the data storage requirements for the NIST aerodynamic database. The approach is based on the assumption that a random pressure field can be estimated as a linear combination of some other known pressure time series by truncating nonlinear terms of a Taylor series expansion. Covariances between pressure time series to be simulated and reference time series are used to calculate the estimation coefficients. The performance using different LSE schemes with selected reference time series is demonstrated by the reconstruction of structural load time series in a corner bay for three typical wind directions. It is shown that LSE can simulate structural load time series accurately, given a handful of reference pressure taps (or even a single tap). The performance of LSE depends on the choice of the reference time series, which should be determined by considering the balance between the accuracy, data-storage requirements and the complexity of the approach. The approach should only be used for the determination of structural loads, since individual reconstructed pressure time series (for local load analyses) will have larger errors associated with them.
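The core idea of estimating one pressure series as a linear combination of reference series can be illustrated with a single reference tap. This is a sketch under simplifying assumptions, not the authors' implementation: with one reference, the estimation coefficient reduces to cov(target, reference) / var(reference), applied to mean-removed fluctuations. The function names are hypothetical.

```python
from statistics import fmean


def lse_coefficient(target, reference):
    """Single-reference LSE coefficient: cov(target, reference) / var(reference)."""
    mt, mr = fmean(target), fmean(reference)
    cov = sum((t - mt) * (r - mr) for t, r in zip(target, reference))
    var = sum((r - mr) ** 2 for r in reference)
    return cov / var


def lse_reconstruct(reference, coeff, target_mean):
    """Rebuild the target from the stored reference series, one coefficient and one mean."""
    mr = fmean(reference)
    return [target_mean + coeff * (r - mr) for r in reference]


reference = [1.0, 2.0, 4.0, 3.0]
target = [3.0, 5.0, 9.0, 7.0]  # exactly 2 * reference + 1, for illustration only
coeff = lse_coefficient(target, reference)
rebuilt = lse_reconstruct(reference, coeff, fmean(target))
```

Only the reference series, the coefficient, and the target mean need to be stored, which is where the storage saving comes from. With several reference taps the scalar division becomes a linear solve against the reference covariance matrix.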

Clustering Algorithm for Time Series with Similar Shapes

  • Ahn, Jungyu;Lee, Ju-Hong
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 12, No. 7 / pp. 3112-3127 / 2018
  • Since time series clustering is performed without prior information, it is used for exploratory data analysis. In particular, clusters of time series with similar shapes can be used in various fields, such as business, medicine, finance, and communications. However, existing time series clustering algorithms have a problem in that time series with different shapes are included in the clusters. The reason for such a problem is that the existing algorithms do not consider the limitations on the size of the generated clusters, and use a dimension reduction method in which the information loss is large. In this paper, we propose a method to alleviate the disadvantages of existing methods and to find a better quality of cluster containing similarly shaped time series. In the data preprocessing step, we normalize the time series using z-transformation. Then, we use piecewise aggregate approximation (PAA) to reduce the dimension of the time series. In the clustering step, we use density-based spatial clustering of applications with noise (DBSCAN) to create a precluster. We then use a modified K-means algorithm to refine the preclusters containing differently shaped time series into subclusters containing only similarly shaped time series. In our experiments, our method showed better results than the existing method.
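The preprocessing steps named in this abstract, z-normalization followed by piecewise aggregate approximation (PAA), can be sketched as follows. This is a minimal illustration rather than the paper's code: it assumes the population standard deviation for normalization and, for simplicity, a series length that divides evenly into the requested number of segments. The DBSCAN and modified K-means stages are not shown.

```python
from statistics import fmean, pstdev


def z_normalize(series):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sd = fmean(series), pstdev(series)
    if sd == 0:
        return [0.0] * len(series)  # a constant series carries no shape information
    return [(x - mu) / sd for x in series]


def paa(series, segments):
    """Piecewise aggregate approximation: the mean of each equal-width segment."""
    n = len(series)
    if n % segments != 0:
        raise ValueError("this sketch assumes len(series) is a multiple of segments")
    width = n // segments
    return [fmean(series[i:i + width]) for i in range(0, n, width)]


raw = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
reduced = paa(raw, 3)
normalized = z_normalize(raw)
```

Normalizing before reducing means that series differing only in offset and scale map to the same low-dimensional representation, which is what shape-based clustering needs.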

Extending the Scope of Automatic Time Series Model Selection: The Package autots for R

  • Jang, Dong-Ik;Oh, Hee-Seok;Kim, Dong-Hoh
    • Communications for Statistical Applications and Methods / Vol. 18, No. 3 / pp. 319-331 / 2011
  • In this paper, we propose automatic procedures for the model selection of various univariate time series data. Automatic model selection is important, especially in data mining with a large number of time series, for example, the number (in thousands) of signals accessing a web server during a specific time period. Several methods have been proposed for automatic model selection of time series. However, most existing methods focus on linear time series models such as exponential smoothing and autoregressive integrated moving average (ARIMA) models. The key feature that distinguishes the proposed procedures from previous approaches is that they can be used for both linear time series models and nonlinear time series models, such as threshold autoregressive (TAR) models and autoregressive moving average-generalized autoregressive conditional heteroscedasticity (ARMA-GARCH) models. The proposed methods select a model from among the candidate models on the basis of prediction error. We also provide an R package, autots, that implements the proposed automatic model selection procedures. We illustrate these algorithms with artificial and real data, and describe the implementation of the autots package for R.

Effects of Overdispersion on Testing for Serial Dependence in the Time Series of Counts Data

  • Kim, Hee-Young;Park, You-Sung
    • Communications for Statistical Applications and Methods / Vol. 17, No. 6 / pp. 829-843 / 2010
  • To test for serial dependence in time series of count data, Jung and Tremayne (2003) evaluated the size and power of several tests under the class of INARMA models based on binomial thinning operations for Poisson marginal distributions. The overdispersion phenomenon (i.e., a variance greater than the expectation) is common in the real world. Overdispersed count data can be modeled by using alternative thinning operations such as random coefficient thinning, iterated thinning, and quasi-binomial thinning. Such thinning operations can lead to time series models of counts with negative binomial or generalized Poisson marginal distributions. This paper examines whether the test statistics used by Jung and Tremayne (2003) for serial dependence in time series of count data are affected by overdispersion.
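To make the binomial thinning operation and the definition of overdispersion concrete, here is a minimal sketch of the simplest member of the INARMA class, an INAR(1) process with Poisson innovations, together with a variance-to-mean dispersion index. It is an illustration under stated assumptions, not the paper's simulation design: the parameter values, the function names, and the use of Knuth's multiplication method for Poisson sampling are all choices made here.

```python
import math
import random
from statistics import fmean, pvariance


def poisson(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's multiplication method (small lam only)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def binomial_thinning(rng, alpha, count):
    """alpha o count: each of the `count` units survives independently with probability alpha."""
    return sum(1 for _ in range(count) if rng.random() < alpha)


def simulate_inar1(n, alpha=0.5, lam=2.0, seed=0):
    """X_t = alpha o X_{t-1} + e_t with e_t ~ Poisson(lam)."""
    rng = random.Random(seed)
    x = poisson(rng, lam / (1.0 - alpha))  # start from the stationary Poisson marginal
    out = []
    for _ in range(n):
        x = binomial_thinning(rng, alpha, x) + poisson(rng, lam)
        out.append(x)
    return out


def dispersion_index(counts):
    """Variance-to-mean ratio; values above 1 indicate overdispersion."""
    return pvariance(counts) / fmean(counts)


counts = simulate_inar1(200)
index = dispersion_index(counts)
```

With Poisson innovations the marginal distribution is Poisson, so the index should be near 1 for long series; the alternative thinning operations listed in the abstract are what push it above 1.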

Fuzzy Logic-based Modeling of a Score

  • 손세호;권순학
    • 한국지능시스템학회논문지 / Vol. 11, No. 3 / pp. 264-269 / 2001
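  • This paper interprets a musical score as a time series and addresses its modeling using fuzzy logic. The musical symbols in a score convey a great deal of information, such as the duration and pitch of notes. Using melody, pitch, and timbre, the visual information in the score is converted into time series data. To extract the features of this time series, a sliding window is passed over it, transforming it once more into a new time series. The transformed time series is analyzed with the Box-Jenkins time series analysis method, and a fuzzy model is constructed from the characteristics identified.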

Forecasting Baltic Dry Index by Implementing Time-Series Decomposition and Data Augmentation Techniques

  • 한민수;유성진
    • 품질경영학회지 / Vol. 50, No. 4 / pp. 701-716 / 2022
  • Purpose: This study aims to predict the dry cargo transportation market economy. The subject of this study is the BDI (Baltic Dry Index) time series, an index representing the dry cargo transport market. Methods: To increase the prediction accuracy for the BDI time series, we pre-processed the original time series with time-series decomposition and data augmentation techniques and used the results for ANN learning. The ANN algorithms used are Multi-Layer Perceptron (MLP), Recurrent Neural Network (RNN), and Long Short-Term Memory (LSTM); for each, we compare and analyze learning and prediction with time-series decomposition and data augmentation applied. The forecasts are short-term predictions at time t+1. The period studied is from '22. 01. 07 to '22. 08. 26. Results: Only on the MAPE (Mean Absolute Percentage Error) indicator did all ANN models used in the research achieve higher accuracy (1.422% on average) in multivariate prediction. Although this is not a remarkable improvement over the univariate prediction results, the improvement in ANN prediction performance can be attributed to the time-series decomposition and data augmentation techniques that were the focus of this study. Conclusion: Given the nature of ANNs, further performance improvements can be expected from hyperparameter tuning. It is therefore necessary to try various applications of multiple learning algorithms and ANN optimization techniques. Such an approach would help solve problems where only a small amount of data is available, as in a rapidly changing business environment or the current shipping market.
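For reference, the MAPE indicator used in these results is the mean of the absolute errors expressed as a percentage of the actual values. The helper below is a minimal sketch with a hypothetical function name; the sample numbers are purely illustrative and are not BDI data.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent. Actual values must be non-zero."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Illustrative values only: each forecast is off by 10% of the actual value.
score = mape([100.0, 200.0], [110.0, 180.0])
```

Because each error is scaled by the actual value, MAPE is comparable across periods in which the index level differs, but it is undefined when an actual value is zero.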

Time Series Data Cleaning Method Based on Optimized ELM Prediction Constraints

  • Guohui Ding;Yueyi Zhu;Chenyang Li;Jinwei Wang;Ru Wei;Zhaoyu Liu
    • Journal of Information Processing Systems / Vol. 19, No. 2 / pp. 149-163 / 2023
  • Errors caused by external factors are common in time series data collected by sensors. The traditional method of cleaning errors by constraining the speed change rate can perform well. However, because its constraint rules are fixed, it is limited to data whose speed of change is stable. In practice, data with an uneven speed of change is common. To solve this problem, this paper proposes an online cleaning algorithm for time series data based on dynamic speed change rate constraints. Since time series data usually changes periodically, we use an extreme learning machine to learn the pattern of speed changes from past data and to predict time-varying speed ranges, which are then used to detect erroneous data. To realize online data repair, a dual-window mechanism is proposed to transform the global optimum into a local optimum, and the traditional minimum change principle and median theorem are applied in selecting the repair strategy. For the problem that repair based on the minimum change principle cannot correct consecutive abnormal points, quantitative analysis suggests that the repair strategy should be the boundary of the repair candidate set. Experimental results on the dataset show that the proposed method achieves a better repair effect.
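The fixed speed constraint that the abstract describes as the traditional baseline can be sketched in a few lines. This is an illustration of that baseline only, not the proposed ELM-based method: it assumes the first point is trustworthy and measures each point's speed against the most recent point that passed the check. The function name and the bounds used are hypothetical.

```python
def speed_violations(times, values, s_min, s_max):
    """Return indices whose speed relative to the last accepted point
    falls outside the fixed range [s_min, s_max]."""
    flagged = []
    last = 0  # index of the last accepted point; the first point is assumed clean
    for i in range(1, len(values)):
        speed = (values[i] - values[last]) / (times[i] - times[last])
        if s_min <= speed <= s_max:
            last = i
        else:
            flagged.append(i)
    return flagged


times = [0, 1, 2, 3, 4, 5]
values = [1, 2, 3, 50, 5, 6]  # the spike at index 3 is an injected error
bad = speed_violations(times, values, s_min=-2, s_max=2)
```

Because `s_min` and `s_max` never change, a series whose normal rate of change varies over time will either trigger false alarms or miss real errors, which is the limitation the dynamic constraints are meant to address.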

Convolutional Neural Network and Data Mutation for Time Series Pattern Recognition

  • 안명호;류미현
    • 한국정보통신학회:학술대회논문집 / 한국정보통신학회 2016 Spring Conference / pp. 727-730 / 2016
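  • TSC (Time Series Classification) classifies time series data according to their patterns. Because time series are a very common form of data and are widely applicable, TSC has long been a major topic in data mining and machine learning. Traditional approaches relied heavily on distance-based and dictionary-based methods, but their classification accuracy was limited by time-scale and random-noise problems. This paper presents a method that improves accuracy by using a deep learning CNN (Convolutional Neural Network) together with data mutation. The CNN is a neural network model already proven in the image domain and can be used effectively to recognize the features that characterize time series data. Data mutation creates variants of a single data item in several ways, supplying data from which the CNN can also learn the possible deformations of a given pattern. The proposed method achieves higher accuracy than existing methods.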