• Title/Summary/Keyword: Time-Series data

Fused Fuzzy Logic System for Corrupted Time Series Data Analysis (훼손된 시계열 데이터 분석을 위한 퍼지 시스템 융합 연구)

  • Kim, Dong Won
    • Journal of Internet of Things and Convergence / v.4 no.1 / pp.1-5 / 2018
  • This paper is concerned with the modeling and identification of time series data corrupted by noise. A nonsingleton fuzzy logic system (NFLS) is employed to model the corrupted time series. The main characteristic of the NFLS is that its inputs are modeled as fuzzy numbers, so it is especially useful when the available training data or the input data to the fuzzy logic system are corrupted by noise. Simulation results on the Mackey-Glass time series are presented to show the performance of the modeling methods. The results indicate that the NFLS models noisy time series data much better than a traditional Mamdani FLS.
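
To illustrate why modeling an input as a fuzzy number helps with noisy data, the sketch below computes the firing level of a single rule antecedent. It assumes Gaussian membership functions and product inference, neither of which is stated in the abstract; the function name and the numeric values are hypothetical. Under those assumptions the supremum of the product of the input fuzzy number and the antecedent has a closed form in which the two spreads add in quadrature.

```python
import math


def gaussian_firing_level(x, m, sigma_f, sigma_x=0.0):
    """Firing level of one Gaussian antecedent under product inference.

    x        measured (possibly noisy) input
    m        centre of the antecedent membership function
    sigma_f  spread of the antecedent membership function (must be > 0)
    sigma_x  spread of the input fuzzy number; 0 gives a singleton input
    """
    # Closed form of sup over xi of
    #   exp(-(xi - x)^2 / (2 sigma_x^2)) * exp(-(xi - m)^2 / (2 sigma_f^2)),
    # assuming Gaussian shapes for both functions (an assumption of this sketch).
    return math.exp(-((x - m) ** 2) / (2.0 * (sigma_f ** 2 + sigma_x ** 2)))


# Hypothetical values: the same noisy reading seen as a singleton and as a fuzzy number.
singleton = gaussian_firing_level(1.4, m=1.0, sigma_f=0.2)
nonsingleton = gaussian_firing_level(1.4, m=1.0, sigma_f=0.2, sigma_x=0.2)
```

With a nonzero input spread the rule fires more strongly for a reading that noise has pushed away from the antecedent centre, which is one way to read the robustness claimed for the NFLS.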

A Biclustering Method for Time Series Analysis

  • Lee, Jeong-Hwa;Lee, Young-Rok;Jun, Chi-Hyuck
    • Industrial Engineering and Management Systems / v.9 no.2 / pp.131-140 / 2010
  • Biclustering is a method of simultaneously finding meaningful subsets of objects and attributes that traditional clustering methods may not detect. It is widely used for the analysis of microarray data representing the expression levels of genes under different conditions. Biclustering algorithms usually do not consider a sequential relation between attributes. For time series data, however, bicluster solutions should preserve the time sequence. This paper proposes a new biclustering algorithm for time series data by modifying the plaid model. The proposed algorithm introduces a parameter controlling the interval between two selected time points. In addition, the pruning step that prevents over-fitting is modified so that it eliminates only starting or ending points. Results from artificial data sets show that the proposed method is more suitable for extracting biclusters from time series data sets. Moreover, using the proposed method, we report some interesting observations from real-world time-course microarray data sets and apartment price data sets in metropolitan areas.

The use of linear stochastic estimation for the reduction of data in the NIST aerodynamic database

  • Chen, Y.;Kopp, G.A.;Surry, D.
    • Wind and Structures / v.6 no.2 / pp.107-126 / 2003
  • This paper describes a simple and practical approach through the application of Linear Stochastic Estimation (LSE) to reconstruct wind-induced pressure time series from the covariance matrix for structural load analyses on a low building roof. The main application of this work would be the reduction of the data storage requirements for the NIST aerodynamic database. The approach is based on the assumption that a random pressure field can be estimated as a linear combination of some other known pressure time series by truncating nonlinear terms of a Taylor series expansion. Covariances between pressure time series to be simulated and reference time series are used to calculate the estimation coefficients. The performance using different LSE schemes with selected reference time series is demonstrated by the reconstruction of structural load time series in a corner bay for three typical wind directions. It is shown that LSE can simulate structural load time series accurately, given a handful of reference pressure taps (or even a single tap). The performance of LSE depends on the choice of the reference time series, which should be determined by considering the balance between the accuracy, data-storage requirements and the complexity of the approach. The approach should only be used for the determination of structural loads, since individual reconstructed pressure time series (for local load analyses) will have larger errors associated with them.
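
The covariance-based estimation described above can be sketched for the simplest case of a single reference series. This is an illustrative least-squares version, not the authors' implementation: the function names and the toy numbers are hypothetical, and with several reference taps the single division would become a linear system built from the covariance matrix.

```python
def lse_coefficient(reference, target):
    """Estimation coefficient a such that target' ~= a * reference',
    where primes denote fluctuations about the mean."""
    n = len(reference)
    ref_mean = sum(reference) / n
    tgt_mean = sum(target) / n
    # Covariance between the series to be simulated and the reference series.
    cov = sum((r - ref_mean) * (t - tgt_mean)
              for r, t in zip(reference, target)) / n
    # Variance of the reference series.
    var = sum((r - ref_mean) ** 2 for r in reference) / n
    return cov / var


def lse_reconstruct(reference, coefficient, target_mean):
    """Rebuild the target series from the stored reference, coefficient and mean."""
    ref_mean = sum(reference) / len(reference)
    return [target_mean + coefficient * (r - ref_mean) for r in reference]


# Toy data (hypothetical): the target is an exact linear function of the reference.
reference = [0.0, 1.0, 2.0, 3.0]
target = [1.0, 3.0, 5.0, 7.0]
a = lse_coefficient(reference, target)
rebuilt = lse_reconstruct(reference, a, sum(target) / len(target))
```

Only the reference series, one coefficient and one mean need to be stored per reconstructed series, which is where the storage saving comes from in this simplified picture.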

Clustering Algorithm for Time Series with Similar Shapes

  • Ahn, Jungyu;Lee, Ju-Hong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.7 / pp.3112-3127 / 2018
  • Since time series clustering is performed without prior information, it is used for exploratory data analysis. In particular, clusters of time series with similar shapes can be used in various fields, such as business, medicine, finance, and communications. However, existing time series clustering algorithms tend to place time series with different shapes in the same cluster. This happens because the existing algorithms do not limit the size of the generated clusters and use dimension reduction methods that lose a large amount of information. In this paper, we propose a method that alleviates these disadvantages and finds higher-quality clusters containing similarly shaped time series. In the data preprocessing step, we normalize the time series using z-transformation. Then, we use piecewise aggregate approximation (PAA) to reduce the dimension of the time series. In the clustering step, we use density-based spatial clustering of applications with noise (DBSCAN) to create preclusters. We then use a modified K-means algorithm to refine the preclusters containing differently shaped time series into subclusters containing only similarly shaped time series. In our experiments, our method showed better results than the existing method.
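
The two preprocessing steps named above, z-normalization followed by PAA, can be sketched as follows. This is a minimal standard-library version; the function names are hypothetical, and the sketch assumes the series length is a multiple of the number of segments, which the abstract does not discuss.

```python
from statistics import mean, pstdev


def z_normalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A constant series carries no shape information; map it to zeros.
        return [0.0 for _ in series]
    return [(v - mu) / sigma for v in series]


def paa(series, n_segments):
    """Piecewise aggregate approximation: the mean of each equal-width segment."""
    if len(series) % n_segments != 0:
        raise ValueError("this sketch requires len(series) to be a multiple of n_segments")
    width = len(series) // n_segments
    return [mean(series[i:i + width]) for i in range(0, len(series), width)]


# Hypothetical six-point series reduced to three dimensions.
reduced = paa([1, 2, 3, 4, 5, 6], 3)
normalized = z_normalize([1, 2, 3])
```

The reduced, normalized vectors would then be the input to the density-based and K-means steps described in the abstract.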

Extending the Scope of Automatic Time Series Model Selection: The Package autots for R

  • Jang, Dong-Ik;Oh, Hee-Seok;Kim, Dong-Hoh
    • Communications for Statistical Applications and Methods / v.18 no.3 / pp.319-331 / 2011
  • In this paper, we propose automatic procedures for the model selection of various univariate time series data. Automatic model selection is important, especially in data mining with a large number of time series, for example, the thousands of signals accessing a web server during a specific time period. Several methods have been proposed for automatic model selection of time series. However, most existing methods focus on linear time series models such as exponential smoothing and autoregressive integrated moving average (ARIMA) models. The key feature that distinguishes the proposed procedures from previous approaches is that they can be used for both linear time series models and nonlinear time series models such as threshold autoregressive (TAR) models and autoregressive moving average-generalized autoregressive conditional heteroscedasticity (ARMA-GARCH) models. The proposed methods select a model from among the candidate models on the basis of prediction error. We also provide an R package, autots, that implements the proposed automatic model selection procedures. We illustrate these algorithms with artificial and real data, and describe the implementation of the autots package for R.

Effects of Overdispersion on Testing for Serial Dependence in the Time Series of Counts Data

  • Kim, Hee-Young;Park, You-Sung
    • Communications for Statistical Applications and Methods / v.17 no.6 / pp.829-843 / 2010
  • To test for serial dependence in time series of count data, Jung and Tremayne (2003) evaluated the size and power of several tests under the class of INARMA models based on binomial thinning operations for Poisson marginal distributions. Overdispersion (i.e., a variance greater than the expectation) is common in the real world. Overdispersed count data can be modeled by using alternative thinning operations such as random coefficient thinning, iterated thinning, and quasi-binomial thinning. Such thinning operations can lead to time series models of counts with negative binomial or generalized Poisson marginal distributions. This paper examines whether the test statistics used by Jung and Tremayne (2003) for serial dependence in time series of count data are affected by overdispersion.
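
As a concrete picture of binomial thinning, the sketch below simulates the simplest member of the INARMA class, an INAR(1) process with Poisson innovations, and computes the variance-to-mean ratio used to describe overdispersion. The function names, parameter values and seed are hypothetical, and the sketch does not implement any of the tests studied in the paper.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def binomial_thinning(rng, alpha, count):
    """alpha o count: each of the `count` units survives with probability alpha."""
    return sum(1 for _ in range(count) if rng.random() < alpha)


def simulate_inar1(n, alpha, lam, seed=0):
    """Simulate X_t = alpha o X_{t-1} + e_t with e_t ~ Poisson(lam)."""
    rng = random.Random(seed)
    # Start from the stationary Poisson marginal with mean lam / (1 - alpha).
    x = poisson(rng, lam / (1.0 - alpha))
    series = []
    for _ in range(n):
        x = binomial_thinning(rng, alpha, x) + poisson(rng, lam)
        series.append(x)
    return series


def dispersion_index(series):
    """Variance-to-mean ratio; values above 1 indicate overdispersion."""
    m = sum(series) / len(series)
    v = sum((s - m) ** 2 for s in series) / len(series)
    return v / m


counts = simulate_inar1(5000, alpha=0.5, lam=1.0, seed=0)
ratio = dispersion_index(counts)
```

For this Poisson case the ratio should be close to 1; the alternative thinning operations mentioned in the abstract are the ones that produce ratios above 1.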

Fuzzy Logic-based Modeling of a Score (퍼지 이론을 이용한 악보의 모델링)

  • 손세호;권순학
    • Journal of the Korean Institute of Intelligent Systems / v.11 no.3 / pp.264-269 / 2001
  • In this paper, we interpret a score as a time series and deal with its fuzzy logic-based modeling. The musical notes in a score carry a great deal of information, such as the length and pitch of each sound. Using the melodies, tones and pitches in a score, we transform the data on the score into a time series. We then form a new time series by sliding a window through the original one. To analyze the time series data, we use Box-Jenkins time series analysis. On the basis of the identified characteristics of the time series, we construct the fuzzy model.
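
The window-sliding step can be sketched as below. The pitch numbers are hypothetical stand-ins for notes read from a score, and the function name, window width and step are assumptions rather than values taken from the paper.

```python
def sliding_windows(series, width, step=1):
    """Return every run of `width` consecutive values, advancing by `step`."""
    return [series[i:i + width]
            for i in range(0, len(series) - width + 1, step)]


# Hypothetical melody encoded as pitch numbers.
pitches = [60, 62, 64, 65]
windows = sliding_windows(pitches, width=2)
```

Each window can then be treated as one observation of the new series, for example as the lagged inputs of a model.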

Forecasting Baltic Dry Index by Implementing Time-Series Decomposition and Data Augmentation Techniques (시계열 분해 및 데이터 증강 기법 활용 건화물운임지수 예측)

  • Han, Min Soo;Yu, Song Jin
    • Journal of Korean Society for Quality Management / v.50 no.4 / pp.701-716 / 2022
  • Purpose: This study aims to forecast the dry cargo transportation market. Its subject is the Baltic Dry Index (BDI) time series, an index representing the dry cargo transport market. Methods: To increase the accuracy of BDI forecasts, we pre-processed the original time series with time-series decomposition and data augmentation techniques and used the result for ANN training. The ANN algorithms compared are Multi-Layer Perceptron (MLP), Recurrent Neural Network (RNN), and Long Short-Term Memory (LSTM), each trained and evaluated with time-series decomposition and data augmentation applied. The forecasts are short-term, one step ahead (t+1). The study period is from '22. 01. 07 to '22. 08. 26. Results: On the MAPE (Mean Absolute Percentage Error) indicator only, all ANN models used in the study achieved higher accuracy (1.422% on average) in multivariate prediction. Although this is not a remarkable improvement over the univariate prediction results, ANN prediction performance did improve through the time-series decomposition and data augmentation techniques that were the focus of this study. Conclusion: Given the nature of ANNs, further performance improvements can be expected from hyper-parameter tuning. It is therefore necessary to try various combinations of learning algorithms and ANN optimization techniques. Such an approach would help with problems where little data is available, such as a rapidly changing business environment or the current shipping market.
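
For reference, the MAPE indicator used in the results can be computed as in the sketch below. The numbers are hypothetical and are not BDI values; the function name is likewise an assumption.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Every value in `actual` must be nonzero, since each error is
    divided by the corresponding actual value.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical index values and one-step-ahead forecasts.
error = mape([100.0, 200.0], [110.0, 180.0])
```

Because each error is scaled by the actual value, MAPE is comparable across series with different levels, which is a common reason for reporting it alongside scale-dependent measures.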

Time Series Data Cleaning Method Based on Optimized ELM Prediction Constraints

  • Guohui Ding;Yueyi Zhu;Chenyang Li;Jinwei Wang;Ru Wei;Zhaoyu Liu
    • Journal of Information Processing Systems / v.19 no.2 / pp.149-163 / 2023
  • Errors are common in time series data collected by sensors, which are affected by external factors. The traditional approach of cleaning errors by constraining the speed (rate of change) performs well, but because its constraint rules are fixed it is limited to data whose speed of change is stable. In practice, data with an uneven speed of change is common. To solve this problem, this paper proposes an online cleaning algorithm for time series data based on dynamic speed constraints. Since time series data usually changes periodically, we use an extreme learning machine (ELM) to learn the pattern of speed changes from past data and to predict time-varying speed ranges, which are then used to detect erroneous data. To enable online repair, a dual-window mechanism is proposed that turns the global optimization into a local one, and the traditional minimum change principle and median theorem are applied in selecting the repair strategy. Because repair based on the minimum change principle cannot correct consecutive abnormal points, and on the basis of a quantitative analysis, the repair value is taken from the boundary of the repair candidate set. Experimental results on the dataset show that the proposed method achieves a better repair effect.
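
The fixed speed constraint that the paper takes as its starting point can be sketched as follows. This is a deliberately simplified baseline, not the proposed algorithm: the bounds `s_min` and `s_max` are fixed and hypothetical rather than predicted by an ELM, the first point is assumed to be correct, timestamps are assumed to be strictly increasing, and a violating point is simply moved to the nearest boundary of its feasible range.

```python
def repair_with_speed_constraint(times, values, s_min, s_max):
    """Clamp each point so that its speed relative to the previous
    repaired point stays within [s_min, s_max]."""
    repaired = [values[0]]  # assumption: the first observation is trusted
    for i in range(1, len(values)):
        dt = times[i] - times[i - 1]
        # Feasible range implied by the previous repaired value.
        low = repaired[-1] + s_min * dt
        high = repaired[-1] + s_max * dt
        # Keep the value if feasible, otherwise move it to the nearest boundary.
        repaired.append(min(max(values[i], low), high))
    return repaired


# Hypothetical sensor readings with one spike at t = 2.
cleaned = repair_with_speed_constraint([0, 1, 2, 3], [0, 1, 10, 3], s_min=-2, s_max=2)
```

In the approach described in the abstract, the fixed pair of bounds would be replaced by speed ranges that vary over time.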

Convolutional Neural Network and Data Mutation for Time Series Pattern Recognition (컨벌루션 신경망과 변종데이터를 이용한 시계열 패턴 인식)

  • Ahn, Myong-ho;Ryoo, Mi-hyeon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2016.05a / pp.727-730 / 2016
  • Time series classification (TSC) means classifying time series data based on their patterns. Time series data is a very common data type with high potential in many fields, so it has long received attention in data mining and machine learning. Among traditional approaches, distance-based and dictionary-based methods are popular, but they have clear limitations due to time-scale and random-noise problems. In this paper, we propose a novel approach that addresses these problems with a convolutional neural network (CNN) and data mutation. The CNN is regarded as a proven neural network model for image recognition and can be applied to time series pattern recognition by extracting patterns. Data mutation generates mutated data with different methods to make the CNN more robust. The proposed method shows better performance than the traditional approaches.