• Title/Summary/Keyword: TimeSeries Data

BST-IGT Model: Synthetic Benchmark Generation Technique Maintaining Trend of Time Series Data

  • Kim, Kyung Min;Kwak, Jong Wook
    • Journal of the Korea Society of Computer and Information / v.25 no.2 / pp.31-39 / 2020
  • In this paper, we introduce a technique for generating synthetic benchmarks based on time series data. Much of the data measured on IoT devices is time series data that records numerical changes over time. However, data measured over a long period is difficult to model as generalized time series data. To solve this problem, this paper introduces the BST-IGT model. The BST-IGT model separates the entire data set into sections that can be easily modeled as time series, collects the generated data into templates, and produces new synthetic benchmarks that share or modify their characteristics. Using the proposed modeling method, we could create a benchmark with multiple aspects by mixing the composite benchmark with the statistical features of the existing data and other benchmarks.

Efficient Time-Series Similarity Measurement and Ranking Based on Anomaly Detection (이상탐지 기반의 효율적인 시계열 유사도 측정 및 순위화)

  • Ji-Hyun Choi;Hyun Ahn
    • Journal of Internet Computing and Services / v.25 no.2 / pp.39-47 / 2024
  • Time series analysis is widely employed by many organizations to solve business problems, as it extracts various information and insights from chronologically ordered data. Among its applications, measuring time series similarity is a step that identifies time series with similar patterns, which is very important in applications such as time series search and clustering. In this study, we propose an efficient method for measuring time series similarity that focuses on anomalies rather than the entire series. We validate the proposed method by measuring and analyzing the rank correlation between the similarity measure for the subsets extracted by anomaly detection and the similarity measure for the whole time series. Experimental results, especially with stock time series data and an anomaly proportion of 10%, demonstrate a Spearman's rank correlation coefficient of up to 0.9. In conclusion, the proposed method can significantly reduce the computational cost of measuring time series similarity while providing reliable time series search and clustering results.
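
The validation idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the anomaly detector (largest deviations from the mean), the Euclidean distance, and the synthetic data are all assumptions chosen for brevity. Only the 10% anomaly proportion and the use of Spearman's rank correlation come from the abstract.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the ranks (inputs must not be constant)."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def anomaly_indices(series, proportion=0.1):
    """Assumed stand-in detector: the points farthest from the series mean."""
    mean = sum(series) / len(series)
    k = max(1, int(len(series) * proportion))
    by_deviation = sorted(range(len(series)),
                          key=lambda i: abs(series[i] - mean), reverse=True)
    return sorted(by_deviation[:k])


# Synthetic query and candidates (hypothetical data, increasing noise levels).
random.seed(0)
query = [random.gauss(0, 1) for _ in range(200)]
candidates = [[q + random.gauss(0, s) for q in query] for s in (0.1, 0.5, 1.0, 2.0, 4.0)]

idx = anomaly_indices(query, 0.1)
full = [euclidean(query, c) for c in candidates]
subset = [euclidean([query[i] for i in idx], [c[i] for i in idx]) for c in candidates]

# Rank agreement between the cheap (anomaly-only) and full similarity measures.
rho = spearman(full, subset)
print(rho)
```

A value of `rho` close to 1 would indicate that ranking candidates by the anomaly-only distance reproduces the ranking obtained from the full series.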

VaR(Value at Risk) for Korean Financial Time Series

  • Hwang, S.Y.;Park, J.
    • Journal of the Korean Data and Information Science Society / v.16 no.2 / pp.283-288 / 2005
  • Value at Risk (VaR) has proven useful in the finance literature as a risk management tool (cf. Jorion (2001)). This article introduces VaR to various Korean financial time series. Five daily data sets covering the sample period from 2000 to 2004, namely KOSPI, KOSPI 200, KOSDAQ, KOSDAQ 50 and the won-dollar exchange rate, are analyzed using GARCH modeling, and VaR is then obtained for each data set.
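
As a rough illustration of how a GARCH-based VaR can be computed, the sketch below runs a GARCH(1,1) variance recursion and converts the one-step-ahead volatility into a parametric VaR under a normal assumption. The abstract does not state the model order, the error distribution, or any parameter values, so the GARCH(1,1) form, the normality assumption, the parameters `omega`, `alpha`, `beta`, and the sample returns are all hypothetical; in practice the parameters would be estimated from the data.

```python
import math
from statistics import NormalDist


def garch11_next_variance(returns, omega, alpha, beta):
    """One-step-ahead conditional variance from the GARCH(1,1) recursion
    sigma2[t+1] = omega + alpha * r[t]**2 + beta * sigma2[t].
    """
    # Start from the unconditional variance (requires alpha + beta < 1).
    sigma2 = omega / (1.0 - alpha - beta)
    for r in returns:
        sigma2 = omega + alpha * r * r + beta * sigma2
    return sigma2


def value_at_risk(sigma2, level=0.99, mean=0.0):
    """Parametric VaR reported as a positive loss, assuming normal returns."""
    z = NormalDist().inv_cdf(1.0 - level)  # negative quantile, e.g. about -2.33 at 99%
    return -(mean + z * math.sqrt(sigma2))


# Hypothetical daily returns and parameters, for illustration only.
daily_returns = [0.010, -0.020, 0.015, -0.030, 0.005]
next_var = garch11_next_variance(daily_returns, omega=1e-6, alpha=0.10, beta=0.85)
print(value_at_risk(next_var, level=0.99))
```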

Statistical Inference for Space Time Series Model with Application to Mumps Data

  • Jeong, Ae-Ran;Kim, Sun-Woo;Lee, Sung-Duck
    • Journal of the Korean Data and Information Science Society / v.17 no.2 / pp.475-486 / 2006
  • Space time series data can be viewed either as a set of time series collected simultaneously at a number of spatial locations or as sets of spatial data collected at a number of time points. The major purpose of this article is to formulate a class of space time autoregressive moving average (STARMA) models and to discuss some of their statistical properties, such as model identification approaches and procedures for estimation and prediction. For illustration, we apply the STARMA model to mumps data. The data set consists of the number of mumps cases reported monthly from twelve states over the years 1969-1988.

The Evaluation of the Annual Time Series Data for the Mean Sea Level of the West Coast by Regression Model (회귀모형에 의한 서해안 평균해면의 연시계열자료의 평가)

  • 조기태;박영기;이장춘
    • Journal of Environmental Science International / v.9 no.1 / pp.19-25 / 2000
  • As tideland reclamation is now carried out on a large scale, construction work is active in coastal areas. Facilities in coastal areas must be built with the tide characteristics taken into consideration, so these characteristics affect the overall reclamation plan. The analysis of tide data boils down to a harmonic analysis of the hourly changes in long-term tide data and the extraction of non-harmonic coefficients from the results. Since a considerable amount of tide data is available for the West Coast, the existing data can be collected and fitted to a tide prediction model to obtain the temporal changes of the tide. The goal of this thesis is to assess whether the mean sea level used in the field agrees with the analysis results from long-term observation data whose homogeneity is guaranteed. To achieve this goal, the research was conducted as follows. First, the present conditions of the observation stations, the land level standard, and the sea level standard were analyzed to set up a time series model formula representing them. To secure the homogeneity of the time series, each component was separated. Lastly, the mean sea level used in the field was assessed based on the results obtained from the analysis of the time series.

A Study on the Test and Visualization of Change in Structures Associated with the Occurrence of Non-Stationary of Long-Term Time Series Data Based on Unit Root Test (Unit Root Test를 기반으로 한 장기 시계열 데이터의 Non-Stationary 발생에 따른 구조 변화 검정 및 시각화 연구)

  • Yoo, Jaeseong;Choo, Jaegul
    • KIPS Transactions on Software and Data Engineering / v.8 no.7 / pp.289-302 / 2019
  • A structural change in a time series means that the distribution of observations is relatively stable over the period covered by the data but shows a sudden change in its characteristics at a specific time point. Within a non-stationary long-term time series, it is important to determine in a timely manner whether a change in the short-term trend is transient or structural, because changes in the trend must be detected continuously and appropriate measures taken in response. In this paper, we propose a method that lets decision makers easily grasp structural changes in a time series by visualizing test results based on the unit root test. In particular, short-term structural changes can be identified even in a long-term time series by dividing the series into segments and testing each one.
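
The divide-and-test idea can be sketched as follows. This is a simplified illustration, not the procedure from the paper: it uses a basic Dickey-Fuller regression with a constant and no lag terms, fixed-length non-overlapping segments, and the approximate asymptotic 5% critical value of -2.86 for the constant-only case (finite-sample critical values differ slightly). The function names and the synthetic series are hypothetical.

```python
import math
import random


def dickey_fuller_t(series):
    """t-statistic of rho in the regression diff(y)[t] = a + rho * y[t-1] + e[t]."""
    x = series[:-1]
    dy = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(x)
    mx, my = sum(x) / n, sum(dy) / n
    sxx = sum((v - mx) ** 2 for v in x)
    sxy = sum((v - mx) * (d - my) for v, d in zip(x, dy))
    rho = sxy / sxx
    a = my - rho * mx
    rss = sum((d - a - rho * v) ** 2 for v, d in zip(x, dy))
    se = math.sqrt(rss / (n - 2) / sxx)  # standard error of rho
    return rho / se


def segment_tests(series, segment_length, critical=-2.86):
    """Test each consecutive segment; True means the unit root is rejected."""
    results = []
    for start in range(0, len(series) - segment_length + 1, segment_length):
        t = dickey_fuller_t(series[start:start + segment_length])
        results.append((start, t, t < critical))
    return results


# Synthetic example: a stationary stretch followed by a random walk.
random.seed(1)
noise = [random.gauss(0, 1) for _ in range(200)]
walk, level = [], 0.0
for _ in range(200):
    level += random.gauss(0, 1)
    walk.append(level)

results = segment_tests(noise + walk, segment_length=100)
for start, t, rejected in results:
    print(start, round(t, 2), rejected)
```

Plotting the per-segment statistics against the segment start positions would give a simple visual indication of where the series switches between stationary and non-stationary behavior.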

A Study on the Health Index Based on Degradation Patterns in Time Series Data Using ProphetNet Model (ProphetNet 모델을 활용한 시계열 데이터의 열화 패턴 기반 Health Index 연구)

  • Sun-Ju Won;Yong Soo Kim
    • Journal of Korean Society of Industrial and Systems Engineering / v.46 no.3 / pp.123-138 / 2023
  • The Fourth Industrial Revolution and sensor technology have led to increased utilization of sensor data. In our modern society, data complexity is rising, and the extraction of valuable information has become crucial with the rapid changes in information technology (IT). Recurrent neural networks (RNN) and long short-term memory (LSTM) models have shown remarkable performance in natural language processing (NLP) and time series prediction. Consequently, there is a strong expectation that models excelling in NLP will also excel in time series prediction. However, current research on Transformer models for time series prediction remains limited. Traditional RNN and LSTM models have demonstrated superior performance compared to Transformers in big data analysis. Nevertheless, with continuous advancements in Transformer models, such as GPT-2 (Generative Pre-trained Transformer 2) and ProphetNet, they have gained attention in the field of time series prediction. This study aims to evaluate the classification performance and interval prediction of remaining useful life (RUL) using an advanced Transformer model. The performance of each model will be utilized to establish a health index (HI) for cutting blades, enabling real-time monitoring of machine health. The results are expected to provide valuable insights for machine monitoring, evaluation, and management, confirming the effectiveness of advanced Transformer models in time series analysis when applied in industrial settings.

Finding associations between genes by time-series microarray sequential patterns analysis

  • Nam, Ho-Jung;Lee, Do-Heon
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.161-164 / 2005
  • Data mining techniques can be applied to identify patterns of interest in gene expression data. One goal in mining gene expression data is to determine how the expression of any particular gene might affect the expression of other genes. To find relationships between different genes, association rules have been applied to gene expression data sets [1]. A notable limitation of association rule mining is that only associations within a single profile experiment can be detected. It cannot be used to find rules across different condition profiles or different time point profile experiments. However, with the appearance of time-series microarray data, it became possible to analyze the temporal relationships between genes. In this paper, we analyze time-series microarray gene expression data to extract sequential patterns, which are similar to association rules between genes across different time points in the yeast cell cycle. The sequential patterns found in our work capture associations between different genes that are expressed or repressed at diverse time points. We applied a sequential pattern mining method to time-series microarray gene expression data and discovered a number of sequential patterns from two groups of genes (test, control); more sequential patterns were discovered from the test group (same GO term group) than from the control group (different GO term group). This result supports the potential of sequential patterns to capture biologically meaningful associations between genes.

Threshold-asymmetric volatility models for integer-valued time series

  • Kim, Deok Ryun;Yoon, Jae Eun;Hwang, Sun Young
    • Communications for Statistical Applications and Methods / v.26 no.3 / pp.295-304 / 2019
  • This article deals with threshold-asymmetric volatility models for over-dispersed and zero-inflated time series of count data. We introduce various threshold integer-valued autoregressive conditional heteroscedasticity (ARCH) models that incorporate over-dispersion and zero-inflation via conditional Poisson and negative binomial distributions. The EM algorithm is used to estimate the parameters. Cholera data from Kolkata, India, from 2006 to 2011 are analyzed as a real application. To construct the threshold variable, both a time-varying local constant mean and the grand mean are adopted. The data application shows that the threshold model, as an asymmetric version, is useful in modelling the volatility of count time series.
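
To make the notion of a threshold-asymmetric count model concrete, the sketch below simulates a conditional-Poisson recursion of the INARCH(1) type in which the coefficient on the previous count switches at a threshold. The exact specification in the article is not given in the abstract, so this form, the fixed threshold compared against the previous count, and all parameter values are assumptions made purely for illustration; zero-inflation and the negative binomial variant are not covered.

```python
import math
import random


def poisson_draw(lam, rng):
    """Knuth's multiplication method; adequate for small intensities."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_threshold_inarch(n, omega, alpha_low, alpha_high, threshold, seed=0):
    """Simulate X[t] | past ~ Poisson(lam[t]) with
    lam[t] = omega + alpha * X[t-1], where alpha = alpha_high if
    X[t-1] > threshold and alpha_low otherwise (hypothetical form).
    """
    rng = random.Random(seed)
    x_prev = 0
    counts, intensities = [], []
    for _ in range(n):
        alpha = alpha_high if x_prev > threshold else alpha_low
        lam = omega + alpha * x_prev
        x_prev = poisson_draw(lam, rng)
        counts.append(x_prev)
        intensities.append(lam)
    return counts, intensities


# Hypothetical parameters: counts above the threshold feed back more strongly.
counts, intensities = simulate_threshold_inarch(
    n=300, omega=1.0, alpha_low=0.2, alpha_high=0.6, threshold=2)
print(sum(counts) / len(counts))
```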

Prediction of arrhythmia using multivariate time series data (다변량 시계열 자료를 이용한 부정맥 예측)

  • Lee, Minhai;Noh, Hohsuk
    • The Korean Journal of Applied Statistics / v.32 no.5 / pp.671-681 / 2019
  • Studies on predicting arrhythmia using machine learning have been actively conducted as the number of arrhythmia patients increases. Existing studies have predicted arrhythmia based on multivariate data of feature variables extracted from RR interval data at a specific time point. In this study, we consider that the pattern of changes in the heart state over time can be important information for arrhythmia prediction. Therefore, we investigate the usefulness of predicting arrhythmia with multivariate time series data obtained by extracting and accumulating the multivariate vectors of the feature variables at various time points. Using the 1-nearest neighbor classification method and its ensemble for comparison, we confirm that the method based on multivariate time series data can achieve better classification performance than the method based on multivariate data if an appropriate time series distance function is selected.
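
A 1-nearest neighbor classifier over multivariate time series can be sketched in a few lines. The abstract does not say which distance functions were compared, so dynamic time warping (DTW) with a Euclidean point cost is used here as one common choice; the function names, labels, and toy feature vectors are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two multivariate sequences.

    Each sequence is a list of equal-length feature vectors (tuples);
    the point-wise cost is the Euclidean distance between vectors.
    """
    inf = math.inf
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = math.dist(x, y)
            # Extend the cheapest of the three admissible warping steps.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


def predict_1nn(train, query, distance=dtw_distance):
    """train is a list of (series, label); return the label of the nearest series."""
    nearest = min(train, key=lambda item: distance(item[0], query))
    return nearest[1]


# Toy data: two features per time point (values are made up).
train = [
    ([(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)], "normal"),
    ([(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)], "arrhythmia"),
]
query = [(0.0, 0.0), (0.9, 1.1), (2.1, 1.9)]
print(predict_1nn(train, query))
```

Passing a different function as `distance` is enough to compare distance choices, which is the selection the abstract identifies as decisive for performance.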