• Title/Abstract/Keywords: Time Series Data Analysis

Search results: 1,862 items; processing time: 0.037 seconds

Efficient Anomaly Detection Through Confidence Interval Estimation Based on Time Series Analysis

  • Kim, Yeong-Ju;Jeong, Min-A
    • International journal of advanced smart convergence
    • /
    • Vol. 4, No. 2
    • /
    • pp.46-53
    • /
    • 2015
  • This paper proposes a method of real-time confidence interval estimation for detecting abnormal states in sensor data. For real-time confidence interval estimation, the mean squared errors of the exponential smoothing method and the moving average method, two time series analysis methods, were compared, and the moving average method, which had the smaller error, was applied. When sensor data fall outside the bounds of the estimated confidence interval, the administrator is notified through alarms. Because the proposed method targets real-time anomaly detection on a ship, an Android terminal was adopted for better communication between the wireless sensor network and users. For safe navigation, an administrator can make prompt and accurate decisions in an emergency on a ship by referring to the anomaly detection information obtained through real-time confidence interval estimation.
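
The abstract above describes flagging sensor readings that leave an interval estimated from a moving average. A minimal sketch of that idea follows. The abstract does not give the interval formula, so this sketch assumes a band of the window mean plus or minus `z` sample standard deviations. The function names, the window size, and the sample readings are all hypothetical.

```python
from statistics import mean, stdev

def moving_average_interval(window, z=1.96):
    """Return (lower, upper) bounds centered on the window mean.

    Assumption: the band is mean +/- z * sample standard deviation;
    the paper's exact interval formula is not stated in the abstract.
    """
    m = mean(window)
    s = stdev(window)
    return m - z * s, m + z * s

def detect_anomalies(readings, window_size=5, z=1.96):
    """Return indices of readings outside the interval estimated
    from the `window_size` readings that precede them."""
    alarms = []
    for i in range(window_size, len(readings)):
        lower, upper = moving_average_interval(readings[i - window_size:i], z)
        if not lower <= readings[i] <= upper:
            alarms.append(i)  # in a real system this would raise an alarm
    return alarms

# Hypothetical temperature readings with one spike at index 6.
readings = [20.1, 20.3, 19.9, 20.0, 20.2, 20.1, 27.5, 20.0, 20.2]
alarms = detect_anomalies(readings)
print(alarms)
```

Note that once the spike enters the window it widens the band for the next few readings, so a production detector might exclude flagged points from later windows.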

Kernel-Based Fuzzy Regression Machine For Predicting Turbulent Flows

  • 홍덕헌;황창하
    • 한국데이터정보과학회:학술대회논문집
    • /
    • 한국데이터정보과학회 2004 Spring Conference
    • /
    • pp.91-101
    • /
    • 2004
  • Turbulent flow is of fundamental interest because the conservation equations for thermodynamics, mass, and momentum are linked together. Turbulent flow consists of coherent vortical structures organized in time and space. Research has already shown that some dynamic systems and experimental models still cannot provide a good nonlinear analysis of turbulent time series. In real turbulent flow, very complicated nonlinear behaviors, which are affected by many vague factors, are present. In this paper, a kernel-based machine for fuzzy nonlinear regression analysis is proposed to predict the nonlinear time series of turbulent flows. To show the practicality and usefulness of this model, we present an example of predicting the near-wall turbulence time series as a verifiable model and compare it with fuzzy piecewise regression. The results of practical applications show that the proposed method is appropriate and appears to be useful for nonlinear analysis and for predicting turbulence time series in fuzzy environments.

The effect of patchy outliers in time series forecasting

  • 이재준;편영숙
    • 응용통계연구
    • /
    • Vol. 9, No. 1
    • /
    • pp.125-137
    • /
    • 1996
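  • Time series data often contain outliers caused by abnormal events that do not recur. Because the observations in a time series have a dependence structure, the effect of outliers can be more serious than in other statistical analyses. This paper focuses on the effect of patchy outliers on forecasting. In particular, we derive the increase in the mean square of the l-step-ahead forecast error and use this increase to measure the effect of patchy outliers on forecasts. We show that, in general, the increase is not large unless the patchy outliers occur very close to the forecast origin, and we confirm this by analyzing real data.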

A Quantitative Model for the Projection of Health Expenditure

  • 김한중;이영두;남정모
    • Journal of Preventive Medicine and Public Health
    • /
    • Vol. 24, No. 1
    • /
    • pp.29-36
    • /
    • 1991
  • Multiple regression analysis using ordinary least squares (OLS) is frequently used to project health expenditure and to identify factors affecting health care costs. Data for such analyses often combine time series and cross-sectional characteristics. In this case, the parameters obtained by OLS estimation are no longer the best linear unbiased estimators (BLUE) because the data do not satisfy the basic assumptions of regression analysis. The study theoretically examined the statistical problems that arise when OLS estimation is applied to time series cross section data. Then both OLS regression and time series cross section regression (TSCS regression) were applied to the same empirical data. Finally, the differences in parameters between the two estimations were explained through residual analysis.

Comparison of time series clustering methods and application to power consumption pattern clustering

  • Kim, Jaehwi;Kim, Jaehee
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 27, No. 6
    • /
    • pp.589-602
    • /
    • 2020
  • The development of smart grids has enabled the easy collection of large amounts of power data. Power consumption shows common patterns, which makes clustering useful when analyzing big power data. In this paper, clustering analysis based on distance functions for time series and on clustering algorithms is used to discover patterns in power consumption data. For clustering, we use 10 distance measures that take the characteristics of time series data into account. A simulation study is conducted to compare the distance measures for clustering. Cluster validity measures such as error rate, similarity index, Dunn index, and silhouette values are also calculated and compared. Real power consumption data are then clustered with the five distance measures that performed better than the others in the simulation.
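
Of the validity measures named above, the silhouette value is easy to illustrate with a pluggable distance function. The sketch below is an assumption-laden illustration and not the paper's code. It uses Euclidean distance (`math.dist`) as the default, which may or may not be among the paper's ten measures, and the load profiles and labels are made up.

```python
from math import dist

def silhouette_values(series, labels, distance=dist):
    """Silhouette value of each series, given cluster labels.

    Requires at least two clusters. `distance` can be swapped for any
    time series distance function that takes two sequences.
    """
    values = []
    for i, (x, own) in enumerate(zip(series, labels)):
        by_cluster = {}
        for j, (y, lab) in enumerate(zip(series, labels)):
            if i != j:
                by_cluster.setdefault(lab, []).append(distance(x, y))
        if own not in by_cluster:
            values.append(0.0)  # singleton cluster: silhouette is 0 by convention
            continue
        a = sum(by_cluster[own]) / len(by_cluster[own])  # mean intra-cluster distance
        b = min(sum(d) / len(d)                          # nearest other cluster
                for lab, d in by_cluster.items() if lab != own)
        denom = max(a, b)
        values.append((b - a) / denom if denom else 0.0)
    return values

# Hypothetical three-point load profiles forming two obvious groups.
profiles = [[1, 2, 3], [1, 2, 4], [9, 8, 7], [9, 8, 6]]
good = silhouette_values(profiles, [0, 0, 1, 1])
bad = silhouette_values(profiles, [0, 1, 0, 1])
```

Values near 1 indicate a series that sits well inside its cluster, while negative values suggest it was assigned to the wrong one.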

Time Series Classification of Cryptocurrency Price Trend Based on a Recurrent LSTM Neural Network

  • Kwon, Do-Hyung;Kim, Ju-Bong;Heo, Ju-Sung;Kim, Chan-Myung;Han, Youn-Hee
    • Journal of Information Processing Systems
    • /
    • Vol. 15, No. 3
    • /
    • pp.694-706
    • /
    • 2019
  • In this study, we applied the long short-term memory (LSTM) model to classify cryptocurrency price time series. We collected historical cryptocurrency price time series data and preprocessed them so that they were clean enough to use as training and target data. After this preprocessing, the price time series data were systematically encoded into a three-dimensional price tensor representing the past price changes of cryptocurrencies. We also presented our LSTM model structure and showed how to use such a price tensor as input to the LSTM model. In particular, a grid search-based k-fold cross-validation technique was applied to find the most suitable LSTM model parameters. Lastly, by comparing f1-score values, our study showed that the LSTM model outperforms the gradient boosting (GB) model, a general machine learning model known to have relatively good prediction performance, for time series classification of the cryptocurrency price trend. With the LSTM model, we obtained a performance improvement of about 7% over the GB model.
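
The abstract does not specify how the three-dimensional price tensor is laid out. As a rough sketch only, one common layout is (samples, time steps, features), built with a sliding window over relative price changes. The function name, the feature order, the labeling rule, and the sample prices below are all assumptions for illustration.

```python
def encode_windows(rows, window):
    """Encode price rows into a nested-list tensor and trend labels.

    rows: one list of feature values per time step; the first feature
          is assumed to be the closing price.
    Returns (X, y) where X[s][t][f] is the relative change of feature f
    at step t of sample s, and y[s] is 1 if the close rises on the step
    right after the window, else 0.
    """
    # changes[k] describes the move from rows[k] to rows[k + 1].
    changes = [
        [cur / prev - 1.0 for prev, cur in zip(rows[i - 1], rows[i])]
        for i in range(1, len(rows))
    ]
    X, y = [], []
    for s in range(len(changes) - window):
        X.append(changes[s:s + window])
        y.append(1 if changes[s + window][0] > 0 else 0)
    return X, y

# Hypothetical (close, high) prices for six time steps.
rows = [[100, 101], [102, 103], [101, 102], [105, 106], [104, 105], [108, 109]]
X, y = encode_windows(rows, window=3)
```

A tensor of this shape is what recurrent layers in most deep learning libraries expect as input, which is why the layout is a plausible guess here.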

A Study on Trend Using Time Series Data

  • 최신형
    • 산업과 과학
    • /
    • Vol. 3, No. 1
    • /
    • pp.17-22
    • /
    • 2024
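  • Because history, which began with the emergence of humankind, has had the means of record keeping, we who live in the present can examine the past through data. Data may be generated and stored only at a single moment, but they are also generated continuously at regular time intervals from the past to the present and will keep arising in the future, so using them for prediction is an important task. To survey trends in the use of time series data, this paper starts from the concept of time series data and analyzes the Recurrent Neural Network and Long Short-Term Memory models, which are mainly used for time series analysis in machine learning. By surveying cases that apply these models, it confirms that they are used in fields such as medical diagnosis, stock price analysis, and climate prediction with high predictive performance, and on this basis it explores directions for future use.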

A Study on Forecasting Future Software Failure Times Using Time Series Analysis

  • 김희철;신현철
    • 융합보안논문지
    • /
    • Vol. 11, No. 3
    • /
    • pp.19-24
    • /
    • 2011
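  • Software failure times are either constant regardless of testing time or show a monotonically increasing or monotonically decreasing trend. Trend tests on the data have been developed as a data measure for analyzing such software reliability models. Trend analyses include the arithmetic mean test and the Laplace trend test. These trend analyses provide only summary information about the data as a whole. This paper studies the prediction of future failure times when failure-time measurement is time-truncated. Future failure times are predicted and compared using the simple moving average, weighted moving average, and exponential smoothing methods from time series analysis. In the empirical analysis, the predicted values of the models were compared on failure-interval data using the mean squared error, and an efficient model was selected.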
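
The comparison described above can be sketched as one-step-ahead forecasts scored by mean squared error. This is a minimal illustration under stated assumptions: the window length, the weights, the smoothing constant `alpha`, and the failure-interval data are all hypothetical, and none of them come from the paper.

```python
def simple_moving_average(history, n=3):
    """Forecast the next value as the mean of the last n values."""
    return sum(history[-n:]) / n

def weighted_moving_average(history, weights=(1, 2, 3)):
    """Forecast with a weighted mean; the most recent value gets the largest weight."""
    recent = history[-len(weights):]
    return sum(w * x for w, x in zip(weights, recent)) / sum(weights)

def exponential_smoothing(history, alpha=0.5):
    """Forecast the next value as the exponentially smoothed level."""
    level = history[0]
    for x in history[1:]:
        level = alpha * x + (1 - alpha) * level
    return level

def one_step_mse(data, forecaster, warmup=3):
    """Mean squared error of one-step-ahead forecasts after a warm-up period."""
    errors = [(data[t] - forecaster(data[:t])) ** 2
              for t in range(warmup, len(data))]
    return sum(errors) / len(errors)

# Hypothetical times between successive software failures (hours).
intervals = [12.0, 15.0, 14.0, 18.0, 21.0, 19.0, 24.0, 27.0, 25.0, 30.0]
forecasters = {
    "simple": simple_moving_average,
    "weighted": weighted_moving_average,
    "exponential": exponential_smoothing,
}
scores = {name: one_step_mse(intervals, f) for name, f in forecasters.items()}
best = min(scores, key=scores.get)  # method with the smallest MSE
```

Which method wins depends on the data and on the chosen parameters, so the selection step would normally be repeated over several parameter settings.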

Uncertain Rule-based Fuzzy Technique: Nonsingleton Fuzzy Logic System for Corrupted Time Series Analysis

  • Kim, Dongwon;Park, Gwi-Tae
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • Vol. 4, No. 3
    • /
    • pp.361-365
    • /
    • 2004
  • In this paper, we present the modeling of noise-corrupted time series data with a nonsingleton fuzzy logic system (NFLS). An NFLS is useful when the available data are corrupted by noise. It is a fuzzy system whose inputs are modeled as fuzzy numbers. The ability of the NFLS to approximate arbitrary functions and to deal effectively with noise and uncertainty is used to analyze corrupted time series data. In the simulations, we compare the results of the NFLS approach with those obtained using only a traditional fuzzy logic system.

Improving Urban Railway Operating Efficiency Through Time Series Analysis of RIMS Data

  • 이도선;전형준;박수중
    • 한국철도학회:학술대회논문집
    • /
    • 한국철도학회 2008 Spring Conference Proceedings
    • /
    • pp.1308-1314
    • /
    • 2008
  • In this paper, Seoulmetro, the first operating organization to run RIMS (rolling stock information maintenance system) for urban railway rolling-stock maintenance, collected and analyzed light maintenance data and introduced a time series analysis technique to find ways to improve the operating efficiency of urban railways. The purpose of the time series analysis is to remove the seasonal variation contained in the data and to examine the irregular fluctuations. The data collected are limited to light maintenance, and more than three years of data are needed to identify seasonal variation. The study covers the accumulated data that satisfy this period, and its scope can be extended as more data accumulate over time. After appropriate selection, the data are filtered with a moving average method and then used to compute seasonal indices. Using these seasonal indices, we predict the frequency of light maintenance work. When an irregular change appears, we identify its cause and carry out preventive maintenance in advance, thereby contributing to improved urban railway operating efficiency.
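
The moving-average filtering and seasonal index steps mentioned above resemble the classical ratio-to-moving-average method, sketched below. The abstract does not say which variant the authors used, so the multiplicative form, the function name, and the quarterly counts are assumptions for illustration only.

```python
def seasonal_indices(data, period):
    """Multiplicative seasonal indices via ratio-to-moving-average.

    Needs roughly two or more full periods of data so that every
    season has at least one centered moving-average value.
    """
    half = period // 2
    ratios = [[] for _ in range(period)]
    for t in range(half, len(data) - half):
        window = data[t - half:t + half + 1]
        if period % 2 == 0:
            # Centered moving average: the two end points get half weight.
            trend = (sum(window) - 0.5 * (window[0] + window[-1])) / period
        else:
            trend = sum(window) / period
        ratios[t % period].append(data[t] / trend)
    raw = [sum(r) / len(r) for r in ratios]
    scale = period / sum(raw)  # normalize so the indices average to 1
    return [v * scale for v in raw]

# Hypothetical quarterly counts of light maintenance jobs over three years.
counts = [8, 10, 12, 10] * 3
indices = seasonal_indices(counts, period=4)
# Dividing by the index removes the seasonal component.
deseasonalized = [c / indices[t % 4] for t, c in enumerate(counts)]
```

What remains after removing the seasonal component is trend plus irregular fluctuation, which is the part the abstract proposes to inspect for maintenance planning.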
