• Title/Summary/Keyword: The time-series data


A Study of Big Time Series Data Compression based on CNN Algorithm (CNN 기반 대용량 시계열 데이터 압축 기법연구)

  • Sang-Ho Hwang;Sungho Kim;Sung Jae Kim;Tae Geun Kim
    • IEMEK Journal of Embedded Systems and Applications / v.18 no.1 / pp.1-7 / 2023
  • In this paper, we implement a lossless compression technique for time-series data generated by IoT (Internet of Things) devices in order to reduce disk usage. Within a forecasting algorithm that predicts the time-series values, the proposed technique selectively applies either a CNN (Convolutional Neural Network) or delta encoding, depending on the situation, to reduce the size of the encoded data. In addition, it sequentially performs zigzag encoding, splitting, and bit packing to increase the compression ratio. We show that the proposed method achieves a compression ratio of up to 1.60 relative to the original data.
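
The delta and zigzag steps named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it covers only the delta-encoding path (no CNN predictor, no splitting step), it assumes integer sensor readings, and all function names are hypothetical.

```python
def delta_encode(values):
    """Keep the first sample, then store each sample's difference from its predecessor."""
    if not values:
        return []
    deltas = [values[0]]
    for prev, cur in zip(values, values[1:]):
        deltas.append(cur - prev)
    return deltas


def delta_decode(deltas):
    """Invert delta_encode with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def zigzag_encode(n):
    """Map signed integers to unsigned ones: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    return 2 * n if n >= 0 else -2 * n - 1


def zigzag_decode(z):
    """Invert zigzag_encode."""
    return z // 2 if z % 2 == 0 else -(z + 1) // 2


def encode(values):
    return [zigzag_encode(d) for d in delta_encode(values)]


def decode(codes):
    return delta_decode([zigzag_decode(z) for z in codes])


readings = [1000, 1002, 1001, 1001, 1005]  # made-up sample data
codes = encode(readings)

# Assumption: the first code is stored at full width, and the remaining
# small codes are bit-packed using the width of the largest one.
width = max(c.bit_length() for c in codes[1:])
```

Because slowly varying readings produce small deltas, the zigzag codes after the first one fit in a few bits each, which is what makes a later bit-packing step worthwhile.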

A Study on the Test of Homogeneity for Nonlinear Time Series Panel Data Using Bilinear Models (중선형 모형을 이용한 비선형 시계열 패널자료의 동질성검정에 대한 연구)

  • Kim, Inkyu
    • Journal of Digital Convergence / v.12 no.7 / pp.261-266 / 2014
  • When a time series model has many parameters, forecasting becomes difficult because parameter estimation errors accumulate. If a homogeneity hypothesis, under which several time series follow the same model, can be accepted, better predictions are easier to obtain. In nonlinear time-series panel data, each series has its own parameters, so the total number of parameters is large, and the resulting estimation error degrades forecast accuracy. If the multiple independent time series in the panel satisfy homogeneity, they can be pooled into a single combined series for estimating and testing the parameters. To study the homogeneity test for m independent nonlinear time series in panel data, we specify the model, establish the stationarity conditions for the model, and derive the homogeneity test statistic. We then show that its limiting distribution is a $\chi^2$ distribution. In an actual analysis, we examine the result of the homogeneity test for nonlinear time series panel data consisting of two groups of stock price data.

DQB (Dynamic Query Band): Dynamic Query Device for Efficient Exploration of Time-series Data (DQB (Dynamic Query Band): 시계열 데이터의 효율적인 탐색을 위한 동적 쿼리 장치)

  • Jo, Myeong-Su;Seo, Jin-Ok
    • 한국HCI학회:학술대회논문집 (HCI Society of Korea: Conference Proceedings) / 2009.02a / pp.715-718 / 2009
  • Time series data is a sequence of data points, typically measured at successive intervals spaced in time. As the number of items in time series data grows, many devices for efficient exploration have been developed. Among them, the Timebox widget is a representative dynamic query device for interactive data exploration. A timebox is a rectangular query region of interest: users draw the region with simple mouse manipulation, and the resulting query result set is displayed. However, timeboxes are limited in how concretely they can represent the query region, and they visualize the query region in a way that is inconsistent with users' mental models. To resolve these problems, we propose a new device called DQB (Dynamic Query Band). A DQB is a query region consisting of a user-defined polyline with a thickness, drawn on the time series data. This device makes it possible to specify the query region concretely. It also provides a simple and convenient interface and a good conceptual model.

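The difference between a rectangular timebox and a polyline band can be sketched as two filters over a set of series. This is an illustrative sketch only: the abstract does not define the matching rule, so it assumes a series matches when every sample in the queried time range lies inside the region, and that the band is a centerline with a vertical half-width. Function names and data are hypothetical.

```python
def timebox_query(series_by_name, t_start, t_end, v_min, v_max):
    """Names of series whose values stay in [v_min, v_max] for indices t_start..t_end."""
    hits = []
    for name, values in series_by_name.items():
        window = values[t_start:t_end + 1]
        if window and all(v_min <= v <= v_max for v in window):
            hits.append(name)
    return hits


def band_query(series_by_name, t_start, centerline, half_width):
    """Names of series that stay within half_width of a polyline starting at t_start.

    centerline holds one polyline value per time index (an assumed discretization).
    """
    hits = []
    for name, values in series_by_name.items():
        window = values[t_start:t_start + len(centerline)]
        if len(window) == len(centerline) and all(
            abs(v - c) <= half_width for v, c in zip(window, centerline)
        ):
            hits.append(name)
    return hits


# Made-up sample data.
stocks = {
    "A": [10, 12, 13, 12, 11],
    "B": [10, 18, 19, 12, 11],
    "C": [5, 11, 14, 13, 20],
}

box_result = timebox_query(stocks, 1, 3, 11, 14)
band_result = band_query(stocks, 1, [12, 13, 12], 0.5)
```

In this toy data the box accepts both "A" and "C", while the narrower band that follows a rise-and-fall shape accepts only "A", which illustrates why a shaped region can specify a query more concretely than a rectangle.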

Time Series Classification of Cryptocurrency Price Trend Based on a Recurrent LSTM Neural Network

  • Kwon, Do-Hyung;Kim, Ju-Bong;Heo, Ju-Sung;Kim, Chan-Myung;Han, Youn-Hee
    • Journal of Information Processing Systems / v.15 no.3 / pp.694-706 / 2019
  • In this study, we applied the long short-term memory (LSTM) model to classify cryptocurrency price time series. We collected historical cryptocurrency price time series data and preprocessed them so that they were clean enough to use as training and target data. After this preprocessing, the price time series data were systematically encoded into a three-dimensional price tensor representing the past price changes of cryptocurrencies. We also present our LSTM model structure and explain how to use the price tensor as input data for the LSTM model. In particular, a grid search-based k-fold cross-validation technique was applied to find the most suitable LSTM model parameters. Lastly, by comparing F1-score values, our study showed that the LSTM model outperforms the gradient boosting (GB) model, a general machine learning model known to have relatively good prediction performance, for time series classification of the cryptocurrency price trend. With the LSTM model, we obtained a performance improvement of about 7% over the GB model.
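
One common way to build a three-dimensional input for a sequence model is a sliding window over the price history. The sketch below is an assumption-laden illustration, not the paper's encoding: it assumes a layout of (samples, time steps, coins), uses raw prices rather than price changes, and labels each window by whether the first coin's price rises after the window ends. The function name and data are hypothetical.

```python
def make_windows(prices, window, horizon=1):
    """Build a (samples x window x coins) nested list and binary trend labels.

    prices: list of time steps, each a list with one price per coin.
    labels[i] is 1 if the first coin's price `horizon` steps after window i
    is higher than its price at the end of window i, else 0.
    """
    tensor, labels = [], []
    for start in range(len(prices) - window - horizon + 1):
        end = start + window
        tensor.append([row[:] for row in prices[start:end]])
        last = prices[end - 1][0]
        future = prices[end - 1 + horizon][0]
        labels.append(1 if future > last else 0)
    return tensor, labels


# Made-up prices for two coins over five time steps.
prices = [[100, 10], [101, 11], [99, 12], [102, 11], [101, 13]]
tensor, labels = make_windows(prices, window=3)
shape = (len(tensor), len(tensor[0]), len(tensor[0][0]))
```

Each sample here is one window of consecutive time steps, so the first two axes correspond to what an LSTM would treat as batch and sequence length.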

A Machine Learning Model for Predicting Silica Concentrations through Time Series Analysis of Mining Data (광업 데이터의 시계열 분석을 통해 실리카 농도를 예측하기 위한 머신러닝 모델)

  • Lee, Seung Hoon;Yoon, Yeon Ah;Jung, Jin Hyeong;Sim, Hyun su;Chang, Tai-Woo;Kim, Yong Soo
    • Journal of Korean Society for Quality Management / v.48 no.3 / pp.511-520 / 2020
  • Purpose: The purpose of this study was to devise an accurate machine learning model for predicting silica concentrations following the addition of impurities, through time series analysis of mining data. Methods: The mining data were preprocessed and subjected to time series analysis using the machine learning model. Through correlation analysis, valid variables were selected and meaningless variables were excluded. To reflect changes over time, dependent variables at baseline were treated as independent variables at later time points. The relationship between the independent variables and the dependent variable n time points later was subjected to Pearson correlation analysis. Results: The correlation ($R^2$) was strongest after 3 hours, so the value 3 hours ahead was adopted as the dependent variable. According to root mean square error (RMSE) data, the proposed method was superior to the other machine learning methods. The XGBoost algorithm showed the best predictive performance. Conclusion: This study is important given the current lack of machine learning studies pertaining to the domestic mining industry. In addition, applying time series analysis to mining data is expected to yield further improvement. Before a predictive model is established with the proposed method, predictions should be made using data with time series characteristics. This approach should also improve prediction accuracy in other domains.
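
The lag-selection idea in the Methods section (correlating predictors with the target a fixed number of time points later) can be sketched in plain Python. This is a minimal sketch under assumptions: one predictor, lags counted in samples rather than hours, and the lag chosen by largest absolute Pearson correlation. The function names and data are hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(feature, target, lag):
    """Correlate feature[t] with target[t + lag]."""
    if lag == 0:
        return pearson(feature, target)
    return pearson(feature[:-lag], target[lag:])


def best_lag(feature, target, max_lag):
    """Return the lag with the largest absolute correlation, plus all scores."""
    scores = {
        lag: lagged_correlation(feature, target, lag)
        for lag in range(max_lag + 1)
    }
    return max(scores, key=lambda lag: abs(scores[lag])), scores


# Made-up data: the target follows the feature with a delay of three samples.
feature = [1, 3, 2, 5, 4, 6, 8, 7, 9, 10]
target = [5, 1, 4, 2, 6, 4, 10, 8, 12, 16]
lag, scores = best_lag(feature, target, max_lag=4)
```

Once a lag is chosen, the target shifted by that lag becomes the dependent variable for whichever regression model is trained next.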

Analysis of Multivariate Financial Time Series Using Cointegration : Case Study

  • Choi, M.S.;Park, J.A.;Hwang, S.Y.
    • Journal of the Korean Data and Information Science Society / v.18 no.1 / pp.73-80 / 2007
  • Cointegration (together with VARMA (vector ARMA)) has proven useful for analyzing multivariate non-stationary data in the field of financial time series. It provides a linear combination of non-stationary component series that turns out to be a stationary series. This linear combination equation is referred to as the long-term equilibrium between the component series. We consider two sets of Korean bivariate financial time series and illustrate cointegration analysis on them. Specifically, estimated VAR (vector AR) and VECM (vector error correction model) models are obtained, and the CV (cointegrating vector) is found for each data set.

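For reference, the standard textbook form of the quantities named in the abstract is given below. The notation is assumed for illustration and is not taken from the paper: $y_t$ is a bivariate series whose components are integrated of order one, $\beta$ is the cointegrating vector, $\alpha$ holds the adjustment coefficients, and $p$ is the VAR order.

```latex
% Bivariate series with non-stationary, I(1), components (assumed notation)
y_t = (y_{1t}, y_{2t})^\top, \qquad y_{1t},\, y_{2t} \sim I(1)

% Cointegration: a nonzero vector \beta yields a stationary linear combination,
% interpreted as the long-term equilibrium relation
z_t = \beta^\top y_t \sim I(0), \qquad \beta \neq 0

% Vector error correction form of a VAR(p)
\Delta y_t = \alpha \beta^\top y_{t-1}
           + \sum_{i=1}^{p-1} \Gamma_i \, \Delta y_{t-i} + \varepsilon_t
```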

Construction of Gene Interaction Networks from Gene Expression Data Based on Evolutionary Computation (진화연산에 기반한 유전자 발현 데이터로부터의 유전자 상호작용 네트워크 구성)

  • Jung Sung Hoon;Cho Kwang-Hyun
    • Journal of Institute of Control, Robotics and Systems / v.10 no.12 / pp.1189-1195 / 2004
  • This paper investigates construction of gene (interaction) networks from gene expression time-series data based on evolutionary computation. To illustrate the proposed approach in a comprehensive way, we first assume an artificial gene network and then compare it with the reconstructed network from the gene expression time-series data generated by the artificial network. Next, we employ real gene expression time-series data (Spellman's yeast data) to construct a gene network by applying the proposed approach. From these experiments, we find that the proposed approach can be used as a useful tool for discovering the structure of a gene network as well as the corresponding relations among genes. The constructed gene network can further provide biologists with information to generate/test new hypotheses and ultimately to unravel the gene functions.

A Study on Prediction the Movement Pattern of Time Series Data using Information Criterion and Effective Data Length (정보기준과 효율적 자료길이를 활용한 시계열자료 운동패턴 예측 연구)

  • Jeon, Jin-Ho;Kim, Min-Soo
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.13 no.1 / pp.101-107 / 2013
  • In the real world, large amounts of time series data are generated in real time across a wide range of business areas. However, it is not easy to determine the optimal model for describing and understanding time series data, which exhibit dynamic characteristics. In this study, we used an HMM, which is suitable for estimating short- and long-term forecasting models of time-series data, to estimate a model that explains the characteristics of such data and to predict future movement patterns. Using various actual stock market data, we found that estimating the optimal model with an information criterion and the most efficient data length allowed the state of the model to be estimated accurately. The experimental results also showed that short-term predictions matched similar movement patterns more closely than long-term predictions.

Exploratory Data Analysis for Korean Stock Data with Recurrence Plots (재현그림을 통한 우리나라 주식 자료에 대한 탐색적 자료분석)

  • Jang, Dae-Heung
    • The Korean Journal of Applied Statistics / v.26 no.5 / pp.807-819 / 2013
  • A recurrence plot can be used as a graphical exploratory data analysis tool before confirmatory time series analysis. With the recurrence plot, we can obtain the structural pattern of the time series and recognize the structural change points in a time series at a glance. Korean stock data shows the usefulness of the recurrence plot as a graphical exploratory data analysis tool for time series data.
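
A recurrence plot marks every pair of time points whose values are close. The sketch below is a simplified illustration under assumptions: it compares scalar values directly (no time-delay embedding, which recurrence analysis often uses), uses a fixed threshold `eps`, and renders the matrix as text. The names and data are hypothetical.

```python
def recurrence_matrix(series, eps):
    """R[i][j] = 1 when |series[i] - series[j]| <= eps, else 0."""
    n = len(series)
    return [
        [1 if abs(series[i] - series[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]


def render(matrix):
    """Draw the matrix as text, '#' for recurrent pairs and '.' otherwise."""
    return "\n".join(
        "".join("#" if cell else "." for cell in row) for row in matrix
    )


# Made-up series with a level shift at index 4.
series = [0, 1, 0, 1, 5, 6, 5, 6]
matrix = recurrence_matrix(series, eps=1.5)
picture = render(matrix)
print(picture)
```

In the rendered output the level shift appears as two separate square blocks along the diagonal, which is the kind of structural change point the abstract says can be recognized at a glance.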

A Fuzzy Time-Series Prediction with Preprocessing (전처리과정을 갖는 시계열데이터의 퍼지예측)

  • Yoon, Sang-Hun;Lee, Chul-Hee
    • Proceedings of the KIEE Conference / 2000.11d / pp.666-668 / 2000
  • In this paper, a fuzzy prediction method is proposed for time series data with uncertainty and non-stationary characteristics. Conventional methods, which use past data directly in the prediction procedure, cannot properly handle non-stationary data whose long-term mean drifts. To cope with this problem, a data preprocessing technique that uses the differences of the original time series data is suggested. Difference sets are constructed from the data, and the optimal difference set is selected as the input to the fuzzy predictor. The proposed method is based on the Takagi-Sugeno-Kang (TSK or TS) fuzzy rule. Computer simulations show improved results for various time series.

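The differencing step described in the abstract can be illustrated with a short sketch. It shows only how difference sets are formed and inverted. The paper's criterion for choosing the optimal difference set and the fuzzy predictor itself are not described in the abstract and are not reproduced here. Function names and data are hypothetical.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def undifference(first_values, diffs, lag=1):
    """Rebuild the original series from its first `lag` values and the differences."""
    out = list(first_values)
    for d in diffs:
        out.append(out[-lag] + d)
    return out


# Made-up series whose mean keeps drifting upward.
series = [10, 12, 15, 19, 24, 30]

d1 = difference(series)   # first differences
d2 = difference(d1)       # differences of the differences
restored = undifference(series[:1], d1)
```

A predictor trained on `d1` or `d2` works with values that no longer drift, and `undifference` maps its predicted differences back to the original scale.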