Time Series Data Analysis and Prediction System Using PCA

  • Jin, Young-Hoon (Division of Smart IT, Baekseok University) ;
  • Ji, Se-Hyun (Division of Smart IT, Baekseok University) ;
  • Han, Kun-Hee (Division of Computer Engineering, Baekseok University)
  • Submitted : 2021.10.03
  • Reviewed : 2021.11.20
  • Published : 2021.11.28

Abstract

We live surrounded by vast amounts of data. Data are generated in every situation in which we act, and big data technology is used to uncover their meaning; considerable effort continues to go into finding meaningful data. This paper introduces an analysis technique based on principal component analysis (PCA) that tracks the trend of time series data and predicts its direction, helping people make better choices. PCA builds a covariance matrix from the input data and yields eigenvectors and eigenvalues from which the direction of the data can be inferred. The proposed method constructs a reference axis from a set of time series with similar directionality, then predicts the direction of the data in the next interval from the angle between the direction of each time series in the set and the reference axis. We verify the accuracy of the proposed algorithm by comparing it with LSTM (Long Short-Term Memory) on cryptocurrency price trends. In this comparison, on highly volatile data the proposed method made relatively fewer transactions than LSTM and recorded a high return (112%). This suggests that the method analyzed and predicted the signal relatively accurately, and better results are expected with a more precise threshold setting.
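The idea of a reference axis and per-series angles can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how windows are formed, how time and price are scaled, or how the angle becomes a prediction. Here each series is assumed to be an equal-length window treated as 2D points (time index, value), the reference axis is assumed to be the principal direction of the pooled, mean-centered points, and `signal_from_angle` with its `threshold_deg` parameter is a hypothetical decision rule added only for illustration.

```python
import math

def principal_direction(points):
    """Unit eigenvector of the 2x2 sample covariance matrix with the largest eigenvalue."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Closed form for the major-axis angle of a symmetric 2x2 matrix.
    # theta lies in (-pi/2, pi/2], so the vector always points forward in time.
    theta = 0.5 * math.atan2(2.0 * cov_xy, var_x - var_y)
    return (math.cos(theta), math.sin(theta))

def series_direction(series):
    """Principal direction of one series, viewed as (time index, value) points."""
    return principal_direction(list(enumerate(series)))

def reference_axis(series_set):
    """Assumed reference axis: principal direction of all mean-centered series pooled together."""
    pooled = []
    for series in series_set:
        mean_t = (len(series) - 1) / 2.0
        mean_v = sum(series) / len(series)
        pooled.extend((t - mean_t, v - mean_v) for t, v in enumerate(series))
    return principal_direction(pooled)

def signed_angle_deg(direction, reference):
    """Signed angle in degrees from the reference axis to the given direction."""
    cross = reference[0] * direction[1] - reference[1] * direction[0]
    dot = reference[0] * direction[0] + reference[1] * direction[1]
    return math.degrees(math.atan2(cross, dot))

def signal_from_angle(angle_deg, threshold_deg=5.0):
    """Hypothetical rule: outside the threshold band, follow the sign of the angle."""
    if angle_deg > threshold_deg:
        return "up"
    if angle_deg < -threshold_deg:
        return "down"
    return "hold"

# Illustrative, made-up windows (not data from the paper).
windows = [[1.0, 2.0, 3.1, 4.0], [1.0, 1.4, 2.1, 2.4], [1.0, 2.6, 4.1, 5.5]]
axis = reference_axis(windows)
for window in windows:
    angle = signed_angle_deg(series_direction(window), axis)
    print(round(angle, 2), signal_from_angle(angle))
```

Because PCA is sensitive to scale, the angles depend on the units chosen for the time and value axes; any real use would need to fix a normalization first, which this sketch leaves out.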

Funding

This research was supported by the 2021 Baekseok University Research Fund.
