Style-Based Transformer for Time Series Forecasting

  • 김동건 (Department of Software, Sungkyunkwan University) ;
  • 김광수 (Department of Software, Sungkyunkwan University)
  • Received : 2021.09.13
  • Accepted : 2021.11.03
  • Published : 2021.12.31

Abstract

Time series forecasting is the task of predicting future values from past observations. Accurate forecasts matter because they inform strategy and policy decisions in many fields. Transformer models have recently become a main focus of time series forecasting research. However, existing transformer models decode auto-regressively: each output is fed back as input to generate the next step of the prediction sequence. As a result, accuracy degrades when predicting distant time points. To address this problem and forecast more precisely, this paper proposes a sequence decoding model inspired by style transfer. In the proposed model, a transformer encoder extracts the characteristics of the past data, and a style-based decoder uses them to generate the prediction sequence. Unlike the decoder of a conventional auto-regressive transformer, this decoder outputs the entire prediction sequence at once, which allows it to predict distant time points more accurately. In forecasting experiments on several time series datasets with different characteristics, the proposed model achieved higher prediction accuracy than existing time series forecasting models.
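
The contrast between the two decoding schemes can be illustrated with a minimal sketch. It is not the paper's model: the abstract does not specify the decoder's internals, so the sketch assumes an AdaIN-like modulation of the kind used in style-based generators, and it replaces the transformer encoder with simple summary statistics. All function names (`autoregressive_forecast`, `encode_style`, `adain`, `one_shot_forecast`) and the fixed `base` sequence are hypothetical.

```python
import math
from typing import Callable, List, Tuple


def autoregressive_forecast(
    history: List[float],
    step_fn: Callable[[List[float]], float],
    horizon: int,
) -> List[float]:
    """Predict one step at a time, feeding each output back into the input window."""
    window = list(history)
    outputs = []
    for _ in range(horizon):
        y = step_fn(window)
        outputs.append(y)
        window = window[1:] + [y]  # the prediction becomes part of the next input
    return outputs


def encode_style(history: List[float]) -> Tuple[float, float]:
    """Hypothetical stand-in for the encoder: summarize the past as (scale, shift)."""
    mean = sum(history) / len(history)
    var = sum((x - mean) ** 2 for x in history) / len(history)
    return math.sqrt(var), mean


def adain(content: List[float], scale: float, shift: float, eps: float = 1e-5) -> List[float]:
    """Normalize the content sequence, then apply the style's scale and shift."""
    mean = sum(content) / len(content)
    var = sum((c - mean) ** 2 for c in content) / len(content)
    return [scale * (c - mean) / math.sqrt(var + eps) + shift for c in content]


def one_shot_forecast(history: List[float], base_sequence: List[float]) -> List[float]:
    """Emit the whole horizon in one pass; no output is fed back as input."""
    scale, shift = encode_style(history)
    return adain(base_sequence, scale, shift)


def linear_step(window: List[float]) -> float:
    """Toy one-step predictor: extend the last observed difference."""
    return window[-1] + (window[-1] - window[-2])


history = [1.0, 2.0, 3.0, 4.0]
base = [-1.0, 0.0, 1.0]  # hypothetical fixed decoder input (assumption)

ar = autoregressive_forecast(history, linear_step, horizon=3)
direct = one_shot_forecast(history, base)
print(ar)
print(direct)
```

In the auto-regressive loop, every prediction after the first depends on earlier predictions, so any error in `step_fn` is carried into later steps. In the one-shot version, each output position depends only on the encoded history, which is the property the abstract credits for better accuracy at distant time points.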

Acknowledgement

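This research was conducted as a result of the Regional Intelligence Innovation Talent Development (Grand ICT Research Center) program of the Ministry of Science and ICT and the Institute of Information & Communications Technology Planning & Evaluation (IITP-2021-2015-0-00742).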
