An Evaluation Method for Time Series Prediction Models Based on the Maximum Absolute Value

Estimation Method of Predicted Time Series Data Based on Absolute Maximum Value

  • Received: 2018.09.21
  • Reviewed: 2018.11.07
  • Published: 2018.12.31

Abstract

This paper introduces a method for evaluating time series prediction models that takes a new approach to the Mean Absolute Percentage Error (hereafter MAPE) and the Symmetric Mean Absolute Percentage Error (hereafter sMAPE). Both measures have known problems. MAPE cannot be computed when an observed value in the dataset is zero, and it produces excessively large values when an observed value is very close to zero. It can also assign different values to errors of the same size between observed and predicted values. sMAPE yields different values depending on whether the same error is an over-prediction or an under-prediction, and when the observed and predicted values have opposite signs, the size of the error is not reflected in the evaluation value. We address these problems with the Maximum Mean Absolute Percentage Error (hereafter mMAPE), which replaces the observed value in the denominator of MAPE with the maximum absolute observed value. When the maximum absolute value is less than 1, the denominator is removed, which resolves both the undefined value at zero and the inflated values for very small observations. Using the temperature data from the Beijing PM2.5 dataset and simulated data, we compared the evaluation values of mMAPE with those of the other measures and verified that mMAPE resolves the problems described above.
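The three measures discussed in the abstract can be compared with a short Python sketch. The abstract does not give the exact formulas, so everything here is an assumption: `mape` uses the textbook definition, `smape` uses the variant with (|y| + |f|) / 2 in the denominator (one of several definitions in the literature), and `mmape` is only a reading of the abstract's description, dividing each absolute error by the maximum absolute observed value and dropping the denominator when that maximum is below 1. The function names and the scaling to percent are illustrative choices, not the authors' implementation.

```python
import math


def mape(observed, predicted):
    """Textbook MAPE in percent. Returns NaN if any observation is exactly 0."""
    if any(y == 0 for y in observed):
        return math.nan  # division by zero: MAPE is undefined here
    total = sum(abs((y - f) / y) for y, f in zip(observed, predicted))
    return 100.0 * total / len(observed)


def smape(observed, predicted):
    """sMAPE in percent, assuming the (|y| + |f|) / 2 denominator variant."""
    terms = []
    for y, f in zip(observed, predicted):
        denom = (abs(y) + abs(f)) / 2.0
        # Convention chosen for this sketch: a 0/0 term counts as zero error.
        terms.append(0.0 if denom == 0 else abs(y - f) / denom)
    return 100.0 * sum(terms) / len(terms)


def mmape(observed, predicted):
    """Hypothetical mMAPE, reconstructed from the abstract's wording only."""
    max_abs = max(abs(y) for y in observed)
    # "Denominator removed" when the maximum absolute value is below 1
    # is modelled here as dividing by 1.
    denom = max_abs if max_abs >= 1 else 1.0
    total = sum(abs(y - f) for y, f in zip(observed, predicted))
    return 100.0 * total / (denom * len(observed))


if __name__ == "__main__":
    # Same absolute error (0.05) on every point, one observation near zero.
    observed = [0.01, 10.0, 20.0]
    predicted = [y + 0.05 for y in observed]
    print("MAPE :", mape(observed, predicted))
    print("sMAPE:", smape(observed, predicted))
    print("mMAPE:", mmape(observed, predicted))
```

Under these assumptions, the near-zero observation dominates MAPE, while the mMAPE sketch weights every point by the same denominator, so equal absolute errors contribute equally regardless of sign or direction.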

Fig. 1. Observed data, prediction data, and MAPE values for the simulation dataset with an error of 0.05

Fig. 2. Observed data, prediction data, and sMAPE values for the Beijing temperature dataset

Fig. 3. Beijing temperature dataset. The blue line is the training dataset and the orange line is the test dataset

Fig. 4. ARIMA prediction for the Beijing temperature dataset. Blue is observed and green is predicted

Fig. 5. MAPE and mMAPE evaluation of the ARIMA prediction for the Beijing temperature dataset, with scale

Fig. 6. Observed data, prediction data, and mMAPE values for the simulation dataset with an error of 0.05

Fig. 7. MAPE, sMAPE, and mMAPE evaluation of the ARIMA prediction for the Beijing temperature dataset, with scale. Blue is observed, green is predicted, red is MAPE, and yellow is mMAPE

Table 1. Beijing temperature data, predicted values, and MAPE values

Table 2. Beijing temperature data, predicted values, and sMAPE values

Table 3. Four comparison cases of MAPE, sMAPE, and mMAPE values

Table 4. Four comparison cases of MAPE, sMAPE, and mMAPE values
