Nonparametric clustering of functional time series electricity consumption data

전기 사용량 시계열 함수 데이터에 대한 비모수적 군집화

  • Kim, Jaehee (Department of Statistics, Duksung Women's University)
  • 김재희 (덕성여자대학교 정보통계학과)
  • Received : 2018.11.15
  • Accepted : 2018.12.22
  • Published : 2019.02.28

Abstract

The electricity consumption time series of 'A' University from July 2016 to June 2017 are analyzed via nonparametric functional data clustering, since the series can be regarded as realizations of continuous functions with a dependency structure. We use the method of Bouveyron and Jacques (Advances in Data Analysis and Classification, 5, 4, 281-300, 2011), a model-based functional clustering approach with an FEM algorithm that assumes a Gaussian distribution on the functional principal components. A clusterwise analysis is provided with cluster mean functions, densities, and cluster profiles.

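This study applies functional data analysis to the daily electricity consumption time series of A University, located in Incheon, recorded at 15-minute intervals from July 2016 to June 2017, in order to cluster the days, characterize each cluster, and use the results for prediction. The university's daily electricity consumption pattern differs markedly between weekdays and weekends. After obtaining the FPCA with a spline basis, model-based cluster analysis with a mixture of Gaussian distributions on the resulting components suggests that three clusters are appropriate. For each cluster, the mean function, the probability density function, and the distribution of days are summarized to show the information and characteristics of that cluster.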
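The pipeline described in the abstract (spline smoothing of 15-minute daily curves, FPCA, Gaussian mixture clustering into three groups) can be illustrated with a minimal sketch. It is not the authors' code and not the group-specific subspace model of Bouveyron and Jacques (2011): it assumes synthetic data, a plain diagonal-covariance Gaussian mixture fitted by EM on the first three principal component scores, and PCA of the smoothed curves on a uniform grid as a stand-in for FPCA. The group sizes, load levels, and helper names (`bspline_basis`, `gap_init`, `fit_gmm_diag`) are hypothetical.

```python
import numpy as np

def bspline_basis(x, n_basis=16, degree=3):
    """Clamped B-spline basis on [0, 1], evaluated by the Cox-de Boor recursion."""
    interior = np.linspace(0.0, 1.0, n_basis - degree + 1)[1:-1]
    knots = np.concatenate([np.zeros(degree + 1), interior, np.ones(degree + 1)])
    # Degree 0: indicator of each knot span.
    B = np.zeros((len(x), len(knots) - 1))
    for j in range(len(knots) - 1):
        B[:, j] = (x >= knots[j]) & (x < knots[j + 1])
    B[x == 1.0, n_basis - 1] = 1.0  # close the last non-empty span on the right
    for d in range(1, degree + 1):
        nxt = np.zeros((len(x), B.shape[1] - 1))
        for j in range(B.shape[1] - 1):
            left = knots[j + d] - knots[j]
            right = knots[j + d + 1] - knots[j + 1]
            if left > 0:
                nxt[:, j] += (x - knots[j]) / left * B[:, j]
            if right > 0:
                nxt[:, j] += (knots[j + d + 1] - x) / right * B[:, j + 1]
        B = nxt
    return B

def gap_init(s, k):
    """Hard initial labels: cut the sorted 1-D scores at their k-1 largest gaps."""
    order = np.argsort(s)
    cuts = np.sort(np.argsort(np.diff(s[order]))[-(k - 1):]) + 1
    labels = np.empty(len(s), dtype=int)
    labels[order] = np.searchsorted(cuts, np.arange(len(s)), side="right")
    return labels

def fit_gmm_diag(X, labels0, n_iter=50, reg=1e-6):
    """EM for a Gaussian mixture with diagonal covariances, started from hard labels."""
    n = X.shape[0]
    R = np.eye(labels0.max() + 1)[labels0]            # responsibilities, shape (n, K)
    for _ in range(n_iter):
        # M-step: mixing weights, means, per-dimension variances.
        Nk = R.sum(axis=0) + 1e-12
        w = Nk / n
        mu = (R.T @ X) / Nk[:, None]
        diff2 = (X[:, None, :] - mu[None]) ** 2       # shape (n, K, d)
        var = (R[:, :, None] * diff2).sum(axis=0) / Nk[:, None] + reg
        # E-step: posterior membership probabilities, computed in log space.
        logp = -0.5 * ((diff2 / var[None]).sum(-1) + np.log(2 * np.pi * var).sum(-1))
        logp = logp + np.log(w)
        logp -= logp.max(axis=1, keepdims=True)
        R = np.exp(logp)
        R /= R.sum(axis=1, keepdims=True)
    return w, mu, var, R

# Synthetic stand-in data: 365 daily curves, 96 readings per day (15-minute steps).
rng = np.random.default_rng(0)
t = (np.arange(96) + 0.5) / 96                        # time of day rescaled to (0, 1)
true_group = np.repeat([0, 1, 2], [104, 61, 200])     # hypothetical group sizes
level = np.array([20.0, 60.0, 100.0])[true_group]     # hypothetical daytime load levels
bump = np.exp(-0.5 * ((t - 0.55) / 0.15) ** 2)        # daytime usage shape
Y = 100.0 + level[:, None] * bump[None, :] + rng.normal(0.0, 3.0, (365, 96))

# 1) Smooth each day with 16 cubic B-spline basis functions (least squares).
B = bspline_basis(t, n_basis=16)
coef, *_ = np.linalg.lstsq(B, Y.T, rcond=None)
smooth = (B @ coef).T                                 # shape (365, 96)

# 2) PCA of the centred smooth curves on the uniform grid.
Xc = smooth - smooth.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = U[:, :3] * S[:3]                             # first three component scores

# 3) Gaussian mixture on the scores with three components.
weights, means, variances, resp = fit_gmm_diag(scores, gap_init(scores[:, 0], 3))
labels = resp.argmax(axis=1)
cluster_means = np.array([smooth[labels == k].mean(axis=0) for k in range(3)])
```

The rows of `cluster_means` play the role of the cluster mean functions mentioned in the abstract. Cross-tabulating `labels` against weekday or month would give the kind of cluster profile the paper reports in its tables. The gap-based initialisation is a convenience for this well-separated toy example; real consumption data would more likely need several random restarts.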

Figure 2.1. A University electricity consumption data.

Figure 2.2. Everyday electricity consumption plot.

Figure 4.1. Spline smoothing with 16 bases.

Figure 4.2. Cluster plot with smooth curves (left: 365 curves, right: cluster means).

Figure 4.3. Derivative plots in clusters.

Figure 4.4. Density plot in each cluster.

Figure 4.5. Smooth functions in each cluster (16 spline bases).

Figure 4.6. Silhouette plot for cluster validity.

Table 4.1. Comparison of clusters via distribution according to weekdays

Table 4.2. Comparison of clusters via distribution according to month

Table 4.3. Classification according to functional linear discriminant functions obtained

References

  1. Abdel-Aal, R. E. and Al-Garni, A. Z. (1997). Forecasting monthly electric energy consumption in eastern Saudi Arabia using univariate time-series analysis, Energy, 22, 1059-1069. https://doi.org/10.1016/S0360-5442(97)00032-7
  2. Andersson, J. and Lillestol, J. (2010). Modeling and forecasting electricity consumption by functional data analysis, Journal of Energy Markets, 3, 3-14. https://doi.org/10.21314/JEM.2010.038
  3. Antoch, J., Prchal, L., De Rosa, M., and Sarda, P. (2010). Electricity consumption prediction with functional linear regression using spline estimators, Journal of Applied Statistics, 37, 2027-2041. https://doi.org/10.1080/02664760903214395
  4. Araki, Y., Konishi, S., Kawano, S., and Matsui, H. (2009). Functional logistic discrimination via regularized basis expansions, Communications in Statistics-Theory and Methods, 38, 2944-2957. https://doi.org/10.1080/03610920902947246
  5. Bianchi, L., Jarrett, J., and Hanumara, R. C. (1998). Improving forecasting for telemarketing centers by ARIMA modeling with intervention, International Journal of Forecasting, 14, 497-504. https://doi.org/10.1016/S0169-2070(98)00037-5
  6. Bouveyron, C. and Jacques, J. (2011). Model-based clustering of time series in group-specific functional subspaces, Advances in Data Analysis and Classification, 5, 281-300. https://doi.org/10.1007/s11634-011-0095-6
  7. Chang, C., Chen, Y., and Ogden, R. T. (2014). Functional data classification: a wavelet approach, Computational Statistics, 29, 1497-1513. https://doi.org/10.1007/s00180-014-0503-4
  8. Chiou, J. M. (2012). Dynamical functional prediction and classification, with application to traffic flow prediction, Annals of Applied Statistics, 6, 1588-1614. https://doi.org/10.1214/12-AOAS595
  9. Chiou, J. M. and Li, P. L. (2007). Functional clustering and identifying substructures of longitudinal data, Journal of the Royal Statistical Society, Series B, 69, 679-699. https://doi.org/10.1111/j.1467-9868.2007.00605.x
  10. Chujai, P., Kerdprasop, N., and Kerdprasop, K. (2013). Time series analysis of household electric consumption with ARIMA and ARMA models, The International MultiConference of Engineers and Computer Scientists, 1, 295-300.
  11. Cryer, J. and Chan, K. (2008). Time Series Analysis (2nd ed), Springer, New York.
  12. Ferraty, F. and Vieu, P. (2003). Curves discrimination: a nonparametric functional approach, Computational Statistics & Data Analysis, 44, 161-173. https://doi.org/10.1016/S0167-9473(03)00032-X
  13. Ferraty, F. and Vieu, P. (2006). Nonparametric Functional Data Analysis, Springer, New York.
  14. Fumi, A., Pepe, A., Scarabotti, L., and Schiraldi, M. M. (2013). Fourier analysis for demand forecasting in a fashion company, International Journal of Engineering Business Management, 5, 1-10. https://doi.org/10.5772/52800
  15. Gardner, E. S. (1985). Exponential smoothing: the state of the art, Journal of Forecasting, 4, 1-28. https://doi.org/10.1002/for.3980040103
  16. Goia, A., May, C., and Fusai, G. (2010). Functional clustering and linear regression for peak load forecasting, International Journal of Forecasting, 26, 700-711. https://doi.org/10.1016/j.ijforecast.2009.05.015
  17. Hall, P., Muller, H. G., and Wang, J. L. (2006). Properties of principal component methods for functional and longitudinal data analysis, Annals of Statistics, 34, 1493-1517. https://doi.org/10.1214/009053606000000272
  18. James, G. M. and Hastie, T. J. (2001). Functional linear discriminant analysis for irregularly sampled curves, Journal of the Royal Statistical Society Series B-Statistical Methodology, 63, 533-550.
  19. James, G. M., Hastie, T., and Sugar, C. (2000). Principal component models for sparse functional data, Biometrika, 87, 587-602. https://doi.org/10.1093/biomet/87.3.587
  20. Karhunen, K. (1947). On linear methods in probability theory, Annales Academiae Scientiarum Fennicae, AI 37, 3-79.
  21. Kim, B. and Kim, J. (2013). Time series models for daily exchange rate data, The Korean Journal of Applied Statistics, 26, 14-27.
  22. Kimball, B. A. (1974). Smoothing data with Fourier transformations, Agronomy Journal, 66, 259-262. https://doi.org/10.2134/agronj1974.00021962006600020023x
  23. Leng, X. and Muller, H. G. (2006). Time ordering of gene co-expression, Biostatistics, 7, 569-584. https://doi.org/10.1093/biostatistics/kxj026
  24. Loeve, M. (1945). Nouvelles classes de lois limites, Bulletin de la S. M. F., 73, 107-126.
  25. Matsui, H., Araki, T., and Konishi, S. (2011). Multiclass functional discriminant analysis and its application to gesture recognition, Journal of Classification, 28, 227-243. https://doi.org/10.1007/s00357-011-9082-z
  26. McCullagh, P. (1983). Quasi-likelihood functions, Annals of Statistics, 11, 59-67. https://doi.org/10.1214/aos/1176346056
  27. Ramsay, J. and Silverman, B. (1997). Functional Data Analysis, Springer, New York.
  28. Rincon, M. and Ruiz-Medina, M. D. (2012). Wavelet-RKHS-based functional statistical classification, Advances in Data Analysis and Classification, 6, 201-217. https://doi.org/10.1007/s11634-012-0112-4
  29. Tan, Z., Zhang, J., Wang, J., and Xu, J. (2010). Day-ahead electricity price forecasting using wavelet transform combined with ARIMA and GARCH models, Applied Energy, 87, 3606-3610. https://doi.org/10.1016/j.apenergy.2010.05.012
  30. Wang, X. H., Ray, S., and Mallick, B. K. (2007). Bayesian curve classification using wavelets, Journal of the American Statistical Association, 102, 962-973. https://doi.org/10.1198/016214507000000455
  31. Zhu, H. X., Brown, P. J., and Morris, J. S. (2012). Robust classification of functional and quantitative image data using functional mixed models, Biometrics, 68, 1260-1268.
  32. Zhu, H. X., Vannucci, M., and Cox, D. D. (2010). A Bayesian hierarchical model for classification with selection of functional predictors, Biometrics, 66, 463-473. https://doi.org/10.1111/j.1541-0420.2009.01283.x