Local quantile ensemble for machine learning methods

  • Suin Kim (Department of Statistics, Korea University)
  • Yoonsuh Jung (Department of Statistics, Korea University)
  • Received : 2024.07.03
  • Accepted : 2024.08.01
  • Published : 2024.11.30

Abstract

Quantile regression models have become popular because they yield robust estimates. Some machine learning (ML) models can estimate conditional quantiles, but current ML methods mainly focus on adapting quantile regression directly. In this paper, we propose a local quantile ensemble for ML methods, which averages multiple estimated quantiles near the target quantile. It is designed to enhance the stability and accuracy of the quantile fits. This approach extends the composite quantile regression algorithm, which typically targets the central tendency under a linear model. The proposed methods can be applied to various types of data exhibiting nonlinear and heterogeneous trends. We provide an empirical rule for choosing the quantiles around the target quantile. The bias-variance tradeoff inherent in this method offers performance benefits. Through empirical studies using Monte Carlo simulations and real data sets, we demonstrate that the proposed method can significantly improve quantile estimation accuracy and stabilize the quantile fits.
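
To make the approach concrete, below is a minimal sketch of a local quantile ensemble built on scikit-learn's gradient boosting quantile regressor. This is an illustration, not the authors' implementation: the neighborhood spacing delta and count k are hypothetical tuning parameters, whereas the paper supplies an empirical rule for choosing the quantiles around the target.

    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    def local_quantile_ensemble(X_train, y_train, X_test, tau, delta=0.05, k=2):
        # Quantile levels in a small neighborhood of the target tau;
        # delta and k are illustrative choices, not the paper's empirical rule.
        taus = tau + delta * np.arange(-k, k + 1)
        taus = taus[(taus > 0.0) & (taus < 1.0)]  # keep only valid levels
        preds = []
        for t in taus:
            # One fit per level via the pinball (quantile) loss
            model = GradientBoostingRegressor(loss="quantile", alpha=t)
            model.fit(X_train, y_train)
            preds.append(model.predict(X_test))
        # Average the neighboring quantile fits to stabilize the estimate
        return np.mean(preds, axis=0)

For example, with tau = 0.5, delta = 0.05, and k = 2, the ensemble averages the fits at the 0.40, 0.45, 0.50, 0.55, and 0.60 quantiles.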

Acknowledgement

Yoonsuh Jung's work was partially supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2022M3J6A1063595 and No. 2022R1F1A1071126).
