Analysis of AI interview data using unified non-crossing multiple quantile regression tree model

  • Kim, Jaeoh (Center for Army Analysis and Simulation, ROK Army HQs)
  • Bang, Sungwan (Department of Mathematics, Korea Military Academy)
  • Received: 2020.08.04
  • Reviewed: 2020.09.21
  • Published: 2020.12.31

Abstract

This study analyzes data from the AI interview system that the Republic of Korea (ROK) Army is working to adopt, using the unified non-crossing multiple quantile regression tree (UNQRT) model. Quantile regression has many advantages over ordinary linear regression, but its linearity assumption is still too strong for many real-world problems, including this application. Existing tree-based quantile regression, the quantile regression tree (QRT) model, relaxes the linearity assumption but has two drawbacks: the estimated quantile functions can cross one another, and a separate tree is produced for each quantile, which makes the result hard to interpret. The UNQRT estimates multiple quantile functions simultaneously under a non-crossing constraint. This resolves the crossing problem, is expected to give stable results at extreme quantiles, and yields a single unified tree that is easier to interpret than the set of trees produced by the QRT model. Using the UNQRT, we explored the relationship between the results of the Army AI interview system and the existing personnel data and derived a variety of meaningful results.
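The crossing problem mentioned in the abstract can be illustrated with a small sketch. It is not the UNQRT estimation procedure; it only shows the check (pinball) loss that quantile regression minimizes and a way to detect crossing between separately fitted quantile functions. The function names `pinball_loss` and `count_crossings`, and the two fitted lines, are hypothetical and chosen for illustration.

```python
def pinball_loss(y, q, tau):
    """Average check (pinball) loss of predicting q as the tau-th quantile of y."""
    return sum(max(tau * (yi - q), (tau - 1) * (yi - q)) for yi in y) / len(y)


def count_crossings(pred_by_tau):
    """Count points where a lower-level quantile estimate exceeds a higher-level one.

    pred_by_tau: list of (tau, predictions) pairs, with predictions of equal length.
    """
    ordered = sorted(pred_by_tau, key=lambda pair: pair[0])
    crossings = 0
    # Compare each quantile level with the next higher level, point by point.
    for (_, lower), (_, upper) in zip(ordered, ordered[1:]):
        crossings += sum(lo > up for lo, up in zip(lower, upper))
    return crossings


# Hypothetical lines from two *separate* linear quantile fits (assumed values).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
q10 = [1.0 + 2.0 * x for x in xs]  # estimated 0.1 quantile
q90 = [2.0 + 1.0 * x for x in xs]  # estimated 0.9 quantile

# The 0.1 quantile line rises above the 0.9 quantile line for x > 1.
crossings = count_crossings([(0.1, q10), (0.9, q90)])
print(crossings)
```

Any positive count means the estimated quantile functions are logically inconsistent at those points. According to the abstract, the UNQRT avoids this by imposing the non-crossing constraint during simultaneous estimation rather than checking for it afterwards.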
