http://dx.doi.org/10.13088/jiis.2019.25.1.127

A Recidivism Prediction Model Based on XGBoost Considering Asymmetric Error Costs  

Won, Ha-Ram (Graduate School of Business IT, Kookmin University)
Shim, Jae-Seung (Graduate School of Business IT, Kookmin University)
Ahn, Hyunchul (Graduate School of Business IT, Kookmin University)
Publication Information
Journal of Intelligence and Information Systems / v.25, no.1, 2019, pp. 127-137
Abstract
Recidivism prediction has been studied continuously since the early 1970s, and it has grown in importance as crimes committed by recidivists have steadily increased. Research became particularly active in the 1990s, after the US and Canada adopted the 'Recidivism Risk Assessment Report' as a decisive criterion in trials and parole screening. In the same period, empirical studies on 'Recidivism Factors' also began in Korea. Most recidivism prediction studies so far have focused on the factors of recidivism or on prediction accuracy. However, because recidivism prediction has an asymmetric error cost structure, it is also important to minimize the misclassification cost. In general, the cost of misclassifying a person who will not reoffend as a likely recidivist (a false positive) is lower than the cost of misclassifying a person who will reoffend as a non-recidivist (a false negative): the former only adds monitoring costs, while the latter incurs broader social and economic costs. Therefore, in this paper, we propose an XGBoost (eXtreme Gradient Boosting; XGB) based recidivism prediction model that considers asymmetric error costs. In the first step, XGB, which is recognized as a high-performance ensemble method in data mining, is applied, and its results are compared with those of other prediction models: LOGIT (logistic regression analysis), DT (decision trees), ANN (artificial neural networks), and SVM (support vector machines). In the next step, the classification threshold is optimized to minimize the total misclassification cost, defined as the weighted average of FNE (False Negative Error) and FPE (False Positive Error). To verify its usefulness, the model was applied to a real recidivism prediction dataset. The results confirmed that the XGB model not only showed better prediction accuracy than the other prediction models but also reduced the misclassification cost most effectively.
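The threshold optimization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the weights `w_fn` and `w_fp`, the grid of candidate thresholds, and the definition of FNE and FPE as error rates within each class are all assumptions made for illustration. The scores stand in for the predicted recidivism probabilities that a classifier such as XGB would produce.

```python
from typing import Sequence, Tuple


def misclassification_cost(y_true: Sequence[int], scores: Sequence[float],
                           threshold: float, w_fn: float, w_fp: float) -> float:
    """Weighted average of the false negative and false positive error rates.

    Assumption: FNE = FN / (actual positives), FPE = FP / (actual negatives).
    """
    # A case is predicted positive (recidivist) when its score >= threshold.
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    fne = fn / n_pos if n_pos else 0.0
    fpe = fp / n_neg if n_neg else 0.0
    return (w_fn * fne + w_fp * fpe) / (w_fn + w_fp)


def best_threshold(y_true: Sequence[int], scores: Sequence[float],
                   w_fn: float = 5.0, w_fp: float = 1.0,
                   steps: int = 100) -> Tuple[float, float]:
    """Grid-search the threshold in [0, 1] that minimizes the weighted cost.

    Returns (cost, threshold); ties are broken in favor of the lower threshold.
    The default weights are hypothetical and only express that a false
    negative is assumed to be costlier than a false positive.
    """
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        (misclassification_cost(y_true, scores, t, w_fn, w_fp), t)
        for t in candidates
    )


if __name__ == "__main__":
    # Toy labels (1 = reoffended) and hypothetical predicted probabilities.
    labels = [0, 0, 1, 1]
    probs = [0.1, 0.4, 0.35, 0.8]
    cost, threshold = best_threshold(labels, probs)
    print(f"threshold={threshold:.2f}, cost={cost:.4f}")
```

With a larger weight on false negatives, the search favors a lower threshold than the conventional 0.5, accepting more false positives in exchange for fewer missed recidivists.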
Keywords
Recidivism Prediction; Asymmetric Error Cost; Threshold Optimization; Data Mining; XGBoost