Improving Automobile Insurance Repair Claims Prediction Using Gradient Descent and Location-based Association Rules

  • Seongsu Jeong (The Business Informatics, Hanyang University) ;
  • Jong Woo Kim (School of Business, Hanyang University)
  • Received : 2023.12.21
  • Reviewed : 2024.04.12
  • Published : 2024.06.30

Abstract

More than one million automobile insurance repairs occur worldwide each year, and the associated repair costs are enormous. Insurance companies and repair shops spend heavily on staff every year to ensure that repair costs are claimed at reasonable levels. Promptly predicting insurance claims for vehicles involved in accidents can therefore help reduce the social costs of auto insurance. Several recent studies have addressed auto insurance repair prediction using variables such as photos of vehicle damage. We propose a new model that reflects the characteristics of auto insurance repairs and predicts repair claims with an association rule method that combines gradient descent and location information. The method applies gradient descent to the output of association rule mining to find an appropriate number of rules, then extracts the main rules with a distance filter that reflects the locations of automobile parts, yielding items suitable for insurance repair claims. In our results, applying the rule set extracted by the proposed method improved predictive performance. We therefore conclude that a model combining gradient descent with location-based association rules is suitable for predicting auto insurance repair claims.
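As a rough illustration of the location-based filtering idea described in the abstract, the following minimal Python sketch mines pairwise association rules from repair claims and then discards rules whose two parts lie far apart on the vehicle. It is not the authors' implementation: the claim data, the part coordinates, the thresholds, and the function names `mine_pair_rules` and `distance_filter` are all hypothetical. The gradient descent step that selects the number of rules is omitted, because the abstract does not specify its objective.

```python
from itertools import permutations
from math import dist

# Hypothetical repair claims: each set lists the parts repaired in one claim.
claims = [
    {"front_bumper", "hood", "headlamp_l"},
    {"front_bumper", "headlamp_l"},
    {"front_bumper", "hood"},
    {"rear_bumper", "trunk_lid", "front_bumper"},
    {"rear_bumper", "trunk_lid", "front_bumper"},
]

# Hypothetical part positions in metres (x: side to side, y: front to rear).
part_xy = {
    "front_bumper": (0.0, 0.0),
    "headlamp_l": (-0.7, 0.2),
    "hood": (0.0, 0.8),
    "trunk_lid": (0.0, 4.0),
    "rear_bumper": (0.0, 4.5),
}


def mine_pair_rules(claims, min_support, min_confidence):
    """Return rules (a, b, support, confidence) meaning 'a implies b'."""
    n = len(claims)
    items = sorted(set().union(*claims))
    rules = []
    for a, b in permutations(items, 2):
        both = sum(1 for c in claims if a in c and b in c)
        has_a = sum(1 for c in claims if a in c)
        support = both / n
        confidence = both / has_a if has_a else 0.0
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules


def distance_filter(rules, part_xy, max_dist):
    """Keep only rules whose two parts are within max_dist of each other."""
    return [r for r in rules if dist(part_xy[r[0]], part_xy[r[1]]) <= max_dist]


rules = mine_pair_rules(claims, min_support=0.4, min_confidence=0.6)
local_rules = distance_filter(rules, part_xy, max_dist=1.5)

for a, b, support, confidence in local_rules:
    print(f"{a} -> {b}  support={support:.2f}  confidence={confidence:.2f}")
```

In this toy data the rule from `rear_bumper` to `front_bumper` passes the support and confidence thresholds only because the front bumper appears in every claim. The distance filter removes it, while rules between adjacent parts such as `rear_bumper` and `trunk_lid` survive.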
