Predicting claim size in the auto insurance with relative error: a panel data approach

Park, Heungsun

doi:10.5351/KJAS.2021.34.5.697

The Korean Journal of Applied Statistics (응용통계연구)

Volume 34, Issue 5 / Pages 697-710 / 2021 / 1225-066X (pISSN) / 2383-5818 (eISSN)

The Korean Statistical Society (한국통계학회)


상대오차예측을 이용한 자동차 보험의 손해액 예측: 패널자료를 이용한 연구

Park, Heungsun (Department of Statistics, Hankuk University of Foreign Studies)

박흥선 (한국외국어대학교 통계학과)

Received : 2021.06.09
Accepted : 2021.07.02
Published : 2021.10.31

https://doi.org/10.5351/KJAS.2021.34.5.697

Abstract

Relative error prediction is preferred over ordinary prediction methods when relative (percentage) errors matter most, especially in econometrics, software engineering, and official government statistics. Relative error prediction techniques have been developed for linear and nonlinear regression, nonparametric regression using kernel regression smoothers, and stationary time series models. These developments, however, consider only fixed effects; random effects models have not yet been used in relative error prediction. The purpose of this article is to extend relative error prediction to panel data under generalized linear mixed models (GLMMs), specifically random effects models based on the gamma, lognormal, or inverse Gaussian distribution. As an illustration, real auto insurance data are used to predict claim size, and the best predictor and the best relative error predictor are compared.
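The contrast between the two predictors named in the abstract can be sketched numerically. The snippet below is a minimal illustration, not the article's GLMM procedure: it ignores random effects and covariates, and it assumes the relative error criterion E[((Y - p)/Y)^2], whose minimizer is E[1/Y] / E[1/Y^2] (the form given in Park and Stefanski, 1998, Statistics & Probability Letters). The function names and parameter values are hypothetical, and the inverse Gaussian case is omitted.

```python
import math


def gamma_predictors(mean, shape):
    """Best predictor and best relative error predictor for a gamma response.

    Assumes Y ~ Gamma(shape, scale) with mean = shape * scale and shape > 2,
    so that E[1/Y] and E[1/Y^2] are both finite.
    """
    if shape <= 2:
        raise ValueError("shape must exceed 2 for E[1/Y^2] to exist")
    scale = mean / shape
    inv_moment_1 = 1.0 / (scale * (shape - 1))                     # E[1/Y]
    inv_moment_2 = 1.0 / (scale ** 2 * (shape - 1) * (shape - 2))  # E[1/Y^2]
    best = mean                                  # minimizes E[(Y - p)^2]
    best_relative = inv_moment_1 / inv_moment_2  # minimizes E[((Y - p)/Y)^2]
    return best, best_relative


def lognormal_predictors(mu, sigma):
    """Same two predictors when log(Y) ~ Normal(mu, sigma^2)."""
    best = math.exp(mu + sigma ** 2 / 2)            # E[Y]
    best_relative = math.exp(mu - 1.5 * sigma ** 2)  # E[1/Y] / E[1/Y^2]
    return best, best_relative


# Illustrative (made-up) parameters, not estimates from the article's data.
print(gamma_predictors(mean=1000.0, shape=4.0))
print(lognormal_predictors(mu=0.0, sigma=1.0))
```

With these made-up values the gamma case gives a best predictor of 1000 and a best relative error predictor of about 500. In both families the relative error predictor lies below the conditional mean, because overpredicting a small claim is penalized heavily on the relative scale.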

Acknowledgement

This research was supported by the Hankuk University of Foreign Studies Research Fund of 2020.

