http://dx.doi.org/10.5392/JKCA.2022.22.05.048

A Study on the Analysis of Factors for the Golden Glove Award by using Machine Learning  

Uem, Daeyeob (Department of Big Data AI, Hoseo University)
Kim, Seongyong (Department of Big Data AI, Hoseo University)
Abstract
The importance of data analysis in baseball has grown since the success of MLB's Oakland Athletics, who applied Billy Beane's Moneyball theory, and of the NC Dinos, the 2020 KBO champions. Data-driven studies of baseball have been conducted not only in the United States but also in Korea, and models based on deep learning and machine learning in particular have been proposed. However, previous studies using deep learning and machine learning have focused only on predicting whether a game is won or lost, and their results are difficult to interpret in terms of which factors have an important influence on the game. In this paper, to investigate which factors are important at each position, we develop a prediction model for the Golden Glove Award, which is given to the best player at each position. The model is built with XGBoost, a boosting method that also provides feature importance, which can be used to interpret the factors behind the predictions. The analysis identifies the important factors for each position.
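To illustrate how a boosting model can yield feature importance, the following is a minimal sketch and not the authors' implementation. It replaces XGBoost with a much simpler booster built from one-split trees (stumps), scored with the same kind of second-order gain, and sums the gain per feature. The feature names (`batting_avg`, `home_runs`, `jersey_number`), the synthetic players, and the award rule are all hypothetical and chosen only so that one feature is pure noise.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def best_stump(X, grad, hess, lam=1.0):
    """Return the one-split tree with the largest second-order gain, or None."""
    G, H = sum(grad), sum(hess)
    parent = G * G / (H + lam)
    best = None  # (gain, feature index, threshold, left weight, right weight)
    for j in range(len(X[0])):
        order = sorted(range(len(X)), key=lambda i: X[i][j])
        gl = hl = 0.0
        for k in range(len(order) - 1):
            i = order[k]
            gl += grad[i]
            hl += hess[i]
            if X[i][j] == X[order[k + 1]][j]:
                continue  # cannot split between equal values
            gr, hr = G - gl, H - hl
            gain = 0.5 * (gl * gl / (hl + lam) + gr * gr / (hr + lam) - parent)
            if best is None or gain > best[0]:
                # rows with value <= threshold go left
                best = (gain, j, X[i][j], -gl / (hl + lam), -gr / (hr + lam))
    return best

def fit_boosted_stumps(X, y, n_rounds=30, lr=0.3):
    """Boost stumps on the logistic loss and accumulate split gain per feature."""
    scores = [0.0] * len(X)
    gain_per_feature = [0.0] * len(X[0])
    stumps = []
    for _ in range(n_rounds):
        p = [sigmoid(s) for s in scores]
        grad = [pi - yi for pi, yi in zip(p, y)]   # first derivative of the loss
        hess = [pi * (1.0 - pi) for pi in p]       # second derivative of the loss
        stump = best_stump(X, grad, hess)
        if stump is None or stump[0] <= 0.0:
            break
        gain, j, thr, wl, wr = stump
        gain_per_feature[j] += gain
        stumps.append((j, thr, lr * wl, lr * wr))
        for i, row in enumerate(X):
            scores[i] += lr * (wl if row[j] <= thr else wr)
    total = sum(gain_per_feature) or 1.0
    return stumps, [g / total for g in gain_per_feature]

# Hypothetical synthetic data: the "award" depends on batting average and
# home runs only; jersey number is noise and should receive little importance.
random.seed(0)
feature_names = ["batting_avg", "home_runs", "jersey_number"]
X, y = [], []
for _ in range(200):
    avg = random.uniform(0.20, 0.35)
    hr = random.randint(0, 40)
    jersey = random.randint(1, 99)
    X.append([avg, hr, jersey])
    y.append(1 if 2 * (avg - 0.20) / 0.15 + hr / 40 > 1.5 else 0)

stumps, importances = fit_boosted_stumps(X, y)
for name, imp in zip(feature_names, importances):
    print(f"{name}: {imp:.3f}")
```

The importances are normalized to sum to one, so they can be read as each feature's share of the total loss reduction. A per-position analysis in the spirit of the abstract would fit one such model for each position and compare the rankings.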
Keywords
Baseball; Golden Glove Award; XGBoost; Feature Importance;