• Title/Summary/Keyword: Baseball Game Analysis


Steal Success Model for 2007 Korean Professional Baseball Games (2007년 한국프로야구에서 도루성공모형)

  • Hong, Chong-Sun;Choi, Jeong-Min
    • The Korean Journal of Applied Statistics / v.21 no.3 / pp.455-468 / 2008
  • Extensive baseball game records show that the steal plays an important role in the outcome of games. To study the success or failure of steal attempts, logistic regression models are developed from 2007 Korean professional baseball games. The logistic regression models are compared with discriminant models, and the logistic regression analysis is found to be more efficient than the discriminant analysis. We also consider an alternative logistic regression model based on categorical data transformed from continuous data that are difficult to obtain.
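As an illustration of the kind of model this abstract describes, the following is a minimal sketch of a logistic regression on categorical (0/1) predictors, fitted by plain gradient descent. The two predictors (`fast_runner`, `left_handed_pitcher`) and all data values are hypothetical and synthetic; the abstract does not list the paper's actual variables or fitting procedure.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights (intercept first) by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def predict_proba(w, x):
    """Estimated probability that the steal attempt succeeds."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

# Synthetic steal attempts: [fast_runner, left_handed_pitcher] (hypothetical features).
X = [[1, 0], [1, 0], [1, 1], [1, 0], [0, 0],
     [0, 1], [0, 1], [0, 0], [1, 1], [0, 1]]
y = [1, 1, 1, 1, 0, 0, 0, 1, 0, 0]  # 1 = successful steal, 0 = caught

w = fit_logistic(X, y)
p_favourable = predict_proba(w, [1, 0])
p_unfavourable = predict_proba(w, [0, 1])
```

Coding continuous measurements as categories, as in this sketch, mirrors the alternative model mentioned in the abstract, where hard-to-obtain continuous data are replaced by categorical data.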

An Estimation Model for Defence Ability Using Big Data Analysis in Korea Baseball

  • Ju-Han Heo;Yong-Tae Woo
    • Journal of the Korea Society of Computer and Information / v.28 no.8 / pp.119-126 / 2023
  • This paper presents a new model for objectively evaluating the defensive ability of fielders in Korean professional baseball. Using Korean professional baseball game data from 2016 to 2019, the proposed model selects a representative defender for each team and defensive position and evaluates that defender's ability. To evaluate defensive ability, a method is proposed that calculates the defensive range for each position and divides the calculated defensive area. The defensive range for each position was calculated with the Convex Hull algorithm from the points at which defenders at the same position threw out the ball. Using the defensive range for each position, the out conversion score and the win contribution score were calculated as basic scores for both infielders and outfielders. In addition, double play points for infielders and extra base points for outfielders were calculated separately and added to the basic scores.
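The convex hull step can be sketched as follows, using Andrew's monotone chain algorithm and the shoelace formula for the enclosed area. The coordinates are hypothetical field positions; the abstract does not say which hull implementation or coordinate system the paper uses.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Return hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]

def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Hypothetical (x, y) points where fielders at one position made plays.
plays = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2), (3, 2)]
hull = convex_hull(plays)
defensive_area = polygon_area(hull)
```

Interior points such as `(2, 1)` do not affect the hull, so the estimated range depends only on the outermost plays.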

A Study on the Analysis of Factors for the Golden Glove Award by using Machine Learning (머신러닝을 이용한 골든글러브 수상 요인 분석에 대한 연구)

  • Uem, Daeyeob;Kim, Seongyong
    • The Journal of the Korea Contents Association / v.22 no.5 / pp.48-56 / 2022
  • The importance of data analysis in baseball has grown since the success of MLB's Oakland Athletics, which applied Billy Beane's Moneyball theory, and of the 2020 KBO champion NC Dinos. Various data-driven baseball studies have been conducted not only in the United States but also in Korea; in particular, models using deep learning and machine learning have been proposed. However, previous studies using deep learning and machine learning focus only on predicting the win or loss of a game, and their results are difficult to interpret in terms of which factors have an important influence on the game. In this paper, to investigate which factors are important at each position, a prediction model is developed for the Golden Glove award, which is given to the best player at each position. The model is built with XGBoost, a boosting method that also provides feature importance, which can be used to interpret the factors behind the predictions. From the analysis, the important factors by position are identified.

Explanation of Runs Lost Using Combined Fielding Indices in Korean Professional Baseball (결합된 수비지표들을 이용한 한국 프로야구의 실점 설명)

  • Kim, Hyuk Joo;Kim, Yea Hyoung
    • The Korean Journal of Applied Statistics / v.28 no.5 / pp.1003-1011 / 2015
  • We studied indices to explain runs lost for Korean professional baseball teams. Kim and Kim (2014) studied batting indices to explain the run productivity of teams; subsequently, we studied fielding indices to explain runs lost. We considered several combined indices made by combining fielding indices closely connected with the runs lost of teams. Data analysis of all games in the regular seasons of 1982-2014 shows that weighted WPH (defined as a weighted average of WHIP and the number of home runs allowed per game) best explains runs lost. The optimal weighted WPH, consisting of WHIP (with weight 81%) and the number of home runs allowed per game (with weight 19%), has a correlation coefficient of 0.95033 with average runs lost per game. Analysis by chronological period gave results that were not much different.
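The index defined in this abstract is a plain weighted average, which can be sketched as below. The function name and the team values are hypothetical; only the 81% / 19% weights come from the abstract.

```python
def weighted_wph(whip, hr_allowed_per_game, w_whip=0.81):
    """Weighted WPH: weighted average of WHIP and home runs allowed per game.

    The default weight of 0.81 on WHIP (0.19 on home runs) is the optimum
    reported in the abstract; the inputs used below are made-up values.
    """
    return w_whip * whip + (1.0 - w_whip) * hr_allowed_per_game

# Hypothetical team season: WHIP of 1.40 and 0.90 home runs allowed per game.
example_index = weighted_wph(1.40, 0.90)
```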

The Effect of Discomfort Index on Outfielder's Game Record Data (불쾌지수가 외야수의 경기 기록 데이터에 미치는 영향)

  • Kim, Semin;Shin, Chwa-Cheol
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.8 / pp.978-984 / 2020
  • In this study, the correlation between sports records and weather data was analyzed using big data analysis methods. To this end, data were collected through an API and crawling, then processed, statistically analyzed, and visualized. The subjects of this study were outfielders who reached the required number of plate appearances in the 2019 KBO League. Meteorological data were analyzed by dividing the discomfort index into values above 70 and below 70. The results show that, for the various hitting indicators, which are records in which the pitcher is involved, the higher the discomfort index, the better the outfielders' records; in addition, pitcher-related records such as walks, pitches, pitching success rates, pitches per plate appearance, and pitches per game indicate that the outfielders made things difficult for pitchers. This study is expected to help the development of the sports data industry and the performance of baseball players, baseball teams, and coaching staff.

Using Data Mining Techniques to Predict Win-Loss in Korean Professional Baseball Games (데이터마이닝을 활용한 한국프로야구 승패예측모형 수립에 관한 연구)

  • Oh, Younhak;Kim, Han;Yun, Jaesub;Lee, Jong-Seok
    • Journal of Korean Institute of Industrial Engineers / v.40 no.1 / pp.8-17 / 2014
  • In this research, we employed various data mining techniques to build predictive models for win-loss prediction in Korean professional baseball games. The historical data containing information about players and teams were obtained from the official materials provided on the KBO website. From the collected raw data, we prepared two additional types of dataset, in ratio and binary format respectively. Dividing the away team's records by the records of the corresponding home team generated the ratio dataset, while the binary dataset was obtained by comparing the record values. We applied seven classification techniques to the three (raw, ratio, and binary) datasets. The data mining techniques employed are decision tree, random forest, logistic regression, neural network, support vector machine, linear discriminant analysis, and quadratic discriminant analysis. Among the 21 (= 3 datasets $\times$ 7 techniques) prediction scenarios, the most accurate model was obtained from the random forest technique on the binary dataset, with a prediction accuracy of 84.14%. It was also observed that using the ratio and binary datasets helped to build better prediction models than using the raw data. From the variable selection capability of decision tree, random forest, and stepwise logistic regression, we found that annual salary, earned runs, strikeouts, pitcher's winning percentage, and walks are important winning factors in a game. This research is distinct from existing studies in that we used three different types of data and various data mining techniques for win-loss prediction in Korean professional baseball games.
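The two derived datasets described in this abstract can be sketched as simple per-game transformations of the raw records. The statistic names and values are hypothetical, and the coding of the binary feature (1 when the away team's value is larger) is an assumption, since the abstract only says the values were compared.

```python
def ratio_features(away, home):
    """Ratio format: away-team record divided by the home-team record.

    Assumes every home-team value is non-zero.
    """
    return {stat: away[stat] / home[stat] for stat in away}

def binary_features(away, home):
    """Binary format: 1 if the away-team value exceeds the home-team value, else 0.

    The direction of the comparison is an assumption made for this sketch.
    """
    return {stat: 1 if away[stat] > home[stat] else 0 for stat in away}

# Hypothetical raw records for one game.
away_team = {"earned_run_average": 3.0, "strikeouts_per_game": 8.0}
home_team = {"earned_run_average": 4.0, "strikeouts_per_game": 6.0}

ratio_row = ratio_features(away_team, home_team)
binary_row = binary_features(away_team, home_team)
```

Both transformations express each game as a comparison between the two teams rather than as two separate sets of raw values, which is consistent with the abstract's observation that the ratio and binary datasets gave better models than the raw data.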

The Analysis on Sport Emotion Type by Sport Game Characteristics: with Social Big-Data (스포츠 경기의 특성에 따른 스포츠 감정 유형 분석 : 소셜 빅데이터를 중심으로)

  • Kim, Young-Mee;Yang, Jae-Sik
    • Journal of Digital Convergence / v.19 no.7 / pp.371-377 / 2021
  • This study analyzes types of sport emotion according to sport game characteristics. To this end, 7 soccer games and 6 baseball games played by the Korean teams at the 2018 Asian Games were selected, and articles about those games and the replies to them on social network services were collected as study material. Python was used for the collection, and an expert group meeting was held for the emotion analysis. The analysis of sport emotion types by game characteristics (win or loss, the level of the opponent, and the performance of the Korean team) led to the following conclusions. First, it was hard to say that win or loss or the opponent's level produces a particular sport emotion type. Second, performance could produce contented, enthusiastic, and joyful emotions when judged good, but frustrated, angry, and humiliated emotions when judged bad. Third, the sociocultural background or particular events of the games could also affect the sport emotion types. Follow-up studies with other game characteristics and more game cases are needed to establish clearer causal relationships.

Fielding indices for explaining runs lost combining adjusted WHIP and the number of home runs allowed in Korean professional baseball (한국 프로야구에서 수정된 WHIP와 피홈런 수를 결합한 실점 설명 수비지표들)

  • Kim, Hyuk Joo
    • The Korean Journal of Applied Statistics / v.29 no.7 / pp.1283-1294 / 2016
  • We studied fielding indices to explain runs lost for Korean professional baseball teams, motivated by OPS and weighted OPS, which combine on-base percentage and slugging average and adequately explain the run productivity of teams. We considered several combined indices made by combining fielding indices highly correlated with the runs lost of teams. Data analysis of all games in the regular seasons of 1982-2015 shows that weighted adjusted WPH 2 (defined as a weighted average of adjusted WHIP and the number of home runs allowed per inning) best explains runs lost. The optimal weighted adjusted WPH 2, consisting of adjusted WHIP (with weight 34%) and the number of home runs allowed per inning (with weight 66%), has a correlation coefficient of 0.95362 with average runs lost per game. This result improves on that of the index obtained in Kim and Kim (2015a). Analysis by chronological period provides results that are not much different. We also compiled a list of the top 10 pitchers for each of the most recent three years, based on the obtained index.
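One way to look for an optimal weight of this kind is a grid search that maximizes the Pearson correlation between the combined index and runs lost. The sketch below is an assumption about the general approach, not the paper's procedure, and every number in it is synthetic: the target is constructed from a known weight of 0.34 purely so that the search can be checked.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def best_weight(a, b, target, steps=100):
    """Grid-search w in [0, 1] maximizing corr(w*a + (1-w)*b, target)."""
    best_w, best_r = 0.0, -2.0
    for k in range(steps + 1):
        w = k / steps
        index = [w * ai + (1.0 - w) * bi for ai, bi in zip(a, b)]
        r = pearson(index, target)
        if r > best_r:
            best_w, best_r = w, r
    return best_w, best_r

# Synthetic inputs (not real KBO data): adjusted WHIP and home runs per inning.
adjusted_whip = [1.30, 1.45, 1.25, 1.50, 1.38, 1.42]
hr_per_inning = [0.09, 0.08, 0.12, 0.11, 0.07, 0.13]
# Synthetic target built with a known weight so the search has a known answer.
runs_lost = [0.34 * a + 0.66 * b for a, b in zip(adjusted_whip, hr_per_inning)]

found_weight, found_r = best_weight(adjusted_whip, hr_per_inning, runs_lost)
```

With real data the target would be observed average runs lost per game, and the maximum correlation would fall below 1.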

Sport and Culture: Application of Traditional and Contemporary Content

  • CHANG, Deok Seon;KIM, Hae Yu;LEE, Hyuk Jin
    • Journal of Sport and Applied Science / v.5 no.2 / pp.1-7 / 2021
  • Purpose: This study began with an interest in sports culture-related content and aims to understand the application of traditional and contemporary cultural content to the sport business. Research design, data, and methodology: The study reviews related documents, research papers, media reports, and secondary data. The collected data were reviewed multiple times through content analysis. Results: The findings are as follows. First, the study found that sport originated in religious rituals associated with human needs for survival and prosperity. Second, sport is a kind of formal framework in which inherent human desires can be satisfied, representing play and games. Third, the study found that sport can become cultural products such as literature and film, because sport has often been used as a major theme in contemporary art production. Finally, this study included important cultural content categories but could not cover all categories because of its limitations. Conclusions: This study reviewed a range of literature to decode the historical and anthropological meanings of sport. The findings present the cultural traits and meaning of contemporary sport. Further implications are discussed.

Professional Baseball Viewing Culture Survey According to Corona 19 using Social Network Big Data (소셜네트워크 빅데이터를 활용한 코로나 19에 따른 프로야구 관람문화조사)

  • Kim, Gi-Tak
    • Journal of Korea Entertainment Industry Association / v.14 no.6 / pp.139-150 / 2020
  • The data processing in this study focuses on Textom and social media words about three areas: 'Corona 19 and professional baseball', 'Corona 19 and professional baseball', and 'Corona 19 and professional sports'. The data were collected and refined in a web environment and then batch processed, and the Ucinet6 program was used to visualize them. Specifically, the data were collected through Naver, Daum, and Google channels, and the extracted words were narrowed down to 30 words through expert meetings for use in the final study. The 30 extracted words were visualized through a matrix, and a CONCOR analysis was performed to identify clusters of similarity and commonality among the words. The analysis showed that the clusters related to Corona 19 and professional baseball comprised one central cluster and five peripheral clusters, and that content related to the opening of professional baseball following the Corona 19 wave was mainly searched. The clusters related to Corona 19 and unrelated to professional baseball comprised one central cluster and five peripheral clusters, and keywords about the position of professional baseball regarding professional baseball games under Corona 19 were mainly searched. The clusters related to Corona 19 and professional sports comprised one central cluster and five peripheral clusters, and keywords related to the start of professional sports in the aftermath of Corona 19 were mainly searched.