• Title/Summary/Keyword: baseball game data

Econo Series / DB: The Importance of Use Through Processing (이코노연재 / DB, 가공 통한 활용 중요)

  • Park, Seong-Su
    • Digital Contents
    • /
    • no.5 s.96
    • /
    • pp.36-41
    • /
    • 2001
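  • If a data warehouse is a storehouse of data, data mining is the actual analysis performed on the data held in that storehouse. The data sitting in the warehouse cannot add value to a company by itself; a company profits only when analysis produces useful information that it then acts on. For example, however much baseball knowledge a manager has, that knowledge is meaningless unless it yields tactics that can be applied in a real game. In the same way, however good and plentiful the data, it is of no use if the analysis stage is flawed.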

Simple Algorithm for Baseball Elimination Problem (야구 배제 문제의 단순 알고리즘)

  • Lee, Sang-Un
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.20 no.3
    • /
    • pp.147-152
    • /
    • 2020
  • The baseball elimination problem (BEP) identifies teams that can be eliminated early in the season, before playing their remaining games, because they can never finish with the most wins even if they win all of their remaining games. The problem is usually solved with the max-flow/min-cut theorem, but that method has the drawback of repeatedly constructing a network for every team and determining the min-cut of each network. This paper proposes sorting the teams in ascending order of wins plus remaining games, taking the lower half of the ranking as the candidate elimination set K, and then simply and quickly checking whether a subset R exists that decides a team's elimination. In experiments on various data sets, the algorithm found all eliminated teams quickly and correctly.
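As a rough illustration of the "subset R" idea, the sketch below brute-forces the standard averaging certificate: a team x is eliminated when some set R of other teams must, on average, finish with more wins than x can still reach. This is not the paper's algorithm (it does not use the sorted candidate set K, and it is exponential in the number of teams). The standings, the function name `is_eliminated`, and the `against` layout are illustrative assumptions.

```python
from itertools import combinations

# Hypothetical standings used only for illustration.
wins = {"Atlanta": 83, "Philadelphia": 80, "New York": 78, "Montreal": 77}
remaining = {"Atlanta": 8, "Philadelphia": 3, "New York": 6, "Montreal": 3}
# Games still to be played between each pair of teams (missing pair = 0).
against = {
    frozenset(("Atlanta", "Philadelphia")): 1,
    frozenset(("Atlanta", "New York")): 6,
    frozenset(("Atlanta", "Montreal")): 1,
    frozenset(("Philadelphia", "Montreal")): 2,
}


def is_eliminated(x, wins, remaining, against):
    """Return True if team x cannot finish with the most wins (ties count as not eliminated)."""
    best = wins[x] + remaining[x]  # most wins x can still reach
    others = [t for t in wins if t != x]
    for size in range(1, len(others) + 1):
        for subset in combinations(others, size):
            current = sum(wins[t] for t in subset)
            # Games played inside the subset always hand a win to a subset member.
            internal = sum(against.get(frozenset(pair), 0)
                           for pair in combinations(subset, 2))
            # Average final wins in the subset exceed what x can reach.
            if current + internal > best * size:
                return True
    return False


eliminated = {t for t in wins if is_eliminated(t, wins, remaining, against)}
```

With these sample standings, `eliminated` contains Philadelphia and Montreal: Montreal trails Atlanta outright, while Philadelphia is ruled out by the subset {Atlanta, New York}.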

Evaluating the quality of baseball pitch using PITCHf/x (PITCHf/x를 이용한 투구의 질 평가)

  • Park, Sungmin;Jang, Woncheol
    • The Korean Journal of Applied Statistics
    • /
    • v.33 no.2
    • /
    • pp.171-184
    • /
    • 2020
  • Major League Baseball (MLB) records and releases trajectory data for every pitch, called PITCHf/x, using three high-speed cameras installed in every stadium. In a previous study, the quality of a pitch was assessed as the expected number of bases yielded, using PITCHf/x data. However, the number of bases yielded does not always lead to baseball scores, or runs. In this paper, we assess the quality of a pitch by combining the baseball analytics metrics Run Expectancy and Run Value using a Random Forests model. We compare the quality of pitches evaluated with Run Value to the quality of pitches evaluated with the expected number of bases yielded.
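For readers unfamiliar with the two metrics, the following minimal sketch shows the usual way Run Value is derived from a Run Expectancy table: the change in expected runs between the states before and after an event, plus any runs scored on the play. The table values are placeholders rather than figures from the paper, and the paper's exact formulation may differ.

```python
# Hypothetical run-expectancy table: (outs, runners) -> expected runs until the
# end of the inning. Placeholder values for illustration only.
RUN_EXPECTANCY = {
    (0, "___"): 0.50,
    (1, "___"): 0.27,
    (0, "1__"): 0.88,
    (1, "1__"): 0.52,
}


def run_value(before, after, runs_scored, table=RUN_EXPECTANCY):
    """Run value of an event: RE(after) - RE(before) + runs scored on the play."""
    return table[after] - table[before] + runs_scored


# Leadoff single: bases empty, no outs -> runner on first, no outs.
single_value = run_value((0, "___"), (0, "1__"), 0)
# Leadoff out: bases empty, no outs -> bases empty, one out.
out_value = run_value((0, "___"), (1, "___"), 0)
```

Under this definition, an event that helps the batting team has a positive value and an out has a negative one, so a pitch can be scored by the run value of the outcome it produces.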

A Study on the Analysis of Factors for the Golden Glove Award by using Machine Learning (머신러닝을 이용한 골든글러브 수상 요인 분석에 대한 연구)

  • Uem, Daeyeob;Kim, Seongyong
    • The Journal of the Korea Contents Association
    • /
    • v.22 no.5
    • /
    • pp.48-56
    • /
    • 2022
  • The importance of data analysis in baseball has grown since the success of MLB's Oakland, which applied Billy Beane's Moneyball theory, and of the 2020 KBO winner NC Dinos. Various data-driven studies of baseball have been conducted not only in the United States but also in Korea, and models using deep learning and machine learning in particular have been proposed. However, previous studies using deep learning and machine learning focus only on predicting the win or loss of a game, and it is difficult to interpret from their results which factors have an important influence on the game. In this paper, to investigate which factors are important at each position, we develop a prediction model for the Golden Glove award, which is given to the best player at each position. The model is built with XGBoost, a boosting method that also provides feature importance, which can be used to interpret the factors behind the predictions. From the analysis, the important factors for each position are identified.

The estimation of winning rate in Korean professional baseball league (한국 프로야구의 승률 추정)

  • Kim, Soon-Kwi;Lee, Young-Hoon
    • Journal of the Korean Data and Information Science Society
    • /
    • v.27 no.3
    • /
    • pp.653-661
    • /
    • 2016
  • In this paper, we provide a suitable optimal exponent for the generalized Pythagorean theorem and propose using the logistic model and the probit model to estimate the winning rate in the Korean professional baseball league. The efficiencies of the proposed models are compared with that of the Pythagorean theorem under the root-mean-square-error (RMSE) criterion. We use the teams' historical win-loss records in the Korean professional baseball league from 1982 to the first half of 2015, and the proposed methods slightly outperform the generalized Pythagorean method under the RMSE criterion.
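The generalized Pythagorean formula estimates a winning rate as RS^e / (RS^e + RA^e), where RS and RA are runs scored and allowed and e is the exponent to be tuned. The sketch below picks the exponent with the lowest RMSE from a list of candidates. The grid search, the function names, and the sample seasons are illustrative assumptions, not the paper's estimation procedure or data.

```python
from math import sqrt


def pythagorean_win_rate(runs_scored, runs_allowed, exponent=2.0):
    """Generalized Pythagorean estimate of a team's winning rate."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def rmse(actual, predicted):
    """Root-mean-square error between two equally long sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def best_exponent(seasons, candidates):
    """Pick the candidate exponent with the lowest RMSE.

    seasons: list of (runs_scored, runs_allowed, actual_win_rate) tuples.
    """
    actual = [win_rate for _, _, win_rate in seasons]

    def error(exponent):
        predicted = [pythagorean_win_rate(rs, ra, exponent) for rs, ra, _ in seasons]
        return rmse(actual, predicted)

    return min(candidates, key=error)


# Hypothetical team seasons: (runs scored, runs allowed, actual winning rate).
sample_seasons = [(650, 580, 0.560), (600, 640, 0.470), (700, 690, 0.505)]
chosen = best_exponent(sample_seasons, [1.6, 1.7, 1.8, 1.9, 2.0])
```

The same `rmse` helper could score any alternative estimator, such as a logistic or probit fit, against the Pythagorean baseline.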

A Study on the Timing of Starting Pitcher Replacement Using Machine Learning (머신러닝을 활용한 선발 투수 교체시기에 관한 연구)

  • Noh, Seongjin;Noh, Mijin;Han, Mumoungcho;Um, Sunhyun;Kim, Yangsok
    • Smart Media Journal
    • /
    • v.11 no.2
    • /
    • pp.9-17
    • /
    • 2022
  • The purpose of this study is to implement a predictive model that supports the decision to replace a starting pitcher before a crisis situation in a baseball game, using the Major League Statcast data provided by Baseball Savant. First, the crisis situations that a starting pitcher faces in a game were derived through data exploration. Second, when the starting pitcher was replaced before the end of an inning, the model was trained with a label marking a replacement in the previous inning. In a comparison of the trained models, the model based on the ensemble method showed the highest predictive performance, with an F1-Score of 65%. The practical significance of this study is that the proposed model can help increase a team's winning probability by replacing the starting pitcher before a crisis situation, and that the coach can receive data-based strategic decision-making support during the game.

Effects of on-base and slugging ability on run productivity in Korean professional baseball (한국 프로야구에서 출루 능력과 장타력이 득점 생산성에 미치는 영향)

  • Kim, Hyuk Joo
    • Journal of the Korean Data and Information Science Society
    • /
    • v.23 no.6
    • /
    • pp.1065-1074
    • /
    • 2012
  • The purpose of this paper is to statistically analyze the effects of on-base and slugging ability on run productivity in Korean professional baseball. In Section 2, we investigate the OPS (On-base percentage Plus Slugging average) and introduce new indices of batting ability by modifying the OPS. In Section 3, we examine how the batting average, on-base percentage, slugging average, IsoP (Isolated Power), OPS and the indices introduced in Section 2 correlate with the average runs per game, using data from all regular-season games in 2007-2011. In addition, by generalizing the OPS and the indices introduced in Section 2, we analyze the correlation between the indices with various weights and the average runs per game. As a result, the weighted OPS consisting of on-base percentage (with weight 57%) and slugging average (with weight 43%) is found to give the best explanation of run productivity.
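A weighted OPS of this kind can be read as a linear combination of on-base percentage and slugging average, with the weight chosen to maximize correlation with runs per game. The sketch below assumes that reading; the function names and the search over a list of candidate weights are illustrative, and only the 57%/43% split comes from the abstract.

```python
from math import sqrt


def weighted_ops(obp, slg, obp_weight=0.57):
    """Weighted OPS: obp_weight * OBP + (1 - obp_weight) * SLG."""
    return obp_weight * obp + (1.0 - obp_weight) * slg


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def best_obp_weight(team_stats, weights):
    """Return the OBP weight whose index correlates best with runs per game.

    team_stats: list of (obp, slg, runs_per_game) tuples.
    """
    runs = [r for _, _, r in team_stats]

    def correlation(weight):
        index = [weighted_ops(obp, slg, weight) for obp, slg, _ in team_stats]
        return pearson(index, runs)

    return max(weights, key=correlation)
```

For example, `weighted_ops(0.350, 0.450)` combines a .350 on-base percentage and a .450 slugging average using the 57%/43% weights reported above.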

Comprehensive evaluation of baseball player's offensive ability by use of simulation (시뮬레이션을 통한 프로야구 타자들의 공격능력의 종합적인 평가)

  • Kim, Nam Ki;Kim, Sun Ho
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.4
    • /
    • pp.865-874
    • /
    • 2015
  • This research comprehensively evaluates the offensive abilities of baseball players, who are expected to produce as many runs as possible through their hitting and running. To this end, we build a simulation program to obtain the so-called scoring index of an individual player. The scoring index of a player is defined as the expected number of runs scored by an imaginary team composed of nine copies of the player. As simulation input, we use 2014 season data from Korean pro-baseball. As a result, we present the scoring indices of the top 10 players, the 9 Korean pro-baseball teams, and the 2014 season overall. The scoring index can serve as a comprehensive evaluation of the offensive ability of a player or a team, and can support the selection of players for a (national) team or a starting line-up, the estimation of a player's worth, and so on.
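The scoring-index definition lends itself to a small Monte Carlo sketch: simulate many games in which every plate appearance is taken by the same batter, and average the runs. The version below is a deliberately simplified assumption, not the paper's simulator: runners advance exactly as many bases as the hit is worth, walks only force runners, and there are no steals, errors, or double plays. All names and probabilities are hypothetical.

```python
import random

HIT_BASES = {"single": 1, "double": 2, "triple": 3, "home_run": 4}


def simulate_game(probs, rng, innings=9):
    """Runs scored in one game by a lineup of nine copies of one batter.

    probs maps each outcome ('out', 'walk', 'single', 'double', 'triple',
    'home_run') to its probability per plate appearance.
    """
    outcomes = list(probs)
    weights = [probs[o] for o in outcomes]
    runs = 0
    for _ in range(innings):
        outs = 0
        bases = [False, False, False]  # first, second, third
        while outs < 3:
            event = rng.choices(outcomes, weights)[0]
            if event == "out":
                outs += 1
            elif event == "walk":
                # Only forced runners advance.
                if all(bases):
                    runs += 1
                elif bases[0] and bases[1]:
                    bases[2] = True
                elif bases[0]:
                    bases[1] = True
                bases[0] = True
            else:
                advance = HIT_BASES[event]
                # Runner positions 1..3, plus the batter starting at 0.
                positions = [i + 1 for i, on in enumerate(bases) if on] + [0]
                bases = [False, False, False]
                for pos in positions:
                    if pos + advance >= 4:
                        runs += 1
                    else:
                        bases[pos + advance - 1] = True
    return runs


def scoring_index(probs, games=2000, seed=0):
    """Average runs per simulated game for nine copies of the batter."""
    rng = random.Random(seed)
    return sum(simulate_game(probs, rng) for _ in range(games)) / games
```

A call such as `scoring_index({"out": 0.68, "walk": 0.08, "single": 0.16, "double": 0.05, "triple": 0.01, "home_run": 0.02})` returns the estimated runs per game for that hypothetical batting line; a fixed seed keeps the estimate reproducible.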

A Study on How to Nurture New Players using Data Analysis (데이터 분석을 활용한 신인급 선수 육성 방안 연구)

  • You, Kangsoo
    • Journal of Industrial Convergence
    • /
    • v.19 no.4
    • /
    • pp.17-21
    • /
    • 2021
  • Recently, in the field of sports, the use of data in conducting games, planning seasons, and operating teams has increased significantly. To develop better players, it has also become necessary to use data to analyze their performance accurately. Therefore, in this study, various data about rookie players were collected and pre-processed in order to analyze and visualize their performance. Additionally, an analysis was conducted to determine the minimum number of opportunities that should be given to foster rookie players. Then, a data analysis method was presented for nurturing athletes by using data in the field of sports. It is expected that this study will contribute to fostering rookie players by utilizing data.

Analysis of Pitching Motions by Human Pose Estimation Based on RGB Images (RGB 이미지 기반 인간 동작 추정을 통한 투구 동작 분석)

  • Yeong Ju Woo;Ji-Yong Joo;Young-Kwan Kim;Hie Yong Jeong
    • Smart Media Journal
    • /
    • v.13 no.4
    • /
    • pp.16-22
    • /
    • 2024
  • Pitching is a major part of baseball, so much so that it can be said to be where baseball begins. Accurate analysis of pitching motions is very important for performance improvement and injury prevention. The motion capture method currently used to analyze pitching motions has several critical environmental drawbacks. In this paper, we propose analyzing pitching motion with an RGB-based Human Pose Estimation (HPE) model as a replacement for motion capture, and we use motion capture data and HPE data to verify its reliability. The similarity of the two data sets was verified by comparing joint coordinates using the Dynamic Time Warping (DTW) algorithm.
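Dynamic Time Warping aligns two sequences that may differ in speed or length and returns the cost of the best alignment, which makes it a natural fit for comparing two recordings of the same motion. The sketch below is the textbook dynamic-programming form for one-dimensional sequences with absolute difference as the local cost. Treating a single joint coordinate as a 1-D series is an assumption made here for brevity; the paper's exact distance and preprocessing are not specified in the abstract.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two 1-D sequences (absolute-difference local cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of a[:i] and b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

A lower cost means the two trajectories are more alike; for instance, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is 0 because the second series is the first with one sample held for an extra frame.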