Post-Match Outcome Analysis in the FIFA 2023 Women's World Cup: Insights for Sports Management and Performance Logistics

  • Songyi SONG (Sports Information and Science Laboratory, Dankook University)
  • Hyongjun CHOI (Department of Physical Education, Dankook University)
  • Received : 2025.06.12
  • Accepted : 2025.07.05
  • Published : 2025.07.30

Abstract

Purpose: This study aims to model and predict team performance outcomes during the 2023 FIFA Women's World Cup by integrating a supply chain perspective into sports analytics. The research addresses a gap in the literature by focusing on the distribution and flow of player and team performance data, which is critical for optimizing decision-making in modern football management, media reporting, and logistics. Research design, data and methodology: Data were collected from 60 matches involving 16 national teams, incorporating detailed player-level performance metrics. Five feature engineering strategies and seven machine learning algorithms were compared through rigorous nested cross-validation techniques. This analytical pipeline represents a structured supply chain of data processing and model evaluation. Results: Among the tested algorithms, logistic regression consistently outperformed others, achieving 99% accuracy under nested cross-validation and 95.8% accuracy on an independent test set. This indicates robust generalizability and practical reliability in predicting match outcomes. Conclusions: This study contributes a comprehensive machine learning framework tailored for women's football, emphasizing the importance of data distribution efficiency. The results offer practical implications for enhancing performance forecasting, supporting coaching strategies, streamlining media reporting, and improving event supply chain operations in international sports tournaments.

Keywords

1. Introduction

Predicting match outcomes in international football tournaments has garnered considerable attention due to potential implications for sports analytics, strategic planning, and media engagement. Despite extensive research predominantly focused on men's football, women's football analytics remains underrepresented, characterized by limited methodological rigor, inadequate predictive modeling, and scarce comparative feature evaluations. Given these gaps, there is a clear need for comprehensive, data-driven predictive frameworks tailored explicitly for women's international football.

The FIFA Women's World Cup provides a unique and complex competitive environment, challenging traditional predictive methodologies due to its hierarchical structure, diverse competitive contexts, and limited match data. This study addresses these challenges by systematically evaluating various feature engineering approaches and machine learning algorithms using detailed, granular player-level performance metrics collected from the FIFA 2023 Women's World Cup. Our primary objectives are to identify effective predictive strategies, understand the distinctive predictive dynamics within women's football, and contribute to methodological robustness in sports analytics.

This paper is structured as follows: Section 2 reviews relevant literature, highlighting historical progressions, methodological advancements, and key research gaps. Section 3 outlines the dataset's detailed characteristics, emphasizing data collection and preprocessing. Section 4 describes our analytical methodologies, including feature engineering strategies, machine learning algorithms, and rigorous validation frameworks. Section 5 presents empirical findings, systematically comparing predictive performances across models and feature sets. Section 6 discusses broader implications, methodological contributions, limitations, and future research directions derived from our findings.

2. Literature Review

2.1. Historical Overview of Football Outcome Prediction

Football match outcome prediction has evolved substantially, from basic statistical approaches in the 1990s to sophisticated machine learning (ML) methods in recent years. Early works, such as those by Dixon and Coles (1997) and Rue and Salvesen (2000), employed statistical models like Poisson and Bayesian dynamic generalized linear models, typically achieving accuracy rates between 50% and 60%. By the early 2000s, these statistical approaches were complemented by machine learning techniques, significantly improving predictive accuracy and robustness. Tax and Joustra (2015) and Baboota and Kaur (2019) notably advanced the field by integrating comprehensive ML algorithms and feature engineering, reaching accuracy rates of up to 75%. Despite these advancements, traditional methods were constrained by methodological simplicity, data availability, and validation challenges.

2.2. Machine Learning Approaches in Football Analytics

Machine learning algorithms have increasingly dominated football analytics, with logistic regression, SVM, random forests, XGBoost, and neural networks being prominent methods. Logistic regression, despite its simplicity, consistently performs robustly and offers interpretability. Ensemble methods, particularly random forests and gradient boosting, provide high predictive power by handling complex feature interactions, though at the cost of interpretability. Neural networks, while theoretically powerful due to their ability to model nonlinear relationships, have practical limitations such as data requirements and interpretability challenges. Recent comparative studies (Baboota & Kaur, 2019; Hubáček et al., 2019) highlight that no single algorithm universally outperforms others across all contexts, emphasizing the importance of tailored model selection and rigorous validation methods such as nested cross-validation.

2.3. Feature Engineering and Data Utilization

Effective feature engineering critically impacts predictive model performance in sports analytics. Initial approaches focused predominantly on basic performance statistics like goals, possession percentages, and shots. However, recent advancements advocate for incorporating advanced metrics such as expected goals (xG), detailed passing networks, and tactical metrics like pressing actions. Studies by Carpita et al. (2015) and Sarmento et al. (2018) have underscored the importance of distinguishing between absolute versus relative team performance metrics, as well as leveraging player-level rather than solely team-level data. Our study significantly contributes to this area by systematically comparing multiple feature engineering strategies to identify the most robust predictors for women's international tournament outcomes (Kang & Kim, 2023).

2.4. Tournament vs. League-Based Prediction

Predictive modeling distinctly varies between league and tournament contexts. League-based predictions benefit from extensive historical data, facilitating robust model training and validation. Conversely, tournament predictions, such as for the FIFA World Cup, present unique challenges due to limited matches, diverse competitive levels, and tournament structure dynamics (group stages vs. knockout stages). Recent research indicates that tournament-based predictions benefit substantially from incorporating tournament-specific variables and rigorous simulation frameworks (Groll et al., 2018; Schauberger & Groll, 2018). Our study addresses these challenges explicitly by leveraging tailored feature sets and validation methods specifically suited to tournament contexts.

2.5. Women's Football Analytics: Current State

Women’s football analytics remains significantly underrepresented compared to men's football research. Existing studies, like Groll et al. (2019), have highlighted distinctive aspects of women's football, including unique scoring patterns, disparities between team capabilities, and variations in tactical approaches. Despite these insights, predictive modeling specifically tailored for women's football remains sparse. Our research directly addresses this gap by providing comprehensive and specialized analytical methods designed explicitly for the women's international tournament context, thereby substantially advancing the field (Kim & Kang, 2022).

2.6. Research Gaps and Opportunities

The existing literature reveals critical gaps including the scarcity of predictive research tailored to women's football, limited comparative analyses of feature engineering strategies, inadequate methodological rigor in validation procedures, and insufficient attention to the unique dynamics of tournament play compared to league formats. Our study addresses these deficiencies through meticulous methodological rigor, comprehensive comparative analysis of multiple ML algorithms, and robust nested cross-validation techniques. By clearly delineating these gaps, we underscore the importance of future research directions, including real-time predictive modeling, qualitative feature integration, longitudinal studies, and broader cross-tournament validations to enhance the generalizability and practical applicability of football analytics.

3. Data Collection

The dataset used in this study comprises detailed player-level performance metrics obtained from the FIFA 2023 Women's World Cup, covering all matches played throughout the tournament. This dataset represents an invaluable resource for analyzing team performance and prediction modeling at the highest competitive level of women's international football.

3.1. Dataset Overview

Our dataset includes a total of 2,400 individual player-match records derived from 60 matches involving 16 national teams and 320 distinct players. Each data point meticulously captures performance statistics for an individual player within a single match, thus offering highly granular insights into both individual contributions and collective team dynamics. Specifically, the dataset consists of evenly balanced records with a binary target variable representing match outcomes—win or loss—from each team's perspective, ensuring a perfect balance with 1,200 wins and 1,200 losses. This balance naturally results from the tournament structure, which inherently pairs one winning and one losing team per match, with each team typically fielding approximately 20 players per game.

3.2. Variable Categories and Descriptions

To provide comprehensive analytical depth, the dataset features 31 distinct variables grouped into five main categories: identifiers, team rankings, match outcomes, performance metrics, and positional indicators. Identifiers include unique match and player IDs, facilitating straightforward linkage and analysis. Team rankings, captured as FIFA rankings dated June 2023, reflect pre-tournament standing and are critical for predictive modeling without introducing data leakage from post-event outcomes.

Performance metrics constitute the most extensive set, meticulously detailing multiple dimensions of play: passing performance (e.g., passes attempted and completion rate), attacking performance (goals, attempts at goal, cross completion), defensive capabilities (tackles made and won, interceptions, blocks), advanced tactical metrics (line breaks attempted and completed, pressing actions, possession contests won), and physical performance indicators (total distance covered and top speed achieved). Finally, positional indicators explicitly categorize players into defenders, midfielders, and forwards, enabling nuanced positional analysis. This structured categorization is summarized comprehensively in Table 1.

Table 1: Feature Engineering Performance Comparison

[Image: OTGHB7_2025_v23n7_117_6_t0001.png]

3.3. Data Quality and Preprocessing

3.3.1. Data Completeness

The dataset demonstrates exceptional quality with negligible missing values, a notable strength ensuring robust analysis. Each recorded player-match combination includes fully complete performance metrics, allowing reliable and detailed modeling without the imputation challenges often faced in sports analytics.

3.3.2. Temporal Considerations

A crucial aspect of the dataset is the careful temporal structuring that maintains analytical integrity. The FIFA ranking data utilized for modeling is strictly from June 2023, pre-dating the tournament commencement. This ensures no inadvertent inclusion of post-tournament information, effectively eliminating potential data leakage. Such temporal clarity is pivotal for predictive modeling accuracy, closely mirroring realistic forecasting scenarios where future outcomes are unknown.

3.3.3. Target Variable Distribution

As mentioned, the binary target variable demonstrates perfect balance, inherently achieved by tournament design. This balanced distribution mitigates common machine learning challenges such as class imbalance, simplifying model training and evaluation and ensuring reliable performance metrics (Kang & Kim, 2023).

3.4. Tournament Structure Impact

The structural characteristics of the FIFA Women's World Cup significantly influence the dataset's analytical dynamics. The variability in match frequency per team, ranging from a minimum of 4 to a maximum of 13 games depending on progression within the tournament, introduces hierarchical complexities into the dataset. The knockout tournament format also inherently establishes natural performance hierarchies, contributing distinct competitive contexts for analytical exploration. Additionally, substantial variations in team strength—indicated by FIFA rankings ranging from 3rd to 43rd globally—create diversified competitive scenarios crucial for nuanced predictive modeling. Despite the comprehensive player-level records totaling 2,400 observations, analyses at the team level aggregate down to 120 team-match combinations, and game-level modeling further consolidates data to 60 unique match scenarios.
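
The three levels of aggregation described above follow from simple arithmetic. The short check below reproduces the stated counts; the figure of 20 players per team per match is the approximate value given in Section 3.1, and the variable names are illustrative only.

```python
# Counts taken from the text; 20 players per team-match is the approximate
# figure implied by 2,400 player records / 120 team-match rows.
n_matches = 60
teams_per_match = 2
players_per_team_match = 20

# Team-level view: one row per team per match.
team_match_rows = n_matches * teams_per_match

# Player-level view: one row per player per match.
player_match_rows = team_match_rows * players_per_team_match

# One winner and one loser per match gives an even split of labels.
win_rows = player_match_rows // 2
loss_rows = player_match_rows - win_rows

print(n_matches, team_match_rows, player_match_rows, win_rows, loss_rows)
```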

3.5. Methodological Implications

The dataset’s structure presents several critical considerations for machine learning approaches. Its comprehensive performance metrics spanning numerous game dimensions and balanced target distribution are evident strengths, significantly reducing complexities typically encountered with sports data, such as class imbalance and incomplete datasets. Moreover, the high-quality, professionally gathered data ensures analytical reliability and robustness.

Nevertheless, the hierarchical nature of the data (players nested within teams and matches), combined with varying team sample sizes due to tournament progression, necessitates careful methodological adjustments, such as appropriate aggregation techniques and robust cross-validation strategies. Additionally, the relatively modest final sample size of 60 matches for team-level modeling poses inherent constraints regarding statistical power and generalizability. The subsequent methodological section details specific strategies employed to address these challenges, ensuring that our analytical approaches are both rigorous and contextually appropriate.

4. Methods

4.1. Overall Analytical Framework

To rigorously predict match outcomes in the FIFA 2023 Women's World Cup, we employed a comprehensive analytical framework tailored to the complexity inherent in sports performance prediction and the hierarchical structure of our dataset. This approach systematically incorporates detailed feature engineering, extensive model evaluation across various algorithms, and rigorous validation procedures. Recognizing the limited sample size inherent in tournament data—comprising just 60 matches—we emphasized robust validation techniques to minimize the risk of overfitting and to provide reliable estimates of model performance.

4.2. Data Preprocessing and Hierarchical Aggregation

4.2.1. Target Variable Definition

The predictive modeling targeted binary outcomes of match results from each team's perspective, encoded as win (1) or loss (0). This naturally resulted in a balanced dataset, aligned with the symmetrical structure of match outcomes in the tournament format, simplifying modeling procedures by eliminating concerns related to class imbalance.

4.2.2. Temporal Data Integrity

To prevent data leakage and simulate real-world predictive scenarios, our dataset strictly included pre-tournament FIFA rankings and performance statistics recorded during the matches, explicitly excluding any post-tournament data. This ensured temporal integrity, allowing realistic forecasting conditions analogous to genuine predictive situations.

4.2.3. Hierarchical Data Aggregation

Given the player-level granularity of the original dataset, aggregation was necessary to derive meaningful team-level metrics for predictive modeling. We systematically aggregated player-level statistics to team-level summaries using multiple aggregation methods: sum for cumulative actions such as passes and tackles, mean for efficiency measures such as pass completion percentages, and maximum values for peak performance indicators such as top speed and maximum distance covered. This approach captured diverse dimensions of team performance including cumulative effort, average efficiency, and peak capability.
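
As a minimal sketch of this aggregation step, the snippet below rolls toy player-match rows up to one row per team per match with pandas. The column names and values are hypothetical and do not come from the study's dataset; only the sum, mean, and max pattern mirrors the description above.

```python
import pandas as pd

# Toy player-match records (hypothetical columns and values).
players = pd.DataFrame({
    "match_id": [1, 1, 1, 1],
    "team": ["A", "A", "B", "B"],
    "passes": [40, 35, 22, 30],
    "tackles": [3, 5, 4, 2],
    "pass_completion_pct": [85.0, 75.0, 70.0, 80.0],
    "top_speed_kmh": [28.1, 30.4, 29.0, 27.5],
    "win": [1, 1, 0, 0],
})

# Sum for cumulative actions, mean for efficiency measures,
# max for peak indicators; the label is constant within a team-match.
team_level = (
    players.groupby(["match_id", "team"], as_index=False)
    .agg(
        passes_sum=("passes", "sum"),
        tackles_sum=("tackles", "sum"),
        pass_completion_mean=("pass_completion_pct", "mean"),
        top_speed_max=("top_speed_kmh", "max"),
        win=("win", "first"),
    )
)

print(team_level)
```

Note that a plain mean of per-player completion percentages weights every player equally regardless of how many passes each attempted; whether the study weighted by attempts is not stated.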

4.3. Feature Engineering Strategies

We systematically developed and compared five distinct feature engineering approaches, each grounded in different theoretical perspectives on team performance prediction.

4.3.1. Basic Performance (Absolute Values)

This baseline approach utilized fundamental team performance metrics, such as total passes, goal attempts, tackles, blocks, interceptions, total distance covered, and top speed achieved. These straightforward absolute metrics were hypothesized to directly reflect team quality and effectiveness.

4.3.2. Basic Performance (Team Differences)

A comparative approach was employed by calculating the performance differences between competing teams within each match. Metrics like differences in goal attempts, pass completions, and tackles were expected to highlight competitive advantages directly influencing outcomes.
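
One way to compute such within-match differences is sketched below. It assumes exactly two team rows per match, so the opponent's value equals the match total minus the team's own value. The column names and numbers are hypothetical.

```python
import pandas as pd

# Toy team-match rows (hypothetical values), two teams per match.
team_level = pd.DataFrame({
    "match_id": [1, 1, 2, 2],
    "team": ["A", "B", "C", "D"],
    "goal_attempts": [14, 9, 7, 12],
    "passes_completed": [410, 350, 300, 390],
    "tackles": [18, 21, 15, 15],
})

metrics = ["goal_attempts", "passes_completed", "tackles"]

# With two teams per match, opponent value = match total - own value.
match_totals = team_level.groupby("match_id")[metrics].transform("sum")
opponent = match_totals - team_level[metrics]

# Difference features: own value minus opponent value.
diffs = (team_level[metrics] - opponent).add_suffix("_diff")
features = pd.concat([team_level[["match_id", "team"]], diffs], axis=1)

print(features)
```

By construction the two rows of a match carry mirrored values (for example +5 and -5), so the difference features sum to zero within each match.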

4.3.3. Advanced Performance (Absolute Values)

Expanding beyond basic metrics, this strategy integrated advanced tactical elements including crosses, line breaks, pressing actions, and possession contests. These metrics aimed to capture deeper tactical and strategic facets of contemporary football, potentially enhancing predictive accuracy.

4.3.4. Derived Statistical Features

This method involved creating secondary metrics through statistical transformations, such as efficiency ratios (success rates), composite indices, and standardization techniques. Such derived features aimed to encapsulate nuanced team dynamics and overall efficiency in performance.
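
The snippet below illustrates the kinds of transformations named above: success-rate ratios, z-score standardization, and a simple composite index. The specific ratios and the unweighted composite are assumptions made for illustration, since the paper does not list the exact derived features.

```python
import pandas as pd

# Toy team-match totals (hypothetical values).
team_level = pd.DataFrame({
    "tackles_made": [20, 16, 25, 10],
    "tackles_won": [12, 12, 15, 5],
    "line_breaks_attempted": [100, 80, 120, 60],
    "line_breaks_completed": [60, 40, 90, 30],
})

# Efficiency ratios: successes divided by attempts.
derived = pd.DataFrame({
    "tackle_success_rate": team_level["tackles_won"] / team_level["tackles_made"],
    "line_break_rate": team_level["line_breaks_completed"]
    / team_level["line_breaks_attempted"],
})

# Z-score standardization (population standard deviation, ddof=0).
standardized = (derived - derived.mean()) / derived.std(ddof=0)

# Illustrative composite index: unweighted mean of the standardized rates.
derived["composite_index"] = standardized.mean(axis=1)

print(derived.round(3))
```

In a predictive pipeline the standardization statistics would normally be estimated on training folds only, and ratios need a guard for zero attempts; both are omitted here for brevity.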

4.3.5. Ensemble Feature Combination

Combining various methodologies, this comprehensive feature set incorporated absolute and relative team performance metrics, aggregated using multiple statistical methods simultaneously. This hybrid approach aimed to leverage complementary insights from different analytical perspectives, providing a robust predictive basis.

4.4. Machine Learning Models

Our analysis evaluated seven diverse machine learning algorithms selected to represent distinct learning paradigms and complexity levels. These included linear models (Logistic Regression), tree-based ensembles (Random Forest, Gradient Boosting, XGBoost, LightGBM), kernel-based methods (Support Vector Machines), and neural network approaches (Multi-Layer Perceptron). This comprehensive model evaluation provided insights into the relative strengths and limitations of each approach within the context of sports outcome prediction.
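
A model comparison of this kind can be organized as a dictionary of estimators evaluated under the same cross-validation splits. The sketch below uses synthetic data and only the five model families available in scikit-learn; XGBoost and LightGBM live in separate packages and are left out to keep the example self-contained. All settings shown are assumptions, not the study's configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for 120 team-match rows; not the tournament data.
X, y = make_classification(
    n_samples=120, n_features=8, n_informative=4, random_state=0
)

# Scale-sensitive models are wrapped in a pipeline with a scaler.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "mlp": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    ),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Mean 5-fold accuracy per model on the synthetic data.
mean_accuracy = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

for name, acc in mean_accuracy.items():
    print(f"{name}: {acc:.3f}")
```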

4.5. Hyperparameter Optimization

To prevent model overfitting given the modest dataset size, hyperparameter tuning was conservatively executed via systematic grid search methods. Each algorithm underwent targeted exploration within practical parameter ranges, balancing computational efficiency with thoroughness in parameter space exploration. Model-specific grids included regularization parameters, tree depth, learning rates, and kernel types to optimize each algorithm effectively.
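
To make the trade-off between thoroughness and computational cost concrete, the sketch below defines small illustrative grids and counts the candidate configurations in each using scikit-learn's `ParameterGrid`. The parameter values are hypothetical; the paper names the parameter types (regularization, tree depth, learning rate, kernel) but not the ranges searched.

```python
from sklearn.model_selection import ParameterGrid

# Hypothetical, deliberately small grids for a modest dataset.
param_grids = {
    "logistic_regression": {"C": [0.01, 0.1, 1.0, 10.0]},
    "random_forest": {"max_depth": [2, 3, 5], "n_estimators": [100, 200]},
    "gradient_boosting": {"learning_rate": [0.05, 0.1], "max_depth": [2, 3]},
    "svm": {"C": [0.1, 1.0, 10.0], "kernel": ["linear", "rbf"]},
}

# Number of candidate configurations a grid search would evaluate per model.
n_candidates = {
    name: len(ParameterGrid(grid)) for name, grid in param_grids.items()
}

for name, count in n_candidates.items():
    print(f"{name}: {count} candidates")
```

Each candidate is refit once per inner fold and once per outer fold, so keeping grids small limits both runtime and the risk of tuning to noise.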

4.6. Validation Methodology

We implemented nested cross-validation to obtain unbiased performance estimates despite dataset limitations. This involved an outer loop (5-fold stratified cross-validation) for performance estimation and an inner loop (3-fold cross-validation) for hyperparameter tuning. Final model evaluation was executed on a separate held-out test set (20% of data), providing an independent assessment of predictive capability. Comprehensive metrics—including accuracy, precision, recall, F1-score, and ROC-AUC—were utilized to robustly evaluate model performance.
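
The validation scheme can be sketched in scikit-learn by placing a `GridSearchCV` (inner loop) inside `cross_validate` (outer loop). The example runs on synthetic data, assumes the 20% test set is split off before cross-validation, and assumes stratified inner folds; the paper does not state either detail, and the grid is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (
    GridSearchCV,
    StratifiedKFold,
    cross_validate,
    train_test_split,
)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for 120 team-match rows; not the tournament data.
X, y = make_classification(
    n_samples=120, n_features=10, n_informative=4, random_state=42
)

# Hold out 20% for the final independent check (assumed to happen first).
X_dev, X_test, y_dev, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
grid = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]}  # hypothetical grid

inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Inner loop: hyperparameter tuning. Outer loop: performance estimation.
search = GridSearchCV(pipe, grid, cv=inner_cv, scoring="accuracy")
metrics = ["accuracy", "precision", "recall", "f1", "roc_auc"]
nested = cross_validate(search, X_dev, y_dev, cv=outer_cv, scoring=metrics)

for name in metrics:
    print(f"{name}: {nested['test_' + name].mean():.3f}")

# Refit the tuned model on all development data, then score the held-out set.
search.fit(X_dev, y_dev)
test_accuracy = search.score(X_test, y_test)
print(f"held-out accuracy: {test_accuracy:.3f}")
```

Because tuning happens only inside each outer training fold, the outer-fold scores are not influenced by the data used to choose hyperparameters.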

4.7. Statistical Analysis and Interpretation

Performance across models was systematically compared using cross-validation metrics, with statistical significance assessed through paired t-tests, recognizing potential limitations due to sample size. Feature importance analyses were conducted to elucidate the relative contributions of predictors, assessed via model-specific measures (e.g., logistic regression coefficients, random forest importance scores). Stability and sensitivity analyses further informed interpretations regarding model robustness and generalizability.
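
A paired comparison of two models over identical folds, plus model-specific importance measures, might look like the following. It uses synthetic data and SciPy's `ttest_rel`; the choice of models and settings is illustrative rather than a reproduction of the study's analysis.

```python
import numpy as np
from scipy.stats import ttest_rel
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; not the tournament data.
X, y = make_classification(
    n_samples=120, n_features=6, n_informative=3, random_state=0
)

lr = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
rf = RandomForestClassifier(n_estimators=100, random_state=0)

# A fixed random_state yields the same folds for both models,
# which is what makes the per-fold scores paired.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
lr_scores = cross_val_score(lr, X, y, cv=cv, scoring="accuracy")
rf_scores = cross_val_score(rf, X, y, cv=cv, scoring="accuracy")

# Paired t-test on per-fold accuracies.
t_stat, p_value = ttest_rel(lr_scores, rf_scores)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")

# Model-specific importance: absolute standardized coefficients for
# logistic regression, impurity-based importances for the random forest.
lr.fit(X, y)
coefficients = lr.named_steps["logisticregression"].coef_[0]
coef_ranking = np.argsort(-np.abs(coefficients))

rf.fit(X, y)
rf_importances = rf.feature_importances_

print("LR feature ranking:", coef_ranking)
print("RF importances:", rf_importances.round(3))
```

With only five folds, and with training sets that overlap across folds, the resulting p-value is a rough indication at best, consistent with the sample-size caveat noted above.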

4.8. Implementation Details

Analyses were executed in Python, utilizing libraries such as Scikit-learn for machine learning procedures, Pandas and NumPy for data handling, and Matplotlib for visualization. Rigorous reproducibility measures—including fixed random seeds, documented software versions, and modular code organization—were maintained to ensure analytical transparency.
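
A minimal sketch of the reproducibility measures mentioned above is shown below: fixing random seeds and recording library versions alongside results. The seed value and the dictionary layout are assumptions; Matplotlib is omitted because it does not affect numerical results.

```python
import platform
import random

import numpy as np
import pandas as pd
import sklearn

SEED = 42  # hypothetical value; any fixed integer works

# Seed the standard-library and NumPy global generators.
random.seed(SEED)
np.random.seed(SEED)

# Record the software environment so results can be traced to versions.
versions = {
    "python": platform.python_version(),
    "numpy": np.__version__,
    "pandas": pd.__version__,
    "scikit-learn": sklearn.__version__,
}
print(versions)

# Re-seeding reproduces the same random draws.
np.random.seed(SEED)
first_draw = np.random.rand(3)
np.random.seed(SEED)
second_draw = np.random.rand(3)
print(np.array_equal(first_draw, second_draw))
```

For scikit-learn estimators and splitters, passing `random_state=SEED` explicitly is generally more robust than relying on the global NumPy seed.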

4.9. Methodological Limitations and Considerations

Acknowledging inherent methodological constraints, we recognized sample size limitations impacting statistical power and generalizability. The specific context of the 2023 Women's World Cup, including team strength distribution and tournament format, was noted as potentially limiting broader applicability. While our systematic feature engineering provided rigorous insights, we acknowledged that alternative methodologies might yield complementary perspectives. Consequently, this methodological framework was designed to maximize analytical reliability within inherent constraints, facilitating robust conclusions.

5. Results

5.1. Feature Engineering Performance Comparison

Our initial comparative analysis systematically assessed five distinct feature engineering strategies using multiple machine learning algorithms. Table 1 provides a detailed performance comparison across these feature sets, illustrating the predictive accuracy and robustness of each strategy.

Logistic regression consistently outperformed other algorithms across all feature sets, demonstrating approximately 79.2% accuracy with high precision and recall. The similarity in performance across different feature engineering methods suggests fundamental team performance metrics effectively encapsulated critical predictive signals. The relatively large standard deviations indicate inherent variability and challenge in predicting match outcomes within a tournament context, reflecting fluctuating team performances across matches.

5.2. Algorithm Performance Assessment

Further detailed analysis compared the predictive capabilities of seven distinct machine learning algorithms using the basic absolute feature set. Table 2 summarizes these comparative results.

Table 2: Machine Learning Algorithm Performance

[Image: OTGHB7_2025_v23n7_117_6_t0002.png]

Logistic regression notably emerged as the superior model, yielding the highest accuracy (79.2%) and ROC-AUC (85.8%). While simpler linear models like logistic regression and SVM consistently outperformed more complex ensemble methods, the moderate differences in accuracy suggest that algorithmic complexity may not dramatically enhance predictive power beyond what the features themselves provide.

5.3. Nested Cross-Validation with Hyperparameter Tuning

Nested cross-validation was employed with hyperparameter tuning on the combined feature set, significantly enhancing model performance. Table 3 presents these refined results.

Table 3: Nested Cross-Validation Performance

[Image: OTGHB7_2025_v23n7_117_6_t0003.png]

These enhanced accuracies underscore the efficacy of comprehensive feature sets, rigorous preprocessing, and systematic hyperparameter optimization in predictive modeling (Phommahaxay et al., 2019).

5.4. Summary of Key Findings

The comprehensive evaluation yielded several critical insights:

1. Algorithm Effectiveness: Logistic regression consistently outperformed complex models, highlighting the efficacy of interpretable linear approaches for tournament prediction.

2. Feature Engineering Robustness: Fundamental performance metrics consistently provided strong predictive signals, indicating minimal benefit from extensive feature transformations.

3. Validation Importance: Nested cross-validation with hyperparameter tuning notably increased accuracy and reliability of predictions.

4. Consistent Model Performance: Multiple algorithms consistently achieved high accuracy, indicating clear and stable predictive patterns.

These insights enhance our understanding of predictive modeling's potential and highlight the importance of methodical validation in sports analytics.

6. Discussion

Our findings offer several insights into the dynamics of women's international football and the broader implications of predictive modeling. The high predictive performance of our models reveals fundamental determinants of outcome, particularly emphasizing the importance of passing and defensive metrics. Metrics such as passes attempted and completion percentage align with core football principles highlighting possession and technical proficiency. In the context of women's football—where physical disparities are generally narrower than in the men's game—technical skill and tactical discipline play a more decisive role.

Interestingly, several seemingly contradictory metrics emerged with predictive significance. For instance, while "interceptions" were generally less predictive than possession recovery statistics, the latter, particularly recoveries in midfield areas, were consistently associated with match success. This pattern may reflect the tactical importance of controlled pressing and transitional play rather than isolated defensive actions. Likewise, "top speed" and "distance covered" both proved predictive, but their contribution varied depending on context. High top speed often correlated with successful counter-attacking strategies, while distance covered was more relevant for teams employing high pressing and possession-based styles. These findings suggest that interpreting performance metrics requires consideration of broader tactical schemes and match contexts, as supported by tactical analysis literature (Carling et al., 2014; Rein et al., 2016).

Defensive indicators like tackles made and tackle success rate also emerged as strong predictors. These findings support the long-standing tactical view that "defense wins championships," especially in knockout tournaments where a single mistake can lead to elimination. This suggests that women's international football places a premium on defensive structure and organization. The significance of physical performance metrics, including total distance covered and top speed, reflects the sport's evolution toward higher athletic intensity. As women's football becomes increasingly professionalized, physical performance now plays a critical role in shaping match outcomes. These attributes are aligned with modern tactical trends such as high pressing and off-the-ball movement.

Additionally, the comparable effectiveness of different feature engineering approaches suggests that relative team strength, rooted in basic technical execution, may be more predictive than tactical variation. This insight is consistent with the strategic conservatism typically observed in tournament football, where teams often prioritize consistency and error minimization over innovation. Our findings further highlight the unique characteristics of women's international football, notably the presence of more pronounced performance disparities than in the men’s game. These disparities likely reflect unequal access to infrastructure, resources, and development programs across countries (Nantharath et al., 2023).

While our analysis aggregates performance at the team level, the influence of positional roles remains apparent. Stronger tournament teams tend to exhibit robust defensive lines and composed midfield positions commonly associated with experienced players and disciplined execution. Although offensive metrics contribute meaningfully, defensive and possession-based indicators consistently hold more predictive power. Cross completion rates, in particular, emerged as a relevant attacking metric, reaffirming the role of traditional approaches such as set pieces and wide play.

Theoretically, this study raises important considerations about competitive balance and tactical evolution in women’s football. The predictive success of basic performance metrics points to developmental priorities for emerging national teams, suggesting that fundamental technical skills and defensive training should be prioritized over complex tactical schemes. From a coaching standpoint, our findings emphasize preparation focused on technical execution, defensive positioning, and disciplined transitions. For instance, coaches might structure drills around improving pass completion under pressure, accelerating recovery runs, or reinforcing midfield compactness.

Similarly, in terms of player development and scouting, national programs might achieve greater success by investing in technically proficient athletes capable of executing core tasks reliably. Grassroots initiatives could emphasize passing and defensive technique training from early stages, aligning with the indicators identified in our models. Additionally, this framework provides a reference for match analysts seeking to benchmark team performance and identify areas of tactical strength or vulnerability.

Beyond competitive and tactical applications, our findings also offer utility in broadcasting and fan engagement. Key performance metrics identified in this study could be incorporated into real-time analytics to enrich commentary and deepen viewer understanding. This application is particularly timely given the growing popularity of women’s football and the expanding market for data-driven insights.

In sum, our discussion underscores both the methodological and practical relevance of predictive modeling in women's football. By linking analytical performance to core football principles and tactical structures, this research bridges the gap between data science and domain expertise, contributing to both academic inquiry and applied practice.

7. Conclusion

Our research contributes to predictive modeling capabilities in women's international football by demonstrating the effectiveness of comprehensive feature engineering, systematic methodological frameworks, and rigorous validation techniques. Logistic regression, utilizing detailed performance metrics, consistently yielded high accuracy, improving upon existing benchmarks from previous studies. Our findings indicate that fundamental performance metrics provide robust predictive insights, underscoring the practicality and interpretability of simpler modeling approaches (Kim & Kang, 2022).

Despite methodological rigor and high predictive performance, inherent limitations related to sample size, generalizability, and unmodeled qualitative factors must be acknowledged. Future research should extend this approach across multiple tournaments, incorporate dynamic temporal modeling, and integrate qualitative analyses for broader validation and practical applicability. From a practical standpoint, our findings offer concrete guidance for coaching staff and performance analysts. For instance, training programs could be tailored to enhance passing accuracy and defensive recovery speed, while match analysis routines may prioritize high-frequency tracking of tackles and possession regain patterns. By offering a structured framework linking key performance indicators to training and tactical planning, this study provides a foundation for both scholarly and practical advancement in women's football analytics.

References

  1. Baboota, R., & Kaur, H. (2019). Predictive analysis and modelling football results using machine learning approach for English Premier League. International Journal of Forecasting, 35(2), 741-755. https://doi.org/10.1016/j.ijforecast.2018.01.003
  2. Carpita, M., Ciavolino, E., & Pasca, P. (2015). Exploring and modelling team performances of the Italian Serie A football championship. Quality & Quantity, 49(6), 2725-2740.
  3. Dixon, M. J., & Coles, S. G. (1997). Modelling association football scores and inefficiencies in the football betting market. Journal of the Royal Statistical Society: Series C (Applied Statistics), 46(2), 265-280. https://doi.org/10.1111/1467-9876.00065
  4. Groll, A., Ley, C., Schauberger, G., & Van Eetvelde, H. (2018). Prediction of the FIFA World Cup 2018 – a random forest approach with an emphasis on estimated team ability parameters. Journal of Sports Analytics, 4(4), 243-259. https://doi.org/10.3233/JSA-180153
  5. Groll, A., Schauberger, G., & Tutz, G. (2019). Prediction of major international soccer tournaments based on team-specific regularized Poisson regression: An application to the FIFA Women's World Cup 2019. Statistical Modelling, 19(4), 352-379.
  6. Hubáček, O., Šourek, G., & Železný, F. (2019). Learning to predict soccer results from relational data with gradient boosted relational decision trees. Machine Learning, 108(1), 29-47. https://doi.org/10.1007/s10994-018-5704-6
  7. Kang, E., & Kim, J. H. (2023). Secondary literature analysis: the marketing practice to attract potential customers into leisure and sports industry. The Journal of Industrial Distribution & Business, 14(6), 1-8.
  8. Kim, J. H., & Kang, E. (2022). Qualitative Content Analysis: The Meaningful Association between the Extension of Sports Leisure Culture and the Spread of Wearable Devices. East Asian Journal of Business Economics, 10(4), 29-38.
  9. Nantharath, P., Marwa, S., Nguyen, L., & Kang, E. (2023). Sustainable development and financial inclusion in Sub-Saharan Africa: Empirical evidence from panel vector error correction model (VECM). Journal of Namibian Studies: History Politics Culture, 36, 846-868.
  10. Phommahaxay, S., Kamnuansipla, P., Draper, J., Nantharath, P., & Kang, E. (2019). Preparedness of Lao People's Democratic Republic to Implement ASEAN Common Visa (ACV). Research in World Economy, 10(3), 419-430. https://doi.org/10.5430/rwe.v10n3p419
  11. Rue, H., & Salvesen, O. (2000). Prediction and retrospective analysis of soccer matches in a league. Journal of the Royal Statistical Society: Series D (The Statistician), 49(3), 399-418.
  12. Sarmento, H., Clemente, F. M., Araújo, D., Davids, K., McRobert, A., & Figueiredo, A. (2018). What performance analysts need to know about research trends in association football (2012–2016): A systematic review. Sports Medicine, 48(4), 799-836. https://doi.org/10.1007/s40279-017-0836-6
  13. Schauberger, G., & Groll, A. (2018). Predicting matches in international football tournaments using a random forest approach. Statistical Modelling, 18(5-6), 460-482. https://doi.org/10.1177/1471082X18799934
  14. Tax, N., & Joustra, Y. (2015). Predicting the Dutch football competition using public data: A machine learning approach. Transactions on Knowledge and Data Engineering, 10(10), 1-13.