• Title/Summary/Keyword: Bayes risk

Comparative Evaluation of Machine Learning Models for Predicting Soccer Injury Types

  • Davronbek Malikov;Jaeho Kim;Jung Kyu Park
    • Journal of the Korean Society of Industry Convergence
    • /
    • v.27 no.2_1
    • /
    • pp.257-268
    • /
    • 2024
  • Soccer is a sport that carries a high risk of injury. An injury affects not only the injured player's career but also team performance and finances, since soccer is a team-based game. The duration of recovery from a soccer injury typically depends on its type and severity. In this paper, we therefore use machine learning to predict the probability of each injury type for a player. We compare several supervised classification models, including Decision Tree, Random Forest, K-Nearest Neighbors (KNN), and Naive Bayes, to find the best-fitting model. The KNN and Decision Tree models achieved the highest accuracy at 70%, surpassing the other models. The Random Forest model followed with an accuracy of 62%, and the Naive Bayes model had the lowest accuracy at 56%. The data cover 54 professional soccer players in the top five European leagues and are based on their career histories.

Prediction model of peptic ulcer diseases in middle-aged and elderly adults based on machine learning (머신러닝 기반 중노년층의 기능성 위장장애 예측 모델 구현)

  • Lee, Bum Ju
    • The Journal of the Convergence on Culture Technology
    • /
    • v.6 no.4
    • /
    • pp.289-294
    • /
    • 2020
  • Peptic ulcer disease is a gastrointestinal disorder caused by Helicobacter pylori infection and the use of nonsteroidal anti-inflammatory drugs. While many studies have sought the risk factors of peptic ulcers, none has proposed a peptic ulcer prediction model for Koreans. The purpose of this study is therefore to implement a peptic ulcer prediction model using machine learning based on demographic, obesity, blood, and nutritional information for middle-aged and elderly people. A wrapper-based variable selection method and the naive Bayes algorithm were used for model building. The prediction model for females achieved an area under the receiver operating characteristic curve (AUC) of 0.712, and the model for males achieved a lower AUC of 0.674. These results can be used for the prediction and prevention of peptic ulcers in middle-aged and elderly people.

SOME POINT ESTIMATES FOR THE SHAPE PARAMETERS OF EXPONENTIATED-WEIBULL FAMILY

  • Singh Umesh;Gupta Pramod K.;Upadhyay S.K.
    • Journal of the Korean Statistical Society
    • /
    • v.35 no.1
    • /
    • pp.63-77
    • /
    • 2006
  • The maximum product of spacings estimator is proposed in this paper as a competent alternative to the maximum likelihood estimator for the parameters of the exponentiated-Weibull distribution; it works even when the maximum likelihood estimator does not exist. In addition, a Bayes-type estimator known as the generalized maximum likelihood estimator is obtained for both shape parameters of this distribution. Although closed-form solutions for the proposed estimators do not exist, they can be obtained by simple, appropriate numerical techniques. The relative performances of the estimators are compared on the basis of their relative risk efficiencies under symmetric and asymmetric losses. An example based on simulated data is considered for illustration.
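
As a rough illustration of the maximum product of spacings idea mentioned in this abstract, the sketch below maximizes the mean log-spacing of a fitted CDF over a coarse grid. It is not the authors' procedure: the unit-scale CDF form (1 - exp(-x^alpha))^theta, the grid search, the made-up sample, and the function names `ew_cdf`, `mps_objective`, and `mps_grid_search` are all assumptions made for the example.

```python
import math


def ew_cdf(x, alpha, theta):
    """Unit-scale exponentiated-Weibull CDF (assumed form): (1 - exp(-x**alpha))**theta."""
    return (1.0 - math.exp(-(x ** alpha))) ** theta


def mps_objective(sample, alpha, theta):
    """Mean log-spacing of the fitted CDF over the ordered sample."""
    xs = sorted(sample)
    cdf = [0.0] + [ew_cdf(x, alpha, theta) for x in xs] + [1.0]
    total = 0.0
    for lo, hi in zip(cdf, cdf[1:]):
        spacing = hi - lo
        if spacing <= 0.0:  # ties or numerical underflow: reject this candidate
            return float("-inf")
        total += math.log(spacing)
    return total / (len(xs) + 1)


def mps_grid_search(sample, alphas, thetas):
    """Return the (alpha, theta) pair on the grid with the largest objective."""
    return max(
        ((a, t) for a in alphas for t in thetas),
        key=lambda pair: mps_objective(sample, pair[0], pair[1]),
    )


# Hypothetical data and grid, chosen only to make the example run.
sample = [0.31, 0.52, 0.77, 0.94, 1.10, 1.38, 1.71]
grid = [0.5 + 0.25 * k for k in range(11)]  # 0.5, 0.75, ..., 3.0
alpha_hat, theta_hat = mps_grid_search(sample, grid, grid)
```

Because the spacings sum to one, the objective can never exceed log(1 / (n + 1)), which is a convenient sanity check. A numerical optimizer would normally replace the grid; the grid is used here only to keep the sketch dependency-free.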

A Comparative Study of Prediction Models for College Student Dropout Risk Using Machine Learning: Focusing on the case of N university (머신러닝을 활용한 대학생 중도탈락 위험군의 예측모델 비교 연구 : N대학 사례를 중심으로)

  • So-Hyun Kim;Sung-Hyoun Cho
    • Journal of The Korean Society of Integrative Medicine
    • /
    • v.12 no.2
    • /
    • pp.155-166
    • /
    • 2024
  • Purpose : This study aims to identify key factors for predicting dropout risk at the university level and to provide a foundation for policy development aimed at dropout prevention. It explores the optimal machine learning algorithm by comparing the performance of various algorithms on data on college students' dropout risks. Methods : Data on factors influencing dropout risk and propensity were collected from N University. The collected data were applied to several machine learning algorithms, including random forest, decision tree, artificial neural network, logistic regression, support vector machine (SVM), k-nearest neighbor (k-NN) classification, and Naive Bayes. The performance of these models was compared and evaluated, with a focus on predictive validity and on identifying significant dropout factors through the information gain index. Results : The binary logistic regression analysis showed that the year of the program, department, grades, and year of entry had a statistically significant effect on dropout risk. Among the machine learning algorithms, random forest performed best. The relative importance of the predictor variables was highest, in order, for department, age, grade, and whether the student's residence matched the school location. Conclusion : Machine learning-based prediction of dropout risk focuses on the early identification of students at risk. The types and causes of dropout crises vary significantly among students. It is important to identify them so that appropriate actions and support can be taken to remove risk factors and increase protective factors. The relative importance of the factors affecting dropout risk found in this study will help guide educational prescriptions for preventing college student dropout.

Automated Prioritization of Construction Project Requirements using Machine Learning and Fuzzy Logic System

  • Hassan, Fahad ul;Le, Tuyen;Le, Chau;Shrestha, K. Joseph
    • International conference on construction engineering and project management
    • /
    • 2022.06a
    • /
    • pp.304-311
    • /
    • 2022
  • Construction inspection is a crucial stage that ensures that all contractual requirements of a construction project are verified. The construction inspection capabilities of state highway agencies have been greatly affected by budget reductions. As a result, efficient inspection practices such as risk-based inspection are required to optimize the use of limited resources without compromising inspection quality. Automated prioritization of textual requirements according to their criticality would be extremely helpful, since contractual requirements are typically presented in unstructured natural language in voluminous text documents. The current study introduces a novel model for predicting the risk level of requirements using machine learning (ML) algorithms. The ML algorithms tested in this study included naïve Bayes, support vector machines, logistic regression, and random forest. The training data include sequences of requirement texts that were labeled with risk levels (such as very low, low, medium, high, very high) using fuzzy logic systems. The fuzzy model treats the three risk factors (severity, probability, detectability) as fuzzy input variables and implements fuzzy inference rules to determine the labels of requirements. The performance of the model was examined on a labeled dataset created by fuzzy inference rules and three different membership functions. The developed requirement risk prediction model yielded a precision, recall, and f-score of 78.18%, 77.75%, and 75.82%, respectively. The proposed model is expected to provide construction inspectors with a means for the automated prioritization of voluminous requirements by their importance, thus helping to maximize the effectiveness of inspection activities under resource constraints.

Difference of Area-based deprivation and Education on Cerebrovascular Mortality in Korea (교육수준과 지역결핍지수에 따른 뇌혈관질환 사망률 차이)

  • Sim, Jeoung-Ha;Ahn, Dong-Choon;Son, Mi-A
    • Health Policy and Management
    • /
    • v.22 no.2
    • /
    • pp.163-182
    • /
    • 2012
  • This study was performed to identify differences in cerebrovascular mortality in Korea by area-based deprivation and educational level. Data were obtained from the Death Certificate Data 2000 and the 2000 Census produced by the Korean National Statistical Office (NSO). We classified the whole country into 246 areas based on administrative districts. The standardized mortality ratio (SMR) for cerebrovascular disease was then calculated by sex, education level, and area. A predicted SMR was calculated by the empirical Bayes method to reduce the variation of the SMR values. The area-based deprivation of the 246 areas was measured using the modified Carstairs index, whose five indicators are overcrowding, the male unemployment ratio, the percentage of households classified as low social class, the percentage of non-home-owners, and the percentage of houses lacking basic amenities. The correlation between area-based deprivation and the SMR was analyzed by Pearson correlation analysis for the whole country and for each metropolitan city or province. After classifying the deprivation of the 246 areas into five levels, we performed random-intercept Poisson regression analysis, adjusting for education level and age and using the empirical Bayes method, to investigate the relationship between the five deprivation levels and cerebrovascular mortality. The SMR increased as education level decreased. The 246 areas differed in SMR, predicted SMR, and area-based deprivation. Area-based deprivation and the SMR of the whole country were not correlated in either sex. An individual's education level was associated with the risk of cerebrovascular mortality in men. The risk of cerebrovascular mortality increased with age relative to the reference group (<30). Area-based deprivation was not associated with the risk of cerebrovascular mortality in either sex. The findings of this study suggest that the SMR had positive or negative correlations with area-based deprivation depending on the metropolitan city or province. They also suggest that individual education level and age were related to mortality and that area-based deprivation was not associated with cerebrovascular mortality in Korea.
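
The abstract says a predicted SMR was obtained by the empirical Bayes method to reduce the variation of the raw SMR values, without giving the estimator. The sketch below shows one common method-of-moments shrinkage of that kind, offered only as an illustration: the estimator choice, the function name `eb_smooth_smr`, and the observed/expected counts are assumptions, not the study's data or method.

```python
def eb_smooth_smr(observed, expected):
    """Shrink raw SMRs (observed / expected deaths) toward the overall ratio.

    Areas with few expected deaths have noisy raw SMRs and are pulled
    more strongly toward the overall mean than large areas.
    """
    if not observed or len(observed) != len(expected):
        raise ValueError("observed and expected must be non-empty and equal in length")
    raw = [o / e for o, e in zip(observed, expected)]
    total_expected = sum(expected)
    overall = sum(observed) / total_expected  # prior mean of the relative risk
    # Expected-weighted variance of the raw SMRs around the overall ratio.
    weighted_var = sum(
        e * (r - overall) ** 2 for e, r in zip(expected, raw)
    ) / total_expected
    mean_expected = total_expected / len(expected)
    # Moment estimate of the between-area variance, floored at zero.
    between_var = max(weighted_var - overall / mean_expected, 0.0)
    smoothed = []
    for e, r in zip(expected, raw):
        if between_var == 0.0:
            weight = 0.0  # no detectable heterogeneity: use the overall ratio
        else:
            weight = between_var / (between_var + overall / e)
        smoothed.append(overall + weight * (r - overall))
    return raw, smoothed


# Hypothetical counts for four areas of very different size.
observed = [4, 30, 150, 3]
expected = [1.0, 25.0, 100.0, 10.0]
raw_smr, predicted_smr = eb_smooth_smr(observed, expected)
```

In this toy input the first area has only one expected death, so its raw SMR of 4.0 is shrunk almost entirely toward the overall ratio, while the third area (100 expected deaths) keeps most of its raw value.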

Socioeconomic Predictors of Diabetes Mortality in Japan: An Ecological Study Using Municipality-specific Data

  • Okui, Tasuku
    • Journal of Preventive Medicine and Public Health
    • /
    • v.54 no.5
    • /
    • pp.352-359
    • /
    • 2021
  • Objectives: The aim of this study was to examine the geographic distribution of diabetes mortality in Japan and identify socioeconomic factors affecting differences in municipality-specific diabetes mortality. Methods: Diabetes mortality data by year and municipality from 2013 to 2017 were extracted from Japanese Vital Statistics, and the socioeconomic characteristics of municipalities were obtained from government statistics. We calculated the standardized mortality ratio (SMR) of diabetes for each municipality using the empirical Bayes method and represented geographic differences in SMRs in a map of Japan. Multiple linear regression was conducted to identify the socioeconomic factors affecting differences in SMR. Statistically significant socioeconomic factors were further assessed by calculating the relative risk of mortality of quintiles of municipalities classified according to the degree of each socioeconomic factor using Poisson regression analysis. Results: The geographic distribution of diabetes mortality differed by gender. Of the municipality-specific socioeconomic factors, high rates of single-person households and unemployment and a high number of hospital beds were associated with a high SMR for men. High rates of fatherless households and blue-collar workers were associated with a high SMR for women, while high taxable per-capita income and total population were associated with a low SMR for women. Quintile analysis revealed a complex relationship between taxable income and mortality for women. The mortality risk of quintiles with the highest and lowest taxable per-capita income was significantly lower than that of the middle-income quintile. Conclusions: Socioeconomic factors of municipalities in Japan were found to affect geographic differences in diabetes mortality.

Development of a Secure Routing Protocol using Game Theory Model in Mobile Ad Hoc Networks

  • Paramasivan, Balasubramanian;Viju Prakash, Maria Johan;Kaliappan, Madasamy
    • Journal of Communications and Networks
    • /
    • v.17 no.1
    • /
    • pp.75-83
    • /
    • 2015
  • In mobile ad-hoc networks (MANETs), nodes are mobile in nature. Collaboration between mobile nodes is especially significant in MANETs, whose greatest challenges are vulnerability to various security attacks and the difficulty of operating securely while preserving resources and performing secure routing among nodes. It is therefore essential to develop an effective secure routing protocol to protect the nodes from anonymous behaviors. Currently, game theory is a tool that analyzes, formulates and solves selfishness issues. It is seldom applied to detect malicious behavior in networks; it deals, instead, with the strategic and rational behavior of each node. In our study, we used the dynamic Bayesian signaling game to analyze the strategy profile for regular and malicious nodes. This game also revealed the best actions of individual strategies for each node. Perfect Bayesian equilibrium (PBE) provides a prominent solution for signaling games with incomplete information by combining the strategies and payoffs of the players that constitute the equilibrium. Under PBE, the strategies of nodes are the private information of regular and malicious nodes. Regular nodes should be cooperative during routing and update their payoff, while malicious nodes take sophisticated risks by evaluating their risk of being identified to decide when to decline. This approach minimizes the utility of malicious nodes and motivates better cooperation between nodes by using the reputation system. Regular nodes continuously monitor and evaluate their neighbors using belief updating based on Bayes' rule.
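
The belief updating by Bayes' rule mentioned at the end of this abstract can be sketched as follows. This is a generic illustration, not the paper's protocol: the observation types ("drop", "forward"), the likelihood values, the prior of 0.2, and the names `update_belief` and `track_neighbor` are all hypothetical.

```python
def update_belief(prior_malicious, p_obs_if_malicious, p_obs_if_regular):
    """Posterior probability that a neighbor is malicious after one observation."""
    numerator = p_obs_if_malicious * prior_malicious
    evidence = numerator + p_obs_if_regular * (1.0 - prior_malicious)
    if evidence == 0.0:
        raise ValueError("observation has zero probability under both node types")
    return numerator / evidence


# Assumed likelihoods: (P(obs | malicious), P(obs | regular)).
LIKELIHOOD = {
    "drop": (0.7, 0.1),     # a malicious node drops packets far more often
    "forward": (0.3, 0.9),  # a regular node usually forwards
}


def track_neighbor(prior, observations):
    """Apply Bayes' rule once per observation and return the belief history."""
    beliefs = [prior]
    for obs in observations:
        p_malicious, p_regular = LIKELIHOOD[obs]
        beliefs.append(update_belief(beliefs[-1], p_malicious, p_regular))
    return beliefs


beliefs = track_neighbor(0.2, ["drop", "drop", "forward"])
```

Each observed drop raises the belief that the neighbor is malicious and each observed forward lowers it; a reputation system could then compare the current belief against a threshold, which is again an assumption rather than something the abstract specifies.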

Project Failure Main Factors Analysis using Text Mining in Audit Evaluation (감리결과에 텍스트마이닝 기법을 적용한 프로젝트 실패 주요요인 분석)

  • Jang, Kyoungae;Jang, Seong Yong;Kim, Woo-Je
    • Journal of KIISE
    • /
    • v.42 no.4
    • /
    • pp.468-474
    • /
    • 2015
  • Corporations need to respond quickly to rapid external changes, so they should recognize the importance of projects, identify their failure factors, prevent risks in advance, and raise success rates. There are previous studies on the success and failure factors of projects; however, most are limited in objectivity and quantitative analysis because they rely on data gathered through surveys, statistical sampling and analysis. This study analyzes the failure factors of projects using data mining to find problems with projects in an audit report, which is an objective project evaluation report. To do this, we identified the texts in the paragraphs suggesting improvements. We used the superior classification algorithms NaiveBayes, SMO and J48, which were evaluated in terms of recall and precision after 10-fold cross-validation. The failure factors of projects were analyzed in the identified texts so that they could be utilized in project implementation.

Bayesian Estimation of Shape Parameter of Pareto Income Distribution Using LINEX Loss Function

  • Saxena, Sharad;Singh, Housila P.
    • Communications for Statistical Applications and Methods
    • /
    • v.14 no.1
    • /
    • pp.33-55
    • /
    • 2007
  • The economic world is full of patterns, many of which exert a profound influence over society and business. One of the most contentious is the distribution of wealth. Way back in 1897, an Italian engineer-turned-economist named Vilfredo Pareto discovered a pattern in the distribution of wealth that appears to be every bit as universal as the laws of thermodynamics or chemistry. The present paper proposes some Bayes estimators of the shape parameter of the Pareto income distribution in censored sampling. The asymmetric LINEX loss function has been considered to study the effects of overestimation and underestimation. For the prior distribution of the parameter, a number of priors, including one- and two-parameter exponential, truncated Erlang and doubly truncated gamma, have been considered to express the experimenter's belief about the parameter. The estimators thus obtained have been compared theoretically and empirically with the corresponding estimators under the squared error loss function, some of which were reported by Bhattacharya et al. (1999).
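
For readers unfamiliar with the loss function named in this abstract, the standard LINEX form and the Bayes estimator it leads to are written out below. The abstract does not state which parametrization the authors use, so the error term Δ = θ̂ − θ and the symbols a, b, and x (the observed data) are assumptions of this sketch.

```latex
% LINEX loss for an estimate \hat{\theta} of \theta, with \Delta = \hat{\theta} - \theta
L(\Delta) = b\left[e^{a\Delta} - a\Delta - 1\right], \qquad a \neq 0,\; b > 0

% Setting the derivative of the posterior expected loss to zero gives the Bayes estimator
\hat{\theta}_{\mathrm{LINEX}} = -\frac{1}{a}\,\ln \mathbb{E}\!\left[e^{-a\theta} \mid x\right]

% Bayes estimator under squared error loss, for comparison
\hat{\theta}_{\mathrm{SE}} = \mathbb{E}\left[\theta \mid x\right]
```

With a > 0 the loss grows roughly exponentially for overestimation and roughly linearly for underestimation, so by Jensen's inequality the LINEX estimate is never larger than the posterior mean; with a < 0 the asymmetry is reversed.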