• Title/Summary/Keyword: Predictive Risk Model


Developing the high risk group predictive model for student direct loan default using data mining (데이터마이닝을 이용한 학자금 대출 부실 고위험군 예측모형 개발)

  • Choi, Jae-Seok;Han, Jun-Tae;Kim, Myeon-Jung;Jeong, Jina
    • Journal of the Korean Data and Information Science Society / v.26 no.6 / pp.1417-1426 / 2015
  • We develop a high-risk group predictive model for loan default using direct loan data from 2012 to 2014 provided by the Korea Student Aid Foundation. We perform decision tree analysis, a data mining methodology, using SAS Enterprise Miner 13.2. The model classified subjects into 25 types. The study shows that the major factors influencing loan default are household income, national grant, age, overdue record, level of schooling, field of study, and monthly repayment. The high-risk group predictive model in this study can serve as the basis for a segmented management service for preventing loan default.

A predictive nomogram-based model for lower extremity compartment syndrome after trauma in the United States: a retrospective case-control study

  • Blake Callahan;Darwin Ang;Huazhi Liu
    • Journal of Trauma and Injury / v.37 no.2 / pp.124-131 / 2024
  • Purpose: The aim of this study was to use the American College of Surgeons Trauma Quality Improvement Program (TQIP) database to identify risk factors associated with developing acute compartment syndrome (ACS) following lower extremity fractures. Specifically, a nomogram of variables was constructed in order to propose a risk calculator for ACS following lower extremity trauma. Methods: A large retrospective case-control study was conducted using the TQIP database. Multivariable regression was used to identify significant risk factors, and these variables were then implemented in a nomogram to develop a predictive model for developing ACS. Results: Novel risk factors identified include venous thromboembolism prophylaxis type, particularly unfractionated heparin (odds ratio [OR], 2.67; 95% confidence interval [CI], 2.33-3.05; P<0.001), and blood product transfusions (blood per unit: OR 1.13 [95% CI, 1.09-1.18], P<0.001; platelets per unit: OR 1.16 [95% CI, 1.09-1.24], P<0.001; cryoprecipitate per unit: OR 1.13 [95% CI, 1.04-1.22], P=0.003). Conclusions: This study provides evidence that heparin use and blood product transfusions may be additional risk factors to evaluate when considering methods of risk stratification for lower extremity ACS. We propose a risk calculator using previously elucidated risk factors as well as the risk factors demonstrated in this study. Our nomogram-based risk calculator is a tool that will aid in screening for patients at high risk of ACS and help in clinical decision-making.
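
The abstract above reports odds ratios with 95% confidence intervals. As a rough illustration of what such a figure means, the sketch below computes an unadjusted odds ratio and a Wald-type interval from a 2x2 table. It is not the authors' method (they used multivariable regression, which yields adjusted odds ratios), and the function name `odds_ratio_ci` and the counts are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed cases      b: exposed controls
    c: unexposed cases    d: unexposed controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table (Woolf's formula).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, chosen only for illustration (not from the study).
or_value, ci_low, ci_high = odds_ratio_ci(40, 60, 20, 80)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

An interval that excludes 1 corresponds to an association that is significant at roughly the 5% level, which is how the intervals quoted in the abstract can be read.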

Recidivism prediction of sex offender risk assessment tools: STATIC-99 and HAGSOR-Dynamic (교정시설내 성범죄자 재범위험성 평가도구의 재범 예측: STATIC-99와 HAGSOR-동적요인을 중심으로)

  • Yoon, Jeongsook
    • Korean Journal of Forensic Psychology / v.13 no.2 / pp.99-119 / 2022
  • Research on sex offenses has shown that sex offenders are very heterogeneous, including in their risk of recidivism. Some sex offenders are known to have a much higher tendency to reoffend than others. The study examined the predictive and explanatory power of the static and dynamic risk factors in STATIC-99 and HAGSOR-Dynamic, which have been used in Korean correctional facilities since 2014. STATIC-99 and HAGSOR-Dynamic showed moderate predictive accuracy for all crimes (AUC = .737 and AUC = .597, respectively; ps < .001). However, for sex crimes alone, only STATIC-99 significantly predicted recidivism (AUC = .743, p < .001). The incremental predictive power of HAGSOR-Dynamic was confirmed: the explanatory power of Model 2, comprising both static and dynamic risk factors, was significantly greater than that of Model 1, comprising only static factors (∆χ2 = 12.721, p < .001), but only in the model for all crimes. The findings are discussed in terms of their implications for sex offender assessment and treatment practice.

Analysis of SEER Adenosquamous Carcinoma Data to Identify Cause Specific Survival Predictors and Socioeconomic Disparities

  • Cheung, Rex
    • Asian Pacific Journal of Cancer Prevention / v.17 no.1 / pp.347-352 / 2016
  • Background: This study used receiver operating characteristic (ROC) curves to analyze Surveillance, Epidemiology and End Results (SEER) adenosquamous carcinoma data to identify predictive models and potential disparities in outcome. Materials and Methods: This study analyzed socio-economic, staging and treatment factors available in the SEER database for adenosquamous carcinoma. For the risk modeling, each factor was fitted by a generalized linear model to predict cause-specific survival. The area under the ROC curve was computed. Similar strata were combined to construct the most parsimonious models. Results: A total of 20,712 patients diagnosed from 1973 to 2009 were included in this study. The mean follow-up time (S.D.) was 54.2 (78.4) months. About two-thirds of the patients were female. The mean (S.D.) age was 63 (13.8) years. SEER stage was the most predictive factor of outcome (ROC area of 0.71). Of the patients, 13.9% were un-staged and had a risk of cause-specific death of 61.3%, which was higher than the 45.3% risk for regional disease and lower than the 70.3% for metastatic disease. Sex, site, radiotherapy, and surgery had ROC areas of about 0.55-0.65. Rural residence and race contributed to socioeconomic disparity in treatment outcome. Radiotherapy was underused even in localized and regional stages when the intent was curative. This underuse was most pronounced in older patients. Conclusions: Anatomic stage was predictive and useful in treatment selection. Under-staging may have contributed to poor outcome.

Developing the high-risk drinking predictive model in Korea using the data mining technique (데이터마이닝 기법을 활용한 한국인의 고위험 음주 예측모형 개발 연구)

  • Park, Il-Su;Han, Jun-Tae
    • Journal of the Korean Data and Information Science Society / v.28 no.6 / pp.1337-1348 / 2017
  • In this paper, we develop a high-risk drinking predictive model for Korea using cross-sectional data from the Korea Community Health Survey (2014). We perform logistic regression analysis, decision tree analysis, and neural network analysis as data mining techniques. The results of the logistic regression analysis showed that men in their forties had a high risk, and that the risk for office workers and sales workers was high. In particular, current smokers had a higher risk of high-risk drinking. Among the three models, neural network analysis and logistic regression performed best in terms of AUROC (area under the receiver operating characteristic curve). The high-risk drinking predictive model developed in this study and the method for selecting the high-risk intensive drinking group can be the basis for providing more effective health care services, such as hazardous drinking prevention education and improved drinking programs.

Predictive Analysis of Fire Risk Factors in Gyeonggi-do Using Machine Learning (머신러닝을 이용한 경기도 화재위험요인 예측분석)

  • Seo, Min Song;Castillo Osorio, Ever Enrique;Yoo, Hwan Hee
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.39 no.6 / pp.351-361 / 2021
  • Fire is an increasingly serious concern because it causes enormous damage to property and human life. Therefore, this study aims to predict the various risk factors affecting fire, by fire type. The predictive analysis of fire factors targeted Gyeonggi-do, which has the highest number of fires in the country. For the analysis, the machine learning methods SVM (Support Vector Machine), RF (Random Forest), and GBRT (Gradient Boosted Regression Tree) were used; the accuracy of each model was assessed through MAE (Mean Absolute Error) and RMSE (Root Mean Squared Error) to select a well-fitting model, and on this basis the predictive analysis of fire occurrence factors in Gyeonggi-do was conducted. In the comparative analysis of the three machine learning methods, the RF method showed MAE = 1.765 and RMSE = 1.876, and its MAE and RMSE on the verification and test data were very similar, differing by 0.046 for MAE and 0.04 for RMSE, giving the best predictive results. The results of this study are expected to be useful for fire safety management, allowing decision makers to identify the sequence of dangers related to the factors affecting the occurrence of fire.
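
The abstract above compares models by MAE and RMSE. A minimal sketch of how these two error measures are computed is given here. The helper names `mae` and `rmse` and the sample values are hypothetical and are not taken from the study.

```python
import math

def mae(y_true, y_pred):
    """Mean Absolute Error: average of |observed - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the mean squared difference."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

# Hypothetical observed and predicted values, for illustration only.
observed = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]

mae_value = mae(observed, predicted)    # errors 1, 0, 2, 1 -> 1.0
rmse_value = rmse(observed, predicted)  # sqrt((1 + 0 + 4 + 1) / 4) = sqrt(1.5)
print(mae_value, rmse_value)
```

Because RMSE squares each error before averaging, it penalizes large misses more heavily than MAE, which is one reason the two are often reported together.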

Positive Association Between miR-499A>G and Hepatocellular Carcinoma Risk in a Chinese Population

  • Zou, Hong-Zhi;Zhao, Yan-Qiu
    • Asian Pacific Journal of Cancer Prevention / v.14 no.3 / pp.1769-1772 / 2013
  • A case-control study of the association of miR-499A>G rs3746444 with risk of hepatocellular carcinoma (HCC) was conducted. Patients with HCC and healthy control subjects were recruited for genotyping of miR-499A>G using duplex polymerase chain reaction with confronting-two-pair primer (PCR-RFLP) analysis. The miR-499 GG genotype was associated with a decreased risk of HCC compared with the miR-499 AA genotype (adjusted OR=0.74, 95% CI=0.24-0.96). Similarly, the GG genotype showed a 0.45-fold decreased HCC risk in a recessive model. The miR-499 G allele was significantly associated with decreased risk of HCC among patients infected with HBV in a dominant model (OR=0.09, 95% CI=0.02-0.29). In conclusion, the miR-499A>G rs3746444 polymorphism is associated with HCC risk in the Chinese population, and may be a useful predictive marker for HCC susceptibility.

Development and Evaluation of Electronic Health Record Data-Driven Predictive Models for Pressure Ulcers (전자건강기록 데이터 기반 욕창 발생 예측모델의 개발 및 평가)

  • Park, Seul Ki;Park, Hyeoun-Ae;Hwang, Hee
    • Journal of Korean Academy of Nursing / v.49 no.5 / pp.575-585 / 2019
  • Purpose: The purpose of this study was to develop predictive models for pressure ulcer incidence using electronic health record (EHR) data and to compare their predictive validity performance indicators with those of the Braden Scale used in the study hospital. Methods: A retrospective case-control study was conducted in a tertiary teaching hospital in Korea. Data of 202 pressure ulcer patients and 14,705 non-pressure ulcer patients admitted between January 2015 and May 2016 were extracted from the EHRs. Three predictive models for pressure ulcer incidence were developed using logistic regression, Cox proportional hazards regression, and decision tree modeling. The predictive validity performance indicators of the three models were compared with those of the Braden Scale. Results: The logistic regression model was the most efficient, with a high area under the receiver operating characteristic curve (AUC) estimate of 0.97, followed by the decision tree model (AUC 0.95), the Cox proportional hazards regression model (AUC 0.95), and the Braden Scale (AUC 0.82). Decreased mobility was the most significant factor in the logistic regression and Cox proportional hazards models, and the endotracheal tube was the most important factor in the decision tree model. Conclusion: Predictive validity performance indicators of the Braden Scale were lower than those of the logistic regression, Cox proportional hazards regression, and decision tree models. The models developed in this study can be used to develop a clinical decision support system that automatically assesses risk for pressure ulcers to aid nurses.
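
Several abstracts in this listing, including the one above, compare models by AUC. As a sketch of what the number measures, the code below computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `auc` and the scores and labels are hypothetical and do not come from any of the studies.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: model risk scores (higher means more likely positive)
    labels: 1 for positive cases, 0 for negative cases
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; a tie counts as 0.5.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical scores and outcomes, for illustration only.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc_value = auc(scores, labels)  # 8 of 9 pairs are ordered correctly
print(round(auc_value, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise form is quadratic in the number of cases, so it suits small examples rather than large datasets.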

A Deep Learning-Based Model for Predicting Traffic Congestion in Semiconductor Fabrication (딥러닝을 활용한 반도체 제조 물류 시스템 통행량 예측모델 설계)

  • Kim, Jong Myeong;Kim, Ock Hyeon;Hong, Sung Bin;Lim, Dae-Eun
    • Journal of Industrial Technology / v.39 no.1 / pp.27-31 / 2019
  • Semiconductor logistics systems face difficulties in increasing production as production processes become more complicated with the advance of fine processes. Therefore, the purpose of this research is to design a predictive model that can predict traffic at the pre-planning stage, identify the risk zones that arise during production, and prevent them in advance. As a solution, we build FABs in AutoMod simulation to collect data. The traffic prediction model for the areas of interest is then constructed using deep learning techniques (Keras, multilayer perceptron structure). The predictive model estimated traffic in the areas of interest with an accuracy of about 87%. Because the maximum traffic per section is predicted, the model can serve as an indicator for decision-making by proactively identifying congestion risk areas during the fab design or factory expansion planning stage.

Development of Big Data-based Cardiovascular Disease Prediction Analysis Algorithm

  • Kyung-A KIM;Dong-Hun HAN;Myung-Ae CHUNG
    • Korean Journal of Artificial Intelligence / v.11 no.3 / pp.29-34 / 2023
  • With the recent rapid development of artificial intelligence technology, many studies are being conducted to predict the risk of heart disease in order to lower the mortality rate of cardiovascular diseases worldwide. This study aims to develop "Lifestyle Improvement Contents (Digital Therapy)" for cardiovascular disease care, which present exercise or dietary improvement contents to patients with cardiovascular disease in the form of a software app or web service and help with management or treatment through digital devices such as mobile phones and PCs. For this purpose, we compared and analyzed cardiovascular disease prediction models using the machine learning algorithms LR, LDA, SVM, and XGBoost. Among these, the XGBoost model showed the best predictive performance, with an overall accuracy of around 80%. Overall, accuracy was 80.0%, the F1 score was 0.77~0.79, and ROC-AUC was 80%~84%. Therefore, the algorithm used in this study can serve as a reference model for verifying the validity and accuracy of cardiovascular disease prediction. A cardiovascular disease prediction analysis algorithm can be developed that takes in accurate biometric data collected in future clinical trials, adds lifestyle management elements (exercise, eating habits, etc.), and verifies the effect and efficacy on cardiovascular-related bio-signals and disease risk, ultimately suggesting that it is possible to develop lifestyle improvement contents (Digital Therapy).
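
The abstract above reports accuracy and F1 score. The sketch below shows how both follow from the counts of a binary confusion matrix. The function name `classification_metrics` and the counts are hypothetical and were chosen only so the arithmetic is easy to follow; they are not the study's data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f1

# Hypothetical counts, for illustration only.
accuracy, precision, recall, f1 = classification_metrics(tp=40, fp=10, fn=10, tn=40)
print(accuracy, precision, recall, f1)
```

Accuracy and F1 coincide in this balanced example, but they can diverge sharply when one class is much rarer than the other, which is why both are usually reported.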