• Title/Summary/Keyword: ROC AUC

Classification models for chemotherapy recommendation using LGBM for the patients with colorectal cancer

  • Oh, Seo-Hyun;Baek, Jeong-Heum;Kang, Un-Gu
    • Journal of the Korea Society of Computer and Information / v.26 no.7 / pp.9-17 / 2021
  • In this study, as part of research on clinical decision support systems (CDSS), we propose a system that classifies chemotherapy, one of the treatment methods for colorectal cancer patients. In colorectal cancer treatment, selecting chemotherapy according to the patient's condition is very important because it is directly related to the patient's survival period. We therefore classified chemotherapy with machine learning algorithms, building a baseline model, a pathological model, and a combined model that uses both the individual and the pathological characteristics of colorectal cancer patients. Comparing prediction performance by Top-n accuracy, ROC curve, and AUC showed that the combined model gave the best predictions and that the LGBM algorithm performed best. The study thus constructed chemotherapy classification models suited to the patient's condition by building separate models for each group of patient characteristics. These results are expected to help future CDSS research produce better-performing chemotherapy classification models.
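The abstract above compares models by Top-n accuracy. As a minimal sketch of that metric (not the authors' code), the following counts a prediction as correct when the true class is among the n highest-scored classes. The function name `top_n_accuracy`, the three-class setup, and the scores are hypothetical illustration data.

```python
def top_n_accuracy(y_true, y_scores, n):
    """Fraction of samples whose true class is among the n highest-scored classes."""
    hits = 0
    for true_label, scores in zip(y_true, y_scores):
        # Class indices ordered from highest to lowest predicted score.
        ranked = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
        if true_label in ranked[:n]:
            hits += 1
    return hits / len(y_true)


# Hypothetical scores for 4 patients over 3 candidate regimens (classes 0, 1, 2).
y_true = [0, 1, 2, 1]
y_scores = [
    [0.7, 0.2, 0.1],  # true class 0 is ranked first
    [0.5, 0.4, 0.1],  # true class 1 is ranked second
    [0.3, 0.5, 0.2],  # true class 2 is ranked third
    [0.1, 0.8, 0.1],  # true class 1 is ranked first
]

top1 = top_n_accuracy(y_true, y_scores, 1)  # 0.5
top2 = top_n_accuracy(y_true, y_scores, 2)  # 0.75
print(top1, top2)
```

Top-n accuracy can only stay the same or rise as n grows, so it is usually reported for several values of n alongside AUC.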

Prediction of Drug Side Effects Based on Drug-Related Information (약물 관련 정보를 이용한 약물 부작용 예측)

  • Seo, Sukyung;Lee, Taekeon;Yoon, Youngmi
    • The Journal of Korean Institute of Information Technology / v.17 no.12 / pp.21-28 / 2019
  • Side effects of drugs are harmful and unintended effects resulting from drugs used to prevent, diagnose, or treat diseases. These side effects can lead to patients' deaths and are a main cause of drug development failures. Thus, various methods have been tried to identify side effects. These can be divided into biological and systems biology approaches. In this study, we use a systems biology approach and focus on using various phenotypic information in addition to the chemical structure and target proteins. First, we collect the datasets used in this study and calculate similarities for each individually. Second, we generate a set of features using the similarities for each drug-side effect pair. Finally, we confirm the results by AUC (Area Under the ROC Curve) and show the significance of this study through a comparison experiment.

AI-based Construction Site Prioritization for Safety Inspection Using Big Data (빅데이터를 활용한 AI 기반 우선점검 대상현장 선정 모델)

  • Hwang, Yun-Ho;Chi, Seokho;Lee, Hyeon-Seung;Jung, Hyunjun
    • KSCE Journal of Civil and Environmental Engineering Research / v.42 no.6 / pp.843-852 / 2022
  • Despite continuous safety management, the death rate of construction workers is not decreasing from year to year. Accordingly, various studies are in progress to prevent construction site accidents. In this paper, we developed an AI-based priority inspection target selection model that preferentially selects sites that are expected to have construction accidents among construction sites with construction costs of less than 5 billion won (KRW). In particular, Random Forest (accident prediction AUC-ROC of 90.48%) showed the best performance among the applied AI classification algorithms. The main factors causing construction accidents were construction costs, total number of construction days, and the number of construction performance evaluations. In this study, an ROI (return on investment) of about 917.7% is predicted over 8 years as a result of more efficient use of human resources for manual inspections and a preemptive response to construction accidents.

Comparison of Machine Learning Model Performance based on Observation Methods using Naked-eye and Visibility-meter (머신러닝을 이용한 안개 예측 시 목측과 시정계 계측 방법에 따른 모델 성능 차이 비교)

  • Changhyoun Park;Soon-hwan Lee
    • Journal of the Korean earth science society / v.44 no.2 / pp.105-118 / 2023
  • In this study, we predicted the presence of fog with a one-hour delay using the XGBoost DART machine learning algorithm for Andong, which had the highest occurrence of fog among inland stations from 2016 to 2020. We used six datasets: meteorological data, agricultural observation data, additional derived data, and their expanded data. The weather phenomenon numbers obtained through naked-eye observations and the visibility distances measured by visibility meters were classified as fog [1] or no-fog [0]. We set up twelve machine learning modeling experiments and used data from 2021 for model validation. We mainly evaluated model performance using recall and AUC-ROC, considering the harmful effects of fog on society and local communities. The combination of oversampled meteorological data features and the target induced by weather phenomenon numbers showed the best performance. This result highlights the importance of naked-eye observations in predicting fog using machine learning algorithms.

Detection and Prediction of Subway Failure using Machine Learning (머신러닝을 이용한 지하철 고장 탐지 및 예측)

  • Kuk-Kyung Sung
    • Advanced Industrial SCIence / v.2 no.4 / pp.11-16 / 2023
  • The subway is a means of public transportation that plays an important role in the transportation system of modern cities. However, congestion often occurs due to sudden breakdowns and system outages, causing inconvenience. Therefore, in this paper, we conducted a study on failure prediction and prevention using machine learning to operate the subway system efficiently. Using UC Irvine's MetroPT-3 dataset, we built a subway breakdown prediction model using logistic regression. The model predicted the non-failure state with a high accuracy of 0.991. However, precision and recall were relatively low, suggesting the possibility of error in failure prediction. The ROC_AUC value was 0.901, indicating that the model classifies better than random guessing. The constructed model is useful for stable operation of the subway system, but additional research is needed to improve performance. In the future, with more training data and better-cleaned data, failures could be prevented by pre-inspection based on prediction.
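The abstract above reports high accuracy together with low recall, a pattern typical of imbalanced failure data. The sketch below illustrates it on hypothetical scores (not the MetroPT-3 data) and computes ROC AUC directly as the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counted as one half. The function name `roc_auc` and the 0.5 decision threshold are assumptions made for illustration.

```python
def roc_auc(y_true, y_score):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a correctly ranked pair
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced sample: 2 failures (label 1) among 20 observations.
y_true = [1, 1] + [0] * 18
y_score = [0.6, 0.3] + [0.4, 0.2] + [0.1] * 16

# Hard predictions at an assumed threshold of 0.5.
y_pred = [1 if s >= 0.5 else 0 for s in y_score]
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)  # 0.95
recall = tp / (tp + fn)                                                    # 0.5
auc = roc_auc(y_true, y_score)                                             # 35/36

print(accuracy, recall, round(auc, 3))
```

Accuracy is dominated by the majority class, recall depends on the chosen threshold, and AUC summarizes ranking quality over all thresholds, which is why the three can disagree.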

Comparison of the Predictive Validity of the Pressure Injury Risk Assessment in Pediatric Patients: Braden, Braden Q and Braden QD Scale (소아 환자에서 욕창 위험도 사정 도구의 예측타당도 비교: Braden, Braden Q 및 Braden QD 도구)

  • Kang, Ji Hyeon;Lim, Eun Young;Lee, Nam Ju;Yu, Hye Min
    • Journal of Korean Clinical Nursing Research / v.30 no.1 / pp.35-44 / 2024
  • Purpose: The purpose of this study is to compare the predictive validity of the Braden, Braden Q, and Braden QD pressure injury risk assessment scales for pediatric patients. Methods: This prospective observational study included patients under the age of 19 who were admitted to the general wards or intensive care units of a children's hospital. Characteristics related to pressure injury were collected, and predictive validity was compared by calculating the areas under the curve (AUC) of the Braden, Braden Q, and Braden QD scales. Results: A total of 689 patients were included in the study. A total of 13 (1.9%) patients had pressure injuries, and the number of pressure injuries was 17. Of these injuries, 9 (52.9%) were immobility-related and 8 (47.1%) were medical device-related. The AUC for each scale was .91 (95% CI .89~.94) for Braden, .92 (95% CI .90~.95) for Braden Q, and .94 (95% CI .92~.96) for Braden QD. The optimal cut-off points were identified as 16 for Braden (sensitivity=88.8%, specificity=86.4%), 17 for Braden Q (sensitivity=63.6%, specificity=94.9%), and 12 for Braden QD (sensitivity=94.4%, specificity=88.7%). Conclusion: The Braden QD scale demonstrated the highest predictive validity for pressure injuries in pediatric patients and is expected to be a valuable tool in preventing pediatric pressure injuries.
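The abstract above reports optimal cut-off points with their sensitivity and specificity but does not state how the cut-offs were chosen. One common choice is Youden's J index (sensitivity + specificity - 1), sketched below as an assumption rather than the authors' method. The scores are hypothetical, and the rule "score at or below the cut-off means at risk" assumes a scale on which lower scores indicate higher risk.

```python
def sensitivity_specificity(scores, injured, cutoff):
    """Classify score <= cutoff as 'at risk' (assumes lower score = higher risk)."""
    tp = sum(1 for s, y in zip(scores, injured) if y and s <= cutoff)
    fn = sum(1 for s, y in zip(scores, injured) if y and s > cutoff)
    tn = sum(1 for s, y in zip(scores, injured) if not y and s > cutoff)
    fp = sum(1 for s, y in zip(scores, injured) if not y and s <= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, injured):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J."""
    best = None
    for candidate in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, injured, candidate)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, candidate, sens, spec)
    return best[1:]


# Hypothetical risk scores: first 4 patients developed an injury, last 6 did not.
scores = [10, 12, 13, 17, 14, 16, 18, 19, 20, 21]
injured = [True] * 4 + [False] * 6

cutoff, sens, spec = best_cutoff(scores, injured)
print(cutoff, sens, spec)  # 13 0.75 1.0
```

Other criteria, such as the point closest to the top-left corner of the ROC curve or a clinically required minimum sensitivity, can select a different cut-off from the same data.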

Differential Validity of K-MoCA-22 Compared to K-MoCA-30 and K-MMSE for Screening MCI and Dementia

  • Haeyoon Kim;Kyung-Ho Yu;Yeonwook Kang
    • Dementia and Neurocognitive Disorders / v.23 no.4 / pp.236-244 / 2024
  • Background and Purpose: Since the onset of the coronavirus disease 2019 pandemic, the Telephone-Montreal Cognitive Assessment (T-MoCA) has gained popularity as a remote cognitive screening tool. T-MoCA includes items from the original MoCA (MoCA-30), excluding those requiring visual stimuli, resulting in a maximum score of 22 points. This study aimed to assess whether the T-MoCA items (MoCA-22) demonstrate comparable discriminatory power to MoCA-30 and Mini-Mental State Examination (MMSE) in screening for mild cognitive impairment (MCI) and dementia. Methods: Participants included 233 cognitively normal (CN) individuals, 175 with MCI, and 166 with dementia. All completed the Korean-MoCA-30 (K-MoCA-30) and Korean-MMSE (K-MMSE), with the Korean-MoCA-22 (K-MoCA-22) scores derived from the K-MoCA-30 responses. A receiver operating characteristic (ROC) curve analysis was conducted. Results: K-MoCA-22 showed a strong correlation with K-MoCA-30 and a moderate correlation with K-MMSE. Scores decreased progressively from CN to MCI and dementia, with significant differences between groups, consistent with K-MoCA-30 and K-MMSE. The study also explored modified K-MoCA-22 index scores across 5 cognitive domains. ROC curve analysis revealed that the area under the curve (AUC) for K-MoCA-22 was significantly smaller than that for K-MoCA-30 in distinguishing both MCI and dementia from CN. However, no significant difference in AUC was found between K-MoCA-22 and K-MMSE, indicating similar discriminatory power. Additionally, the discriminability of K-MoCA-22 varied by education level. Conclusions: These results indicate that K-MoCA-22, although slightly less effective than K-MoCA-30, still shows good to excellent discriminatory power and is comparable to K-MMSE in screening for MCI and dementia.

Verification of the Suitability of Fine Dust and Air Quality Management Systems Based on Artificial Intelligence Evaluation Models

  • Heungsup Sim
    • Journal of the Korea Society of Computer and Information / v.29 no.8 / pp.165-170 / 2024
  • This study aims to verify the accuracy of the air quality management system in Yangju City using an artificial intelligence (AI) evaluation model. The consistency and reliability of fine dust data were assessed by comparing public data from the Ministry of Environment with data from Yangju City's air quality management system. To this end, we analyzed the completeness, uniqueness, validity, consistency, accuracy, and integrity of the data. Exploratory statistical analysis was employed to compare data consistency. The results of the AI-based data quality index evaluation revealed no statistically significant differences between the two datasets. Among AI-based algorithms, the random forest model demonstrated the highest predictive accuracy, with its performance evaluated through ROC curves and AUC. Notably, the random forest model was identified as a valuable tool for optimizing the air quality management system. This study confirms that the reliability and suitability of fine dust data can be effectively assessed using AI-based model performance evaluation, contributing to the advancement of air quality management strategies.

Comparison of radiomics prediction models for lung metastases according to four semiautomatic segmentation methods in soft-tissue sarcomas of the extremities

  • Heesoon Sheen;Han-Back Shin;Jung Young Kim
    • Journal of the Korean Physical Society / v.80 / pp.247-256 / 2022
  • Our objective was to investigate radiomics signatures and prediction models defined by four segmentation methods using 2-[18F]fluoro-2-deoxy-d-glucose positron emission tomography (18F-FDG PET) imaging of lung metastases of soft-tissue sarcomas (STSs). For this purpose, three fixed threshold methods using the standardized uptake value (SUV) and gradient-based edge detection (ED) were used for tumor delineation on the PET images of STSs. The Dice coefficients (DCs) of the segmentation methods were compared. Least absolute shrinkage and selection operator (LASSO) regression, Spearman's rank correlation, and Friedman's ANOVA test were used for selection and validation of radiomics features. The developed radiomics models were assessed using receiver operating characteristic (ROC) curves and confusion matrices. According to the results, the DC values showed the biggest difference between SUV40% and the other segmentation methods (DC: 0.55 and 0.59). Grey-level run-length matrix_run-length nonuniformity (GLRLM_RLNU) was a common radiomics signature extracted by all segmentation methods. The multivariable logistic regression of ED showed the highest area under the ROC curve (AUC), sensitivity, specificity, and accuracy (AUC: 0.88, sensitivity: 0.85, specificity: 0.74, accuracy: 0.81). In our research, the ED method was able to derive a significant radiomics model. GLRLM_RLNU, which was selected as a meaningful feature under all segmentation methods, was considered a clear radiomics feature associated with heterogeneity and aggressiveness. Our results show that radiomics signatures have the potential to uncover tumor characteristics.
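The abstract above compares segmentation methods by their Dice coefficients. As a minimal sketch, the Dice coefficient of two segmentations A and B is 2|A ∩ B| / (|A| + |B|). Here each segmentation is represented as a set of voxel coordinates. The function name `dice_coefficient`, the toy masks, and the convention of returning 1.0 for two empty masks are assumptions for illustration.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two sets of voxel coordinates."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # assumed convention: two empty segmentations agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical 2-D masks, given as (row, column) coordinates of segmented pixels.
threshold_mask = {(0, 0), (0, 1), (1, 0), (1, 1)}
edge_mask = {(0, 1), (1, 1), (2, 1)}

dc = dice_coefficient(threshold_mask, edge_mask)  # 2 * 2 / (4 + 3) = 4/7
print(round(dc, 3))
```

A value of 1 means identical segmentations and 0 means no overlap, so the DC values of 0.55 and 0.59 quoted above indicate only partial agreement between the compared methods.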

Research on Outlier and Missing Value Correction Methods to Improve Smart Farm Data Quality (스마트팜 데이터 품질 향상을 위한 이상치 및 결측치 보정 방법에 관한 연구)

  • Sung-Jae Lee;Hyun Sim
    • The Journal of the Korea institute of electronic communication sciences / v.19 no.5 / pp.1027-1034 / 2024
  • This study aims to address the issues of outliers and missing values in AI-based smart farming to improve data quality and enhance the accuracy of agricultural predictive activities. By utilizing real data provided by the Rural Development Administration (RDA) and the Korea Agency of Education, Promotion, and Information Service in Food, Agriculture, Forestry, and Fisheries (EPIS), outlier detection and missing value imputation techniques were applied to collect and manage high-quality data. For successful smart farm operations, an IoT-based AI automatic growth measurement model is essential, and achieving a high data quality index through stable data preprocessing is crucial. In this study, various methods for correcting outliers and imputing missing values in growth data were applied, and the proposed preprocessing strategies were validated using machine learning performance evaluation indices. The results showed significant improvements in model performance, with high predictive accuracy observed in key evaluation metrics such as ROC and AUC.
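The abstract above mentions outlier correction and missing value imputation without naming the specific methods. The sketch below shows one simple combination, chosen here as an assumption and not taken from the paper: flag values outside the 1.5 × IQR fences as outliers, then fill both outliers and missing readings with the median of the remaining values. The function name `clean_series` and the sensor readings are hypothetical.

```python
import statistics


def clean_series(values, k=1.5):
    """Replace IQR outliers and missing values (None) with the median of valid data."""
    observed = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(observed, n=4)  # quartiles of observed data
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr

    # Mark anything outside the fences as missing so it is imputed as well.
    masked = [v if v is not None and low <= v <= high else None for v in values]
    fill = statistics.median(v for v in masked if v is not None)
    return [fill if v is None else v for v in masked]


# Hypothetical growth-sensor readings with two gaps and one spike (55.0).
readings = [20.1, 20.4, None, 20.3, 55.0, 20.2, None, 20.5]
cleaned = clean_series(readings)
print(cleaned)  # [20.1, 20.4, 20.3, 20.3, 20.3, 20.2, 20.3, 20.5]
```

Median filling ignores temporal structure. For time-ordered sensor data, interpolation between neighboring readings is a common alternative.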