• Title/Summary/Keyword: ROC Curve

Learning Behavior Analysis of Bayesian Algorithm Under Class Imbalance Problems (클래스 불균형 문제에서 베이지안 알고리즘의 학습 행위 분석)

  • Hwang, Doo-Sung
    • Journal of the Institute of Electronics Engineers of Korea CI / v.45 no.6 / pp.179-186 / 2008
  • In this paper we analyse how the Bayesian algorithm behaves when learning class imbalance problems and compare performance evaluation methods. The learning performance of the Bayesian algorithm is evaluated on class imbalance problems generated by varying the prior data distribution, the imbalance rate and the discrimination complexity. The experimental results are reported as AUC (Area Under the Curve) values of both the ROC (Receiver Operating Characteristic) and PR (Precision-Recall) measures and compared across imbalance rates and discrimination complexities. The analysis shows that the Bayesian algorithm suffers as the imbalance rate grows, in line with previously reported research, and that the data overlap caused by discrimination complexity is another factor that hampers learning performance. As the discrimination complexity and class imbalance rate increase, the AUC of the PR measure varies much more than the AUC of the ROC measure, whereas the two measures behave similarly when discrimination complexity and class imbalance rate are low. The experimental results show that the AUC of the PR measure is better suited to evaluating learning on class imbalance problems and is also useful when designing an optimal learning model that takes misclassification cost into account.
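The contrast between the two measures can be illustrated with a small sketch. It is not the paper's experiment: the scores and labels are hypothetical, the ROC area uses the trapezoidal rule, and the PR curve is summarised by average precision (one common step-wise stand-in for the PR AUC). Both classes are assumed to be present.

```python
def sweep(labels, scores):
    """Cumulative (tp, fp) counts as the threshold drops past each distinct score."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points, tp, fp, i = [], 0, 0, 0
    while i < len(pairs):
        j = i
        # Items sharing one score cross the threshold together.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((tp, fp))
        i = j
    return points


def roc_auc(labels, scores):
    """Area under the ROC curve (FPR on x, TPR on y) by the trapezoidal rule."""
    pos = sum(labels)
    neg = len(labels) - pos
    area, prev_tp, prev_fp = 0.0, 0, 0
    for tp, fp in sweep(labels, scores):
        area += (fp - prev_fp) * (tp + prev_tp) / 2.0
        prev_tp, prev_fp = tp, fp
    return area / (pos * neg)


def average_precision(labels, scores):
    """Sum of precision times recall increment over the threshold sweep."""
    pos = sum(labels)
    ap, prev_tp = 0.0, 0
    for tp, fp in sweep(labels, scores):
        ap += (tp / (tp + fp)) * (tp - prev_tp) / pos
        prev_tp = tp
    return ap


# Hypothetical classifier scores: 4 positives and 4 negatives.
pos_scores = [0.9, 0.8, 0.6, 0.4]
neg_scores = [0.7, 0.5, 0.3, 0.2]

bal_scores = pos_scores + neg_scores
bal_labels = [1] * 4 + [0] * 4

# Same score distributions, but with ten times as many negatives.
imb_scores = pos_scores + neg_scores * 10
imb_labels = [1] * 4 + [0] * 40

print("balanced:   ROC AUC", roc_auc(bal_labels, bal_scores),
      "AP", average_precision(bal_labels, bal_scores))
print("imbalanced: ROC AUC", roc_auc(imb_labels, imb_scores),
      "AP", average_precision(imb_labels, imb_scores))
```

In this toy setup the ROC AUC is 0.8125 for both data sets, while average precision falls from about 0.85 to about 0.60 once the negatives are multiplied, which is the kind of sensitivity to imbalance the abstract attributes to the PR measure.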

Surveying and Optimizing the Predictors for Ependymoma Specific Survival using SEER Data

  • Cheung, Min Rex
    • Asian Pacific Journal of Cancer Prevention / v.15 no.2 / pp.867-870 / 2014
  • Purpose: This study used receiver operating characteristic (ROC) curves to analyze Surveillance, Epidemiology and End Results (SEER) ependymoma data to identify predictive models and potential disparities in outcome. Materials and Methods: This study analyzed socio-economic, staging and treatment factors available in the SEER database for ependymoma. For the risk modeling, each factor was fitted by a Generalized Linear Model to predict the outcome ('brain and other nervous systems' specific death, yes/no). The area under the ROC curve was computed. Similar strata were combined to construct the most parsimonious models. A random sampling algorithm was used to estimate the modeling errors. Risk of ependymoma death was computed for the predictors for comparison. Results: A total of 3,500 patients diagnosed from 1973 to 2009 were included in this study. The mean (S.D.) follow-up time was 79.8 (82.3) months. Some 46% of the patients were female. The mean (S.D.) age was 34.4 (22.8) years. Age was the most predictive factor of outcome. Unknown grade carried a 15% risk of cause-specific death, compared to 9% for grades I and II and 36% for grades III and IV. A 5-tiered grade model (ROC area of 0.48) was optimized to a 3-tiered model (ROC area of 0.53). This ROC area tied for second with that for surgery. African-American patients had a 21.5% risk of death compared with 16.6% for the others. Some 72.7% of patients who did not get RT had cerebellar or spinal ependymoma. Patients undergoing surgery had a 16.3% risk of death, compared to 23.7% among those who did not have surgery. Conclusion: Grading ependymoma may dramatically improve modeling of data. RT is underused for cerebellar and spinal cord ependymoma and may be a potential way to improve outcome.

Analysis of stage III proximal colon cancer using the Cox proportional hazards model (Cox 비례위험모형을 이용한 우측 대장암 3기 자료 분석)

  • Lee, Taeseob;Lee, Minjung
    • Journal of the Korean Data and Information Science Society / v.28 no.2 / pp.349-359 / 2017
  • In this paper, we conducted survival analyses by fitting the Cox proportional hazards model to stage III proximal colon cancer data obtained from the Surveillance, Epidemiology, and End Results program of the National Cancer Institute. We investigated the effect of covariates on the hazard function for death from stage III proximal colon cancer in patients who underwent surgery, and estimated the survival probability for a patient with specific covariates. Using a test based on the Schoenfeld residuals, plots of the Schoenfeld residuals, and plots of $\log[-\log\{\hat{S}(t)\}]$, we showed that the proportional hazards assumption is satisfied for the covariates used in the analyses. We evaluated model calibration and discriminatory accuracy with a calibration plot and the time-dependent area under the ROC curve, both calculated using 10-fold cross validation.
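The log-minus-log plot named in this abstract serves as a proportional hazards check for the following reason, sketched here with notation that is not in the abstract (baseline hazard $h_0$, baseline survival $S_0$, coefficient vector $\beta$, covariate vector $x$):

$$
h(t \mid x) = h_0(t)\,e^{\beta^\top x}
\;\Longrightarrow\;
S(t \mid x) = S_0(t)^{\exp(\beta^\top x)}
\;\Longrightarrow\;
\log[-\log S(t \mid x)] = \log[-\log S_0(t)] + \beta^\top x .
$$

Under the model, the curves for different covariate groups differ only by a vertical shift, so roughly parallel curves are consistent with the assumption.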

High-impact chronic pain: evaluation of risk factors and predictors

  • Ilteris Ahmet Senturk;Erman Senturk;Isil Ustun;Akin Gokcedag;Nilgun Pulur Yildirim;Nilufer Kale Icen
    • The Korean Journal of Pain / v.36 no.1 / pp.84-97 / 2023
  • Background: The concept of high-impact chronic pain (HICP) has been proposed for patients with chronic pain who have significant limitations in work, social life, and personal care. Recognizing HICP and distinguishing patients with HICP from other chronic pain patients who do not have life interference allows the necessary measures to be taken to restore the physical and emotional functioning of the affected persons. The aim was to reveal the risk factors and predictors associated with HICP. Methods: Patients with chronic pain without life interference (grade 1 and 2) and patients with HICP were compared. Significant data were evaluated with regression analysis to reveal the associated risk factors. Receiver operating characteristic (ROC) analysis was used to evaluate predictors and present cut-off scores. Results: One thousand and six patients completed the study. Among pain-related cognitive processes, fear of pain (odds ratio [OR], 0.92; 95% confidence interval [CI], 0.87-0.98; P = 0.007) and helplessness (OR, 1.06; 95% CI, 1.01-1.12; P = 0.018) were found to be risk factors associated with HICP. Predictors of HICP were evaluated by ROC analysis. The highest discrimination value was found for pain intensity (cut-off score > 6.5; 83.8% sensitive; 68.7% specific; area under the curve = 0.823; P < 0.001). Conclusions: This is the first study in our region to evaluate HICP with measurement tools that evaluate all dimensions of pain. Moreover, it is the first study in the literature to evaluate predictors and cut-off scores for HICP using ROC analysis.

Effects of a Five Times Sit to Stand Test on the Daily Life Independence of Korean Elderly and Cut-Off Analysis

  • Nam, Seung-Min;Kim, Seong-Gil
    • Journal of the Korean Society of Physical Medicine / v.14 no.4 / pp.29-35 / 2019
  • PURPOSE: The aim of this study was to provide a standard value of the Five Times Sit to Stand Test (FTSST) for daily life independence among the elderly in Korea and to examine the effect of test performance on their daily lives. METHODS: This study was conducted on people over 65 years of age living in Gyeongsangbuk-do, Korea. The FTSST was performed from a seated position on a chair. The subjects were classified into independent and dependent living groups according to their lifestyle, and the influence of the FTSST was then examined through logistic regression analysis. To determine the usefulness and cut-off value of the FTSST, an ROC curve analysis was performed. RESULTS: The elderly were more likely to live in a group rather than independently as the FTSST time increased (p<.05) (OR=1.098). The area under the ROC curve was .707, again indicating that a subject was more likely to live in a group rather than independently as the FTSST time increased (p<.05). The cut-off value was assigned from the sensitivity and specificity coordinates of the ROC curve. At 15.62 seconds, the sensitivity and specificity were .626 and .753, respectively. CONCLUSION: The elderly in Korea are more likely to live a group-dependent lifestyle than to live independently, and this likelihood increases further for every additional second beyond 15.62 seconds. The loss of independence in daily life could be predicted from a subject's lower leg strength using the FTSST.
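The abstract does not say which rule was used to pick the cut-off from the ROC coordinates. One common choice, assumed here purely for illustration, is Youden's index J = sensitivity + specificity - 1. The sketch below applies it to hypothetical test times; the function name and data are not from the study.

```python
def youden_cutoff(labels, values):
    """Return (cutoff, sensitivity, specificity) maximising J = sens + spec - 1.

    A subject is classified positive when value > cutoff.
    Assumes both classes are present and at least two distinct values exist.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    distinct = sorted(set(values))
    # Candidate cut-offs: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best = None
    for c in candidates:
        tp = sum(1 for y, v in zip(labels, values) if y == 1 and v > c)
        tn = sum(1 for y, v in zip(labels, values) if y == 0 and v <= c)
        sens = tp / pos
        spec = tn / neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical test times in seconds; label 1 = dependent living, 0 = independent.
times = [14.0, 16.5, 17.0, 18.5, 21.0, 10.5, 12.0, 13.0, 13.5, 17.5]
group = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

cutoff, sensitivity, specificity = youden_cutoff(group, times)
print(cutoff, sensitivity, specificity)
```

Other rules, such as the point closest to the top-left corner of the ROC plot, can select a different cut-off from the same curve, so the criterion should be reported alongside the value.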

Nomogram comparison conducted by logistic regression and naïve Bayesian classifier using type 2 diabetes mellitus (T2D) (제 2형 당뇨병을 이용한 로지스틱과 베이지안 노모그램 구축 및 비교)

  • Park, Jae-Cheol;Kim, Min-Ho;Lee, Jea-Young
    • The Korean Journal of Applied Statistics / v.31 no.5 / pp.573-585 / 2018
  • In this study, we fit a logistic regression model and a naïve Bayesian classifier model using 11 risk factors to predict the probability of incidence of type 2 diabetes mellitus. We then show how to construct a nomogram that helps people understand the model visually. We use data from the 2013-2015 Korean National Health and Nutrition Examination Survey (KNHANES). We include 3 interaction terms in the logistic regression model to improve the quality of the analysis, and we apply the left-aligned method to the Bayesian nomogram to make it easier to use. Finally, we compare the two nomograms, examine their utility, and verify them using the ROC curve.

Accuracy Evaluation and Alert Level Setting for Real-time Cyanobacteria Measurement Using Receiver Operating Characteristic Curve Analysis (ROC 분석을 이용한 수질자동측정소 실시간 남조류 측정의 정확성 평가 및 경보기준 설정)

  • Song, Sanghwan;Park, Jong-hwan;Kang, Tae-Woo;Kim, Young-Suk;Kim, Jihyun;Kang, Taegu
    • Journal of Korean Society on Water Environment / v.33 no.2 / pp.130-139 / 2017
  • To evaluate the accuracy of real-time measurement of cyanobacterial fluorescence for detecting cyanobacterial blooms, this research examined 357 paired data points (2013-2016) comprising microscopic toxic cyanobacterial cell counts and concurrent real-time cyanobacterial concentrations at 2 sites (YS1 and YS2) in the Yeongsan River. The increase in real-time cyanobacterial concentration was closely associated with the exceedance of 5,000 cyanobacterial cells/ml (odds ratio [OR] 1.07, 95% confidence interval [CI] 1.03-1.12) and 10,000 cells/ml (OR 1.08, 95% CI 1.04-1.12) at the YS2 site. The area under the receiver operating characteristic (ROC) curve for the real-time cyanobacterial measurement at the YS2 site was 0.93, which indicates that the measurement detects cyanobacterial blooms with high accuracy. On the ROC curve, early alert levels of real-time cyanobacteria in the range 16-23 $\mu$g chl-a/L would produce an acceptable sensitivity of 79% and specificities greater than 90%. The real-time fluorescence measurement was found to be an accurate indicator of cyanobacteria and can serve as a tool for detecting toxic cyanobacterial bloom events in the Yeongsan River.

Model Based on Alkaline Phosphatase and Gamma-Glutamyltransferase for Gallbladder Cancer Prognosis

  • Xu, Xin-Sen;Miao, Run-Chen;Zhang, Ling-Qiang;Wang, Rui-Tao;Qu, Kai;Pang, Qing;Liu, Chang
    • Asian Pacific Journal of Cancer Prevention / v.16 no.15 / pp.6255-6259 / 2015
  • Purpose: To evaluate the prognostic value of alkaline phosphatase (ALP) and gamma-glutamyltransferase (GGT) in gallbladder cancer (GBC). Materials and Methods: Serum ALP and GGT levels and clinicopathological parameters were retrospectively evaluated in 199 GBC patients. Receiver operating characteristic (ROC) curve analysis was performed to determine the cut-off values of ALP and GGT. Then, associations with overall survival were assessed by multivariate analysis. Based on the significant factors, a prognostic score model was established. Results: By ROC curve analysis, ALP $\geq$ 210 U/L and GGT $\geq$ 43 U/L were considered elevated. Overall survival for patients with elevated ALP and GGT was significantly worse than for patients within the normal range. Multivariate analysis showed that elevated ALP, elevated GGT and tumor stage were independent prognostic factors. Giving each positive factor a score of 1, we established a preoperative prognostic score model. Outcomes differed significantly between the score groups. In further ROC curve analysis, the simple score outperformed the widely used TNM staging, ALP or GGT alone, and traditional tumor markers such as CEA, AFP, CA125 and CA199. Conclusions: Elevated ALP and GGT levels were risk predictors in GBC patients. Our prognostic model provides information on the varied outcomes of patients from different score groups.

Assessment of Gait as a Diagnostic Tool for Patients with Dementia (치매 진단도구로서 치매노인의 보행능력 평가에 대한 연구)

  • Lee, Han-Suk;Park, Sun-Wook
    • Journal of the Korean Society of Physical Medicine / v.12 no.2 / pp.129-136 / 2017
  • PURPOSE: The purpose of this study was to compare the gait of elderly patients with and without dementia to investigate the possibility of using an ambulation assessment test as a diagnostic tool for dementia. METHODS: A total of 96 subjects were included, with 60 participants without dementia (control group) and 36 patients with dementia (dementia group). To compare the walking ability of the two groups, a 4-m walking test (4MWT) and the Groningen Meander Walking Test (GMWT) were conducted. The GMWT is graded by the time taken in seconds and by the number of oversteps outside the track. The Mann-Whitney U test was used to compare gait between the groups, and the area under the curve (AUC) of the Receiver Operating Characteristic (ROC) curve was analyzed. Statistical significance was set at p<.05, with a 95% confidence interval. RESULTS: There were statistically significant differences (p<.05) between the dementia group and the control group for the 4MWT, GMWTSEC, and GMWTSTEP scores. The AUC was .95 for 4MWT, .92 for GMWTSEC, and .96 for GMWTSTEP with the 95% confidence interval. The cut-off values of the ROC curve were 1.03 m/s for 4MWT, 10.8 seconds for GMWTSEC, and 3.75 steps for GMWTSTEP. CONCLUSION: In our study, we investigated the utility of ambulatory assessment tools to predict dementia. The results suggest that the 4MWT and the GMWT are appropriate assessment tools for dementia prediction.
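The abstract reports both a Mann-Whitney U test and an AUC. For a single test score the two are directly related, as the standard identity below recalls. The notation is assumed here and is not from the abstract: $n_1$ patients with dementia, $n_0$ controls, and a score $s$ oriented so that higher values point towards dementia.

$$
U = \sum_{i=1}^{n_1}\sum_{j=1}^{n_0}\left[\mathbf{1}(s_i > s_j) + \tfrac{1}{2}\,\mathbf{1}(s_i = s_j)\right],
\qquad
\mathrm{AUC} = \frac{U}{n_1 n_0} .
$$

With 36 patients and 60 controls there are $36 \times 60 = 2160$ patient-control pairs, and the AUC is the fraction of those pairs that the score ranks correctly, counting ties as one half.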

Cross Validation of Attention-Deficit/Hyperactivity Disorder-After School Checklist

  • Lee, Sukhyun;Kim, Bongseog;Yoo, Hanik K.;Huh, Hannah;Roh, Jaewoo
    • Journal of the Korean Academy of Child and Adolescent Psychiatry / v.29 no.3 / pp.129-136 / 2018
  • Objectives: This study aimed to evaluate the efficacy of the attention-deficit/hyperactivity disorder (ADHD)-After School Checklist (ASK) by comparing its results with those of the Comprehensive Attention Test (CAT) and the Clinical Global Impression-Severity (CGI-S) Scale and then by calculating the area under the receiver operating characteristic (ROC) curve. Methods: We performed correlation analyses on the ASK and CAT results and then on the ASK and CGI-S results. We created an ROC curve and evaluated the performance of the ASK as a diagnostic tool. We then analyzed the test results of 1348 subjects (male 56.8%), including 1201 subjects from the general population and 147 ADHD subjects, aged 6-15 years, from kindergarten to middle school in Seoul and Gyeonggi province, South Korea. Results: According to the correlation analyses, ASK scores and the Attention Quotient (AQ) of CAT scores showed significant correlations ranging from -0.20 to -0.29 (p<0.05). The t-test between ADHD scores and CGI-S was also significant (t=-2.55, p<0.05). The area under the ROC curve was calculated as 0.81, indicating good efficacy of the ASK, and the cut-off score was calculated as 15.5. Conclusion: The ASK can be used as a valid tool not only to evaluate functional impairment in children and adolescents with ADHD but also to screen for ADHD.