• Title/Summary/Keyword: ROC AUC

Application of Receiver Operating Characteristic (ROC) Curve for Evaluation of Diagnostic Test Performance (진단검사의 특성 평가를 위한 Receiver Operating Characteristic (ROC) 곡선의 활용)

  • Pak, Son-Il;Oh, Tae-Ho
    • Journal of Veterinary Clinics / v.33 no.2 / pp.97-101 / 2016
  • In clinical medicine, diagnostic accuracy studies assess the degree of agreement between the index test and the reference standard, that is, the test's ability to identify a target disorder of interest in a patient. The receiver operating characteristic (ROC) curve offers a graphical display of the trade-off between sensitivity and specificity at each cutoff for a diagnostic test and is useful in assigning the best cutoff for clinical use. To this end, ROC curve analysis is a useful tool for estimating and comparing the accuracy of competing diagnostic tests. This paper briefly reviews measures of diagnostic accuracy such as sensitivity, specificity, and the area under the ROC curve (AUC), which is a summary measure of diagnostic accuracy across the spectrum of test results. In addition, methods of creating an ROC curve for a single diagnostic test with a five-category discrete scale for distinguishing diseased from healthy individuals, meaningful interpretation of the AUC, and applications of ROC methodology in clinical medicine to determine optimal cutoff values are discussed using a hypothetical example as an illustration.
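
The construction described in this abstract can be sketched in a few lines of Python. This is only an illustration, not the paper's own example: the rating counts are invented, the AUC uses the trapezoidal rule, and the cutoff is chosen by the Youden index (sensitivity + specificity - 1), which is one common criterion and an assumption here.

```python
# Hypothetical counts per rating category
# (1 = definitely healthy ... 5 = definitely diseased). Invented data.
diseased = [3, 2, 5, 12, 28]
healthy = [30, 10, 6, 2, 2]


def roc_points(diseased, healthy):
    """Return (FPR, TPR) pairs, calling a result positive when rating >= cutoff."""
    n_d, n_h = sum(diseased), sum(healthy)
    points = [(0.0, 0.0)]  # cutoff above the top category: nothing is positive
    tp = fp = 0
    # Lower the cutoff one category at a time, starting from the top category.
    for k in range(len(diseased) - 1, -1, -1):
        tp += diseased[k]
        fp += healthy[k]
        points.append((fp / n_h, tp / n_d))
    return points


def trapezoid_auc(points):
    """Area under the piecewise-linear ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def best_cutoff_by_youden(points, n_categories):
    """Cutoff category maximizing TPR - FPR (Youden index)."""
    # points[i] for i >= 1 corresponds to the rule "rating >= n_categories - i + 1".
    best_i = max(range(1, len(points)),
                 key=lambda i: points[i][1] - points[i][0])
    return n_categories - best_i + 1


points = roc_points(diseased, healthy)
auc = trapezoid_auc(points)
cutoff = best_cutoff_by_youden(points, len(diseased))
print(f"AUC = {auc:.4f}, best cutoff: rating >= {cutoff}")
```

With a five-category scale there are only five non-trivial operating points, so the empirical curve is a coarse polygon; the trapezoidal AUC can understate the area of a smooth underlying curve.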

Partial AUC and optimal thresholds (부분 AUC와 최적분류점들)

  • Hong, Chong Sun;Cho, Hyun Su
    • The Korean Journal of Applied Statistics / v.32 no.2 / pp.187-198 / 2019
  • Extensive literature exists on how to estimate optimal thresholds based on various accuracy measures using receiver operating characteristic (ROC) and cumulative accuracy profile (CAP) curves. This paper proposes an alternative measure to represent a specific partial area under the ROC and CAP curves. The relationship between the ROC and CAP functions is examined using differential equations of the newly defined partial areas under the curves. In addition, the relationship with the optimal thresholds under various accuracy measures for the ROC and CAP functions is derived. Before finding the optimal thresholds, we assume that the mixed distribution is composed of two distribution functions, taken to be various normal distributions. The corresponding type I and type II errors are also explored and discussed under various conditions for the accuracy measures.

Optimization of Classifier Performance at Local Operating Range: A Case Study in Fraud Detection

  • Park Lae-Jeong;Moon Jung-Ho
    • International Journal of Fuzzy Logic and Intelligent Systems / v.5 no.3 / pp.263-267 / 2005
  • Building classifiers for real-world financial classification problems is often hampered by severely overlapping and highly skewed class distributions. Performance measures such as the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) have recently been introduced for evaluating and building classifiers for these kinds of problems. They are, however, ineffective for evaluating a classifier's discrimination performance in problems where interest lies only in a local operating range of the classifier. In this paper, a new method is proposed that directly improves a classifier's discrimination performance at a desired local operating range by defining and optimizing a partial area under the ROC curve or a domain-specific curve, which is difficult to achieve with conventional learning methods based on classification accuracy. The effectiveness of the proposed approach is demonstrated in terms of fraud detection capability on a real-world fraud detection problem, in comparison with the MSE-based approach.
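
To make "a partial area under the ROC curve" concrete, the sketch below computes the empirical area restricted to a false positive rate range [0, max_fpr]. It covers evaluation only, not the optimization method the paper proposes; the scores, labels, and function names are hypothetical.

```python
def roc_curve_points(scores, labels):
    """Empirical ROC points (FPR, TPR), sweeping the threshold from high to low."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        # Process tied scores together so a tie yields a single ROC point.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def partial_auc(points, max_fpr):
    """Trapezoidal area under the ROC curve for FPR in [0, max_fpr]."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Interpolate the TPR at the right edge of the range.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical classifier scores; label 1 = fraud, 0 = legitimate.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

pts = roc_curve_points(scores, labels)
full_auc = partial_auc(pts, 1.0)      # whole curve
local_auc = partial_auc(pts, 0.25)    # only the low-FPR operating range
print(full_auc, local_auc)
```

The largest possible value of the partial area is max_fpr itself, so dividing by max_fpr gives a figure on a 0 to 1 scale that is easier to compare across operating ranges.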

ROC Curve Fitting with Normal Mixtures (정규혼합분포를 이용한 ROC 분석)

  • Hong, Chong-Sun;Lee, Won-Yong
    • The Korean Journal of Applied Statistics / v.24 no.2 / pp.269-278 / 2011
  • Many studies have considered distribution functions and appropriate covariates corresponding to the scores in order to improve the accuracy of a diagnostic test, including the ROC curve, which is expressed as the relation between sensitivity and specificity. ROC analysis has been carried out with regression models that include covariates, under the assumption that the distribution function is known or estimable. In this work, we consider the general situation in which both the distribution function and the effects of covariates are unknown. For the ROC analysis, mixtures of normal distributions are used to estimate the distribution function fitted to credit evaluation data that consist of the score random variable and two sub-populations of parameters. The AUC measure is used for comparison with the nonparametric and empirical ROC curves. We conclude that the method using normal mixtures fits the classical one better than the other methods do.

The Unified Framework for AUC Maximizer

  • Jun, Jong-Jun;Kim, Yong-Dai;Han, Sang-Tae;Kang, Hyun-Cheol;Choi, Ho-Sik
    • Communications for Statistical Applications and Methods / v.16 no.6 / pp.1005-1012 / 2009
  • The area under the curve (AUC) is commonly used as a summary measure of the receiver operating characteristic (ROC) curve, which displays the performance of a set of binary classifiers for all feasible ratios of the costs associated with the true positive rate (TPR) and false positive rate (FPR). In the bipartite ranking problem, where one has to compare two different observations and decide which one is "better", the AUC measures the probability that the ranking score of a randomly chosen sample in one class is larger than that of a randomly chosen sample in the other class; hence, the function that maximizes the AUC of a bipartite ranking problem differs from the function that maximizes the accuracy (minimizes the misclassification error rate) of a binary classification problem. In this paper, we develop a unified framework for AUC maximizers that includes support vector machines, based on maximizing a large margin, and logistic regression, based on estimating the posterior probability. Moreover, we develop an efficient algorithm for the proposed unified framework. Numerical results show that the proposed unified framework can treat various methodologies successfully.
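
The pairwise interpretation of the AUC in this abstract, and the reason it differs from accuracy, can be illustrated with a small sketch. The scores are invented, ties are counted as one half (a common convention, assumed here), and the 0.5 decision threshold is an arbitrary choice for the example.

```python
def pairwise_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def accuracy(pos_scores, neg_scores, threshold=0.5):
    """Share of samples classified correctly when 'score > threshold' means positive."""
    correct = sum(s > threshold for s in pos_scores)
    correct += sum(s <= threshold for s in neg_scores)
    return correct / (len(pos_scores) + len(neg_scores))


# Hypothetical ranking scores for the two classes.
pos = [0.9, 0.8, 0.6, 0.4]
neg = [0.7, 0.5, 0.3, 0.2]

# Halving every score keeps the ranking (and so the AUC) but changes accuracy.
pos_half = [s / 2 for s in pos]
neg_half = [s / 2 for s in neg]

print(pairwise_auc(pos, neg), accuracy(pos, neg))                      # 0.8125 0.75
print(pairwise_auc(pos_half, neg_half), accuracy(pos_half, neg_half))  # 0.8125 0.5
```

Because the AUC depends only on the ordering of scores, any strictly increasing transformation of the scoring function leaves it unchanged, whereas accuracy at a fixed threshold does not share this invariance.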

Parameter estimation for the imbalanced credit scoring data using AUC maximization (AUC 최적화를 이용한 낮은 부도율 자료의 모수추정)

  • Hong, C.S.;Won, C.H.
    • The Korean Journal of Applied Statistics / v.29 no.2 / pp.309-319 / 2016
  • For binary classification models, we consider a risk score that is a function of linear scores and estimate the coefficients of the linear scores. There are two estimation methods: one obtains MLEs using logistic models, and the other estimates the coefficients by maximizing the AUC. Estimates from the AUC approach are better than MLEs from logistic models in general situations where the logistic assumptions do not hold. This paper considers imbalanced data for credit assessment models, in which the default class contains fewer observations than the non-default class, and applies the AUC approach to such data. Various logit link functions are used to generate the imbalanced data. It is found that the coefficients estimated by the AUC approach are equivalent to (or better than) those from logistic models for imbalanced data with low default probability.

Multivariate Outlier Removing for the Risk Prediction of Gas Leakage based Methane Gas (메탄 가스 기반 가스 누출 위험 예측을 위한 다변량 특이치 제거)

  • Dashdondov, Khongorzul;Kim, Mi-Hye
    • Journal of the Korea Convergence Society / v.11 no.12 / pp.23-30 / 2020
  • In this study, machine learning algorithms were used to examine the relationship between natural gas (NG) data and gas-related environmental elements in order to predict the level of gas leakage risk without directly measuring gas leakage data. The study was based on open data provided by the server using the IoT-based remote control Picarro gas sensor specification. When natural gas leaks into the air, it poses a serious problem for air pollution, the environment, and health. The proposed method is a multivariate outlier removal method based on Random Forest (RF) classification for predicting the risk of NG leaks. After unsupervised k-means clustering, the experimental dataset was imbalanced. Therefore, we focused on how well the proposed models predict medium and high risk. We compared the receiver operating characteristic (ROC) curve, accuracy, area under the ROC curve (AUC), and mean standard error (MSE) for each classification model. In our experiments, the accuracy, AUC, and MSE for MOL_RF were 99.71%, 99.57%, and 0.0016, respectively.

Estimating Discriminatory Power with Non-normality and a Small Number of Defaults

  • Hong, C.S.;Kim, H.J.;Lee, J.L.
    • The Korean Journal of Applied Statistics / v.25 no.5 / pp.803-811 / 2012
  • For credit evaluation models, we extend the study of discriminatory power based on the AUC obtained from an ROC curve when the number of defaults is small and the distribution functions of the defaults and non-defaults are normal. Since distribution functions do not satisfy normality in the real world, the distribution functions of the defaults and non-defaults are assumed to be normal mixture distributions, based on results showing that normal mixtures can fit non-normal data better than other distribution estimation methods. Using several AUC statistics, the discriminatory power under these circumstances is explored and compared with that under normal distributions.

Bayesian hierarchical model for the estimation of proper receiver operating characteristic curves using stochastic ordering

  • Jang, Eun Jin;Kim, Dal Ho
    • Communications for Statistical Applications and Methods / v.26 no.2 / pp.205-216 / 2019
  • Diagnostic tests in medical fields detect or diagnose a disease with results measured as continuous or discrete ordinal data. The performance of a diagnostic test is summarized using the receiver operating characteristic (ROC) curve and the area under the curve (AUC). A diagnostic test is considered clinically useful if the outcomes in actually-positive cases are higher than those in actually-negative cases and the ROC curve is concave. In this study, we apply the stochastic ordering method in a Bayesian hierarchical model to estimate the proper ROC curve and AUC when the diagnostic test results are measured as discrete ordinal data. We compare the conventional binormal model and the binormal model under stochastic ordering. The simulation results and a real data analysis for breast cancer indicate that the binormal model under stochastic ordering can estimate the proper ROC curve with a small bias even when the sample sizes are small or the number of actually-negative cases differs from the number of actually-positive cases. Therefore, it is appropriate to consider the binormal model under stochastic ordering when the sample sizes of the actually-negative and actually-positive groups differ greatly.
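
For readers unfamiliar with the conventional binormal model mentioned here, the following sketch evaluates its standard ROC and AUC formulas. It does not implement the Bayesian hierarchical or stochastic ordering method of the paper. The parameterization is an assumption for illustration: actually-negative results follow N(mu0, sigma0^2), actually-positive results follow N(mu1, sigma1^2), a = (mu1 - mu0) / sigma1, and b = sigma0 / sigma1.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def binormal_roc(fpr, a, b):
    """TPR at a given FPR under the binormal model: Phi(a + b * Phi^{-1}(FPR)).

    fpr must lie strictly between 0 and 1, because inv_cdf is undefined at 0 and 1.
    """
    return STD_NORMAL.cdf(a + b * STD_NORMAL.inv_cdf(fpr))


def binormal_auc(a, b):
    """Closed-form AUC under the binormal model: Phi(a / sqrt(1 + b^2))."""
    return STD_NORMAL.cdf(a / sqrt(1 + b * b))


# Hypothetical parameters: equal variances (b = 1), means one SD apart (a = 1).
a, b = 1.0, 1.0
print(round(binormal_auc(a, b), 4))          # closed-form AUC
print(round(binormal_roc(0.1, a, b), 4))     # sensitivity at 10% FPR
```

When b differs from 1, the binormal ROC curve is not concave everywhere and crosses the chance diagonal near one corner, which is the kind of "improper" shape that a proper-ROC method is meant to rule out.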

In vivo Evaluation of Flow Estimation Methods for 3D Color Doppler Imaging

  • Yoo, Yang-Mo
    • Journal of Biomedical Engineering Research / v.31 no.3 / pp.177-186 / 2010
  • In 3D ultrasound color Doppler imaging (CDI), 8-16 pulse transmissions (ensembles) per scanline are used for effective clutter rejection and flow estimation, but this yields a low volume acquisition rate. In this paper, we evaluate three flow estimation methods for a small ensemble size (E=4): autoregression (AR), eigendecomposition (ED), and autocorrelation combined with adaptive clutter rejection (AC-ACR). The performance of the AR, ED, and AC-ACR methods was compared using 2D and 3D in vivo data acquired under different clutter conditions (common carotid artery, kidney, and liver). To evaluate the effectiveness of the three methods, receiver operating characteristic (ROC) curves were generated. For 2D kidney in vivo data, the AC-ACR method outperforms the AR and ED methods in terms of the area under the ROC curve (AUC) (0.852 vs. 0.793 and 0.813, respectively). Similarly, the AC-ACR method shows higher AUC values for 2D liver in vivo data compared to the AR and ED methods (0.855 vs. 0.807 and 0.823, respectively). For the common carotid artery data, the AR method provides higher AUC values, but it suffers from biased estimates. For 3D in vivo data acquired from a kidney transplant patient, the AC-ACR method with E=4 provides an AUC value of 0.799. These in vivo results indicate that the AC-ACR method can provide more robust flow estimates than the AR and ED methods with a small ensemble size.