• Title/Summary/Keyword: Predictive Accuracy

The Use of Confidence Interval of Measures of Diagnostic Accuracy (진단검사 정확도 평가지표의 신뢰구간)

  • Oh, Tae-Ho;Pak, Son-Il
    • Journal of Veterinary Clinics / v.32 no.4 / pp.319-323 / 2015
  • Diagnostic test accuracy is usually summarized by statistics such as sensitivity, specificity, predictive value, likelihood ratio, and kappa. These indices are most commonly presented when evaluations of competing diagnostic tests are reported, and comparing the accuracies of diagnostic tests is essential for deciding on the best available test for a given medical disorder. However, specific point values of these indices are merely estimates. If parameter estimates are reported without a measure of uncertainty (precision), knowledgeable readers cannot know the range within which the true values of the indices are likely to lie. Therefore, when evaluations of diagnostic accuracy are reported, the precision of the estimates should be stated alongside them. To reflect the precision of any estimate of a diagnostic performance characteristic, or of the difference between performance characteristics, the confidence interval (CI), an indicator of precision, is widely used in the medical literature because CIs are more informative for interpreting test results than simple point estimates. Most peer-reviewed journals require CIs to be specified for descriptive estimates, whereas domestic veterinary journals seem less vigilant on this issue. This paper describes how to calculate the indices and associated CIs using practical examples when assessing diagnostic test performance.
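
As an illustration of reporting an interval alongside each point estimate, the following is a minimal Python sketch. It assumes the Wilson score interval, which is one common choice for a binomial proportion; the abstract does not say which interval method the paper uses. The function names and the 2x2 counts are hypothetical.

```python
import math

def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

def diagnostic_summary(tp, fp, fn, tn):
    """Point estimate and Wilson interval for four accuracy indices."""
    pairs = {
        "sensitivity": (tp, tp + fn),  # diseased cases correctly detected
        "specificity": (tn, tn + fp),  # healthy cases correctly ruled out
        "ppv": (tp, tp + fp),          # positive predictive value
        "npv": (tn, tn + fn),          # negative predictive value
    }
    return {name: (k / n, wilson_ci(k, n)) for name, (k, n) in pairs.items()}

# Hypothetical counts, chosen only to make the example concrete.
summary = diagnostic_summary(tp=90, fp=5, fn=10, tn=95)
for name, (estimate, (lo, hi)) in summary.items():
    print(f"{name}: {estimate:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

With small samples the interval widens noticeably, which is the information a bare point estimate hides.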

A Study on the Comparison of Predictive Models of Cardiovascular Disease Incidence Based on Machine Learning

  • Ji Woo SEOK;Won ro LEE;Min Soo KANG
    • Korean Journal of Artificial Intelligence / v.11 no.1 / pp.1-7 / 2023
  • This paper compares prediction models of cardiovascular disease occurrence. Cardiovascular disease is the No. 1 cause of death worldwide, accounting for one third of all deaths, and the No. 2 cause of death in Korea. Primary prevention is the most important factor in preventing cardiovascular diseases before they occur. Early diagnosis and treatment are also important, as they play a role in reducing mortality and morbidity. In an experiment using Azure ML, Logistic Regression showed 88.6% accuracy, Decision Tree showed 86.4% accuracy, and Support Vector Machine (SVM) showed 83.7% accuracy. In addition to accuracy, the AUC values of the ROC curves were 94.5%, 93%, and 92.4%, indicating that the machine learning models perform adequately; among them, the logistic regression model gave the most accurate results. The visual comparison of the algorithms presented in this paper can serve as an objective aid for diagnosis and guide the direction of diagnoses made by doctors in actual medical practice.
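
The two metrics reported above, accuracy and AUC, measure different things: accuracy depends on a decision threshold, while AUC depends only on how well the scores rank positives above negatives. A minimal Python sketch follows. It does not reproduce the paper's Azure ML experiment; the function names and the toy labels and scores are assumptions made for illustration.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases whose thresholded score matches the true label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of AUC).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data, not taken from the paper.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print("accuracy:", accuracy(labels, scores))
print("AUC:", roc_auc(labels, scores))
```

The pairwise AUC computation is quadratic in the number of cases, which is fine for a sketch but not for large datasets.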

Using Machine Learning Technique for Analytical Customer Loyalty

  • Mohamed M. Abbassy
    • International Journal of Computer Science & Network Security / v.23 no.8 / pp.190-198 / 2023
  • To raise customer satisfaction and thereby profits, an e-commerce business can maintain continuous relationships with existing customers while acquiring new ones. Machine-learning models can analyse customers' behavioural evidence to give the e-commerce platform a competitive advantage by helping to improve overall satisfaction. These models forecast which customers will churn and the causes of churn. The forecasts are used to build tailored business strategies and service offers. This work aims to develop a machine-learning model that can accurately forecast retainable customers across the entire e-commerce customer data. Developing predictive models that classify imbalanced data effectively is a major challenge, both in the collected data and for machine learning algorithms. The work therefore builds a machine learning model that addresses class imbalance and forecasts customers. Satisfaction accuracy is used as the evaluation metric. The paper evaluates different machine learning models used to forecast satisfaction. Three analytical methods drawn from different classes of learning are selected. In classifier selection, the efficiency of classifiers such as Random Forest, Logistic Regression, SVM, and Gradient Boosting Algorithm is compared. The models were applied to a dataset of 8000 records from e-commerce websites and apps. Results indicate that the gradient-boosting classifier determines the satisfaction class with the best accuracy, achieving the maximum accuracy among the algorithms compared (Gradient Boosting, Support Vector Machine, Random Forest, and logistic regression). The best model developed in this paper for forecasting customer satisfaction achieves 88% accuracy.

Enhancing Heart Disease Prediction Accuracy through Soft Voting Ensemble Techniques

  • Byung-Joo Kim
    • International Journal of Internet, Broadcasting and Communication / v.16 no.3 / pp.290-297 / 2024
  • We investigate the efficacy of ensemble learning methods, specifically the soft voting technique, for enhancing heart disease prediction accuracy. Our study uniquely combines Logistic Regression, SVM with RBF Kernel, and Random Forest models in a soft voting ensemble to improve predictive performance. We demonstrate that this approach outperforms individual models in diagnosing heart disease. Our research contributes to the field by applying a well-curated dataset with normalization and optimization techniques, conducting a comprehensive comparative analysis of different machine learning models, and showcasing the superior performance of the soft voting ensemble in medical diagnosis. This multifaceted approach allows us to provide a thorough evaluation of the soft voting ensemble's effectiveness in the context of heart disease prediction. We evaluate our models based on accuracy, precision, recall, F1 score, and Area Under the ROC Curve (AUC). Our results indicate that the soft voting ensemble technique achieves higher accuracy and robustness in heart disease prediction compared to individual classifiers. This study advances the application of machine learning in medical diagnostics, offering a novel approach to improve heart disease prediction. Our findings have significant implications for early detection and management of heart disease, potentially contributing to better patient outcomes and more efficient healthcare resource allocation.
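
Soft voting averages the class probabilities predicted by each model and picks the class with the highest average, instead of counting each model's hard label. The Python sketch below shows only that combination step. It is not the authors' pipeline: the `soft_vote` function and the probability values are hypothetical, standing in for the outputs of the three classifiers named in the abstract.

```python
def soft_vote(prob_sets, weights=None):
    """Combine per-model class probabilities by (weighted) averaging.

    prob_sets[m][i][c] is model m's probability that sample i belongs to class c.
    Returns the averaged probabilities and the predicted class per sample.
    """
    if weights is None:
        weights = [1.0] * len(prob_sets)
    total = sum(weights)
    n_samples = len(prob_sets[0])
    averaged = []
    for i in range(n_samples):
        n_classes = len(prob_sets[0][i])
        averaged.append([
            sum(w * probs[i][c] for w, probs in zip(weights, prob_sets)) / total
            for c in range(n_classes)
        ])
    labels = [max(range(len(row)), key=row.__getitem__) for row in averaged]
    return averaged, labels

# Hypothetical outputs for two patients from three models
# (columns: P(no disease), P(disease)).
logistic = [[0.90, 0.10], [0.30, 0.70]]
svm_rbf = [[0.45, 0.55], [0.20, 0.80]]
forest = [[0.45, 0.55], [0.60, 0.40]]

averaged, labels = soft_vote([logistic, svm_rbf, forest])
print(labels)
```

For the first patient two of the three models lean slightly toward class 1, so a hard majority vote would pick class 1, but the one confident model pulls the averaged probability toward class 0. That sensitivity to confidence is the practical difference between soft and hard voting.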

A Fuzzy-ARTMAP Equalizer for Compensating the Nonlinearity of Satellite Communication Channel

  • Lee, Jung-Sik
    • The Journal of Korean Institute of Communications and Information Sciences / v.26 no.8B / pp.1078-1084 / 2001
  • In this paper, a fuzzy-ARTMAP neural network is applied to compensate for the nonlinearity of a satellite communication channel. Fuzzy ARTMAP combines fuzzy logic with the ART neural network. Through a match tracking process with a vigilance parameter, the fuzzy-ARTMAP neural network achieves a minimax learning rule that minimizes predictive error and maximizes generalization. Thus, the system automatically learns a minimal number of recognition categories, or hidden units, to meet accuracy criteria. Simulation studies are performed over nonlinear satellite channels. The performance of the proposed fuzzy-ARTMAP equalizer is compared with that of MLP-based equalizers.

Diagnostic Accuracy of Cervicovaginal Cytology in the Detection of Squamous Epithelial Lesions of the Uterine Cervix; Cytologic/Histologic Correlation of 481 Cases (자궁경부 편평상피병변에서 자궁경부질도말 세포검사의 진단정확도 : 481예의 세포-조직 상관관계)

  • Jin, So-Young;Park, Sang-Mo;Kim, Mee-Sun;Jeen, Yoon-Mi;Kim, Dong-Won;Lee, Dong-Wha
    • The Korean Journal of Cytopathology / v.19 no.2 / pp.111-118 / 2008
  • Background: Cervicovaginal cytology is a screening test for uterine cervical cancer. The sensitivity of cervicovaginal cytology is less than 50%, but studies of cytologic/histologic correlation are limited. We analyzed the diagnostic accuracy of cervicovaginal cytology in the detection of squamous epithelial lesions of the uterine cervix and investigated the causes of diagnostic discordance. Materials and Methods: We collected a total of 481 sets of cervicovaginal cytology and biopsies over 5 years. The cytologic diagnoses were categorized based on The Bethesda System, and the histologic diagnoses were classified as negative, flat condyloma, cervical intraepithelial neoplasia (CIN) I, CIN II, CIN III, or squamous cell carcinoma. Cytohistologic discrepancies were reviewed. Results: The concordance rate between the cytological and the histological diagnosis was 79.0%. The sensitivity and specificity of cervicovaginal cytology were 80.6% and 92.6%, respectively. Its positive predictive value and negative predictive value were 93.7% and 77.7%, respectively. The false negative rate was 19.4%. The 54 false negative cytology cases were confirmed by histology as 50 flat condylomas, 2 CIN I, 1 CIN III, and 1 squamous cell carcinoma. The causes of false negative cytology were sampling errors in 75.6% and interpretation errors in 24.4%. The false positive rate was 7.4%. The 15 false positive cytology cases were confirmed by histology as 12 atypical squamous cells of undetermined significance (ASCUS) and 3 low grade squamous intraepithelial lesions (LSIL). The cause of error was interpretation error in all cases. The overall diagnostic accuracy of cervicovaginal cytology was 85.7%. Conclusions: Cervicovaginal cytology shows high overall diagnostic accuracy and is a useful primary screen for uterine cervical cancer.
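
The reported indices can be cross-checked with a short Python calculation. The 2x2 counts below are not stated in the abstract: they are inferred here from the 54 false negatives, the 15 false positives, the reported rates, and the total of 481 cases, so they should be read as a reconstruction rather than the authors' table.

```python
# Reconstructed 2x2 table (an inference from the abstract, not quoted from it).
tp, fn, fp, tn = 224, 54, 15, 188

def pct(numerator, denominator):
    """Percentage rounded to one decimal place, as in the abstract."""
    return round(100 * numerator / denominator, 1)

measures = {
    "sensitivity": pct(tp, tp + fn),
    "specificity": pct(tn, tn + fp),
    "ppv": pct(tp, tp + fp),
    "npv": pct(tn, tn + fn),
    "false_negative_rate": pct(fn, tp + fn),
    "false_positive_rate": pct(fp, tn + fp),
    "accuracy": pct(tp + tn, tp + fn + fp + tn),
}

for name, value in measures.items():
    print(f"{name}: {value}%")
```

Under these assumed counts every computed value matches the figure reported in the Results, and the four cells sum to 481.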

Diagnostic Accuracy of Percutaneous Transthoracic Needle Lung Biopsies: A Multicenter Study

  • Kyung Hee Lee;Kun Young Lim;Young Joo Suh;Jin Hur;Dae Hee Han;Mi-Jin Kang;Ji Yung Choo;Cherry Kim;Jung Im Kim;Soon Ho Yoon;Woojoo Lee;Chang Min Park
    • Korean Journal of Radiology / v.20 no.8 / pp.1300-1310 / 2019
  • Objective: To measure the diagnostic accuracy of percutaneous transthoracic needle lung biopsies (PTNBs) on the basis of the intention-to-diagnose principle and identify risk factors for diagnostic failure of PTNBs in a multi-institutional setting. Materials and Methods: A total of 9384 initial PTNBs performed in 9239 patients (mean patient age, 65 years [range, 20-99 years]) from January 2010 to December 2014 were included. The accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of PTNBs for diagnosis of malignancy were measured. The proportion of diagnostic failures was measured, and their risk factors were identified. Results: The overall accuracy, sensitivity, specificity, PPV, and NPV were 91.1% (95% confidence interval [CI], 90.6-91.7%), 92.5% (95% CI, 91.9-93.1%), 86.5% (95% CI, 85.0-87.9%), 99.2% (95% CI, 99.0-99.4%), and 84.3% (95% CI, 82.7-85.8%), respectively. The proportion of diagnostic failures was 8.9% (831 of 9384; 95% CI, 8.3-9.4%). The independent risk factors for diagnostic failures were lesions ≤ 1 cm in size (adjusted odds ratio [AOR], 1.86; 95% CI, 1.23-2.81), lesion size 1.1-2 cm (1.75; 1.45-2.11), subsolid lesions (1.81; 1.32-2.49), use of fine needle aspiration only (2.43; 1.80-3.28), final diagnosis of benign lesions (2.18; 1.84-2.58), and final diagnosis of lymphomas (10.66; 6.21-18.30). Use of cone-beam CT (AOR, 0.31; 95% CI, 0.13-0.75) and conventional CT-guidance (0.55; 0.32-0.94) reduced diagnostic failures. Conclusion: The accuracy of PTNB for diagnosis of malignancy was fairly high in our large-scale multi-institutional cohort. The identified risk factors for diagnostic failure may help reduce diagnostic failure and interpret the biopsy results.

The Prediction of Export Credit Guarantee Accident using Machine Learning (기계학습을 이용한 수출신용보증 사고예측)

  • Cho, Jaeyoung;Joo, Jihwan;Han, Ingoo
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.83-102 / 2021
  • The government recently announced various policies for developing the big-data and artificial intelligence fields, giving the public a great opportunity through the disclosure of high-quality data held by public institutions. KSURE (Korea Trade Insurance Corporation) is a major public institution for financial policy in Korea and is strongly committed to backing export companies through various systems. Nevertheless, there are still few cases of business models realized on the basis of big-data analyses. In this situation, this paper aims to develop a new business model that can be applied to ex-ante prediction of the likelihood of a credit guarantee insurance accident. We utilize internal data from KSURE, which supports export companies in Korea, and apply machine learning models. Then, we compare the performance of predictive models including Logistic Regression, Random Forest, XGBoost, LightGBM, and DNN (Deep Neural Network). For decades, many researchers have tried to find better models for predicting bankruptcy, since ex-ante prediction is crucial for corporate managers, investors, creditors, and other stakeholders. The prediction of financial distress or bankruptcy can be traced back to Smith (1930), Fitzpatrick (1932), or Merwin (1942). One of the most famous models is Altman's Z-score model (Altman, 1968), which is based on multiple discriminant analysis and is still widely used in both research and practice. Altman proposes a score model that uses five key financial ratios to predict the probability of bankruptcy in the next two years. Ohlson (1980) introduces a logit model to address some limitations of previous models. Furthermore, Elmer and Borowski (1988) develop and examine a rule-based, automated system that conducts the financial analysis of savings and loans.
Since the 1980s, researchers in Korea have examined the prediction of financial distress or bankruptcy. Kim (1987) analyzes financial ratios and develops a prediction model. Han et al. (1995, 1996, 1997, 2003, 2005, 2006) construct prediction models using various techniques, including artificial neural networks. Yang (1996) introduces multiple discriminant analysis and a logit model. Kim and Kim (2001) use artificial neural network techniques for ex-ante prediction of insolvent enterprises. Since then, many scholars have tried to predict financial distress or bankruptcy more precisely using diverse models such as Random Forest or SVM. One major distinction of our research from previous research is that we examine the predicted probability of default for each sample case, rather than only the classification accuracy of each model over the entire sample. Most predictive models in this paper show a classification accuracy of about 70% on the entire sample. Specifically, the LightGBM model shows the highest accuracy of 71.1% and the Logit model the lowest accuracy of 69%. However, these results are open to multiple interpretations. In the business context, more emphasis has to be placed on minimizing type 2 errors, which cause more harmful operating losses for the guaranty company. Thus, we also compare classification accuracy after splitting the predicted probability of default into ten equal intervals. When we examine the classification accuracy for each interval, the Logit model has the highest accuracy of 100% for the 0~10% interval of predicted probability of default but a relatively lower accuracy of 61.5% for the 90~100% interval.
On the other hand, Random Forest, XGBoost, LightGBM, and DNN give more desirable results, since they show higher accuracy for both the 0~10% and 90~100% intervals of predicted probability of default but lower accuracy around a predicted probability of 50%. As for the distribution of samples across predicted probabilities of default, both LightGBM and XGBoost place a relatively large number of samples in the 0~10% and 90~100% intervals. Although the Random Forest model has an advantage in classification accuracy with a small number of cases, LightGBM or XGBoost could be a more desirable model since they classify a large number of cases into the two extreme intervals of predicted probability of default, even allowing for their relatively low classification accuracy. Considering the importance of type 2 error and total prediction accuracy, XGBoost and DNN show superior performance. Random Forest and LightGBM come next with good results, while logistic regression shows the worst performance. However, each predictive model has a comparative advantage under different evaluation standards. For instance, the Random Forest model shows almost 100% accuracy for samples expected to have a high probability of default. Collectively, more comprehensive ensemble models can be constructed that combine multiple machine learning classifiers and use majority voting to maximize overall performance.
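
The interval-wise comparison described in this abstract, splitting the predicted probability of default into ten equal intervals and measuring accuracy within each, can be sketched in Python as follows. The function name, the 0.5 decision threshold, and the toy data are assumptions; the abstract does not give the authors' exact procedure or data.

```python
def accuracy_by_decile(y_true, p_default, threshold=0.5):
    """Sample count and classification accuracy per 10%-wide probability interval.

    Returns a list of ten (count, accuracy) pairs; accuracy is None for an
    empty interval. A probability of exactly 1.0 falls in the last interval.
    """
    counts = [0] * 10
    correct = [0] * 10
    for y, p in zip(y_true, p_default):
        idx = min(int(p * 10), 9)
        predicted = 1 if p >= threshold else 0
        counts[idx] += 1
        correct[idx] += int(predicted == y)
    return [(n, c / n if n else None) for n, c in zip(counts, correct)]

# Toy data, not from the paper: 1 = accident (default), 0 = no accident.
y_true = [0, 0, 1, 0, 1, 0]
p_default = [0.05, 0.08, 0.45, 0.55, 0.95, 0.97]

for i, (n, acc) in enumerate(accuracy_by_decile(y_true, p_default)):
    print(f"{i * 10:>3}~{(i + 1) * 10}%: n={n}, accuracy={acc}")
```

Reporting the count next to the accuracy matters for the trade-off the abstract describes: a model may be very accurate in the extreme intervals while placing few cases there.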

A Framework for Managing Approximation Models in place of Expensive Simulations in Optimization (최적화에서의 근사모델 관리기법의 활용)

  • 양영순;장범선;연윤석
    • Proceedings of the Computational Structural Engineering Institute Conference / 2000.04b / pp.159-167 / 2000
  • In optimization problems, computationally intensive or expensive simulations hinder the use of standard optimization techniques because they are too costly to run at each iteration of the optimization algorithm. Therefore, those expensive simulations are often replaced with approximation models that can be evaluated at almost no cost. However, because of the limited accuracy of the approximation models, it is practically impossible to find an exact optimal point of the original problem. Significant efforts have been made to overcome this problem. The approximation models are sequentially updated during the iterative optimization process so that interesting design points are included. These interesting points strongly influence how well the approximation model captures the overall trend of the original function or improve the accuracy of the approximation in the vicinity of a minimizer. They are determined successively at each iteration by exploiting the predictive ability of the approximation model. This paper focuses on those approaches and introduces various approximation methods.
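
The sequential-update idea can be illustrated with a deliberately small one-dimensional Python sketch: fit a cheap approximation to the samples gathered so far, use its minimizer to choose the next design point, run the expensive function only there, and refit. The quadratic surrogate, the stopping rule, and the stand-in simulation are all assumptions made for illustration, not the framework proposed in the paper.

```python
def expensive_simulation(x):
    """Stand-in for a costly simulation (hypothetical, intentionally simple)."""
    return (x - 2.0) ** 2 + 1.0

def fit_quadratic(points):
    """Coefficients (a, b, c) of the parabola through three (x, y) points."""
    (x0, y0), (x1, y1), (x2, y2) = points
    d1 = (y1 - y0) / (x1 - x0)
    d2 = (y2 - y1) / (x2 - x1)
    a = (d2 - d1) / (x2 - x0)
    b = d1 - a * (x0 + x1)
    c = y0 - a * x0 ** 2 - b * x0
    return a, b, c

def optimize_with_surrogate(f, initial_xs, bounds, max_iter=10, tol=1e-6):
    """Refit the surrogate each iteration and sample f at its minimizer."""
    samples = [(x, f(x)) for x in initial_xs]
    for _ in range(max_iter):
        best_three = sorted(samples, key=lambda s: s[1])[:3]
        a, b, _ = fit_quadratic(best_three)
        if a <= 0:
            break  # the surrogate has no minimum to propose
        candidate = min(max(-b / (2 * a), bounds[0]), bounds[1])
        if any(abs(candidate - x) < tol for x, _ in samples):
            break  # proposed point already sampled: stop
        samples.append((candidate, f(candidate)))
    best = min(samples, key=lambda s: s[1])
    return best, len(samples)

best, n_evals = optimize_with_surrogate(
    expensive_simulation, initial_xs=[0.0, 1.0, 4.0], bounds=(0.0, 5.0)
)
print(f"best x={best[0]:.3f}, f={best[1]:.3f}, expensive evaluations={n_evals}")
```

Because the stand-in function is itself quadratic, the surrogate is exact after the first fit and the loop stops after four expensive evaluations. A realistic simulation would need more iterations and a more flexible surrogate.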

A Highly Efficient Aeroelastic Optimization Method Based on a Surrogate Model

  • Zhiqiang, Wan;Xiaozhe, Wang;Chao, Yang
    • International Journal of Aeronautical and Space Sciences / v.17 no.4 / pp.491-500 / 2016
  • This paper presents a highly efficient aeroelastic optimization method based on a surrogate model; the model is verified by considering the case of a high-aspect-ratio composite wing. Optimization frameworks using the Kriging model and genetic algorithm (GA), the Kriging model and improved particle swarm optimization (IPSO), and the back propagation neural network model (BP) and IPSO are presented. The feasibility of the method is verified, as the model can improve the optimization efficiency while also satisfying the engineering requirements. Moreover, the effects of the number of design variables and number of constraints on the optimization efficiency and objective function are analysed in detail. The accuracy of two surrogate models in aeroelastic optimization is also compared. The Kriging model is constructed more conveniently, and its predictive accuracy of the aeroelastic responses also satisfies the engineering requirements. According to the case of a high-aspect-ratio composite wing, the GA is better at global optimization.