• Title/Summary/Keyword: AUC-optimization

L1-penalized AUC-optimization with a surrogate loss

  • Hyungwoo Kim; Seung Jun Shin
    • Communications for Statistical Applications and Methods / v.31 no.2 / pp.203-212 / 2024
  • The area under the ROC curve (AUC) is one of the most common criteria for measuring the overall performance of binary classifiers across a wide range of machine learning problems. In this article, we propose an L1-penalized AUC-optimization classifier that directly maximizes the AUC for high-dimensional data. To this end, we employ an AUC-consistent surrogate loss function and combine it with the L1-norm penalty, which enables us to estimate coefficients and select informative variables simultaneously. In addition, we develop an efficient optimization algorithm that adopts k-means clustering and proximal gradient descent, which offers computational advantages in obtaining solutions for the proposed method. Numerical simulation studies demonstrate that the proposed method shows promising performance in terms of prediction accuracy, variable selection, and computational cost.
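
As a rough illustration of the kind of procedure this abstract describes, the sketch below minimizes an L1-penalized pairwise surrogate of the AUC with proximal gradient descent. The pairwise logistic loss, the step size, the penalty weight `lam`, and all function names are assumptions made for illustration; the abstract does not name the surrogate it uses, and the k-means step it mentions is omitted here.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t * |v| (soft-thresholding)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def pairwise_logistic_grad(w, pos, neg):
    """Gradient of the mean of log(1 + exp(-w.(xp - xn))) over all (pos, neg) pairs."""
    grad = [0.0] * len(w)
    for xp in pos:
        for xn in neg:
            diff = [a - b for a, b in zip(xp, xn)]
            margin = sum(wj * dj for wj, dj in zip(w, diff))
            # d/dm log(1 + exp(-m)) = -1 / (1 + exp(m)); guard against overflow
            if margin > 35.0:
                coef = -math.exp(-margin)
            else:
                coef = -1.0 / (1.0 + math.exp(margin))
            for j, dj in enumerate(diff):
                grad[j] += coef * dj
    n_pairs = len(pos) * len(neg)
    return [g / n_pairs for g in grad]


def fit_l1_auc(pos, neg, lam=0.05, step=0.1, n_iter=300):
    """Proximal gradient descent: gradient step on the surrogate, then soft-threshold."""
    w = [0.0] * len(pos[0])
    for _ in range(n_iter):
        grad = pairwise_logistic_grad(w, pos, neg)
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad)]
    return w


def empirical_auc(w, pos, neg):
    """Fraction of (pos, neg) pairs ranked correctly by the linear score; ties count 0.5."""
    def score(x):
        return sum(wj * xj for wj, xj in zip(w, x))

    wins = 0.0
    for xp in pos:
        for xn in neg:
            sp, sn = score(xp), score(xn)
            wins += 1.0 if sp > sn else 0.5 if sp == sn else 0.0
    return wins / (len(pos) * len(neg))
```

Calling `fit_l1_auc(pos, neg)` on two lists of feature vectors returns a coefficient vector; coefficients that the soft-thresholding step drives exactly to zero correspond to variables dropped by the L1 penalty.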

Use of Artificial Bee Swarm Optimization (ABSO) for Feature Selection in System Diagnosis for Coronary Heart Disease

  • Wiharto; Yaumi A. Z. A. Fajri; Esti Suryani; Sigit Setyawan
    • Journal of information and communication convergence engineering / v.21 no.2 / pp.130-138 / 2023
  • The selection of the correct examination variables for diagnosing heart disease provides many benefits, including faster diagnosis and lower cost of examination. The selection of inspection variables can be performed by referring to the data of previous examination results so that future investigations can be carried out by referring to these selected variables. This paper proposes a model for selecting examination variables using an Artificial Bee Swarm Optimization method by considering the variables of accuracy and cost of inspection. The proposed feature selection model was evaluated using the performance parameters of accuracy, area under curve (AUC), number of variables, and inspection cost. The test results show that the proposed model can produce 24 examination variables and provide 95.16% accuracy and 97.61% AUC. These results indicate a significant decrease in the number of inspection variables and inspection costs while maintaining performance in the excellent category.

Parameter estimation of linear function using VUS and HUM maximization (VUS와 HUM 최적화를 이용한 선형함수의 모수추정)

  • Hong, Chong Sun; Won, Chi Hwan; Jeong, Dong Gil
    • Journal of the Korean Data and Information Science Society / v.26 no.6 / pp.1305-1315 / 2015
  • Consider a risk score that is a function of a linear score in classification models. The AUC optimization method can be applied to estimate the coefficients of the linear score. The estimates obtained by this AUC approach are shown to be better than the maximum likelihood estimators from logistic models in general situations where the logistic assumptions do not hold. In this work, VUS and HUM approaches are proposed by extending the AUC approach to more realistic discrimination and prediction settings with multiple categories. Simulation results are obtained under various threshold distributions and three link functions: the logit, complementary log-log, and modified logit functions. The coefficient estimation results obtained with the VUS and HUM approaches for multi-category classification are found to be equivalent to or better than those obtained with logistic models using these link functions.
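
For readers unfamiliar with the VUS, the sketch below computes the empirical volume under the ROC surface for three ordered classes as the fraction of cross-class triples whose scores are correctly ordered. It only evaluates the criterion for given scores; it does not perform the coefficient estimation described in the abstract, and the function name is hypothetical.

```python
def empirical_vus(scores_1, scores_2, scores_3):
    """Empirical three-class VUS: share of triples (a, b, c), one score from
    each class in order, that satisfy a < b < c."""
    if not (scores_1 and scores_2 and scores_3):
        raise ValueError("each class needs at least one score")
    ordered = 0
    for a in scores_1:
        for b in scores_2:
            for c in scores_3:
                if a < b < c:
                    ordered += 1
    return ordered / (len(scores_1) * len(scores_2) * len(scores_3))
```

With two classes the same counting idea reduces to the empirical AUC, which is why the VUS is usually presented as its three-class extension. A random score gives an expected VUS of 1/6, since only one of the six possible orderings of a triple is correct.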

Optimization of In Vivo Stickiness Evaluation for Cosmetic Creams Using Texture Analyzer (Texture Analyzer (TA)를 이용한 화장품 크림의 In Vivo 끈적임 평가법의 최적화)

  • Ryoo, Joo-Yeon; Bae, Jung-Eun; Kang, Nae-Gyu
    • Journal of the Society of Cosmetic Scientists of Korea / v.46 no.4 / pp.371-382 / 2020
  • There have been continuous attempts to quantify the sensory attributes of cosmetic products by measuring relevant physical properties. The most representative method for evaluating stickiness is to measure axial force using a texture analyzer. Stickiness is known to correlate with the AUC, the area under the curve of the measured axial force as a function of time. Recently, a research group at Normandie University developed an in vivo stickiness evaluation method that considers the characteristics of skin in addition to the established evaluation method[8]. Based on that study, we sought to optimize the in vivo stickiness evaluation method specifically for cosmetic creams. The experiment was carried out on 5 different facial cream products by varying the amount of cream, the number of times it was rolled, and the shape and material of the probes. Based on the results of the sensory evaluation, the most consistent conditions were established as the optimal evaluation method. As a result, applying 70 μL of cream and rubbing 10 times for 7 s inside a 3.4 cm circle was judged to be suitable. As for the probes, a spherical metallic probe was more appropriate because of its reproducibility. We applied the established method to 10 subjects to check its validity. Although the absolute AUC values differed between individuals, the AUC values were ranked in the same order for all subjects. Finally, to standardize stickiness in terms of AUC, polyvinylpyrrolidone (PVP) was set as a reference material, and we measured the AUC of its aqueous solution at varying concentrations. Then, the degree of perceived stickiness for the 5 different creams was surveyed to check the correlation between AUC and stickiness.
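
The AUC used here is an integral of axial force over time, not a ROC quantity. A minimal sketch of how such an area could be computed from sampled measurements is given below; the trapezoidal rule and the function name are assumptions, since the abstract does not state how the texture analyzer output was integrated.

```python
def force_time_auc(times, forces):
    """Area under a sampled force-time curve using the trapezoidal rule.

    times  -- sample times in seconds, strictly increasing
    forces -- axial force readings at those times
    """
    if len(times) != len(forces) or len(times) < 2:
        raise ValueError("need at least two matching (time, force) samples")
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        area += 0.5 * (forces[i] + forces[i - 1]) * dt
    return area
```

The result carries the units of force multiplied by time, so values are only comparable between runs recorded with the same probe and protocol.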

Development of benthic macroinvertebrate species distribution models using the Bayesian optimization (베이지안 최적화를 통한 저서성 대형무척추동물 종분포모델 개발)

  • Go, ByeongGeon; Shin, Jihoon; Cha, Yoonkyung
    • Journal of Korean Society of Water and Wastewater / v.35 no.4 / pp.259-275 / 2021
  • This study explored the usefulness and implications of Bayesian hyperparameter optimization in developing species distribution models (SDMs). A variety of machine learning (ML) algorithms, namely support vector machine (SVM), random forest (RF), boosted regression tree (BRT), XGBoost (XGB), and multilayer perceptron (MLP), were used to predict the occurrence of four benthic macroinvertebrate species. The Bayesian optimization method successfully tuned the model hyperparameters, with all ML models achieving an area under the curve (AUC) > 0.7. Also, hyperparameter search ranges that generally clustered around the optimal values suggest the efficiency of Bayesian optimization in finding optimal sets of hyperparameters. Tree-based ensemble algorithms (BRT, RF, and XGB) tended to perform better than SVM and MLP. Important hyperparameters and optimal values differed by species and ML model, indicating the necessity of hyperparameter tuning for improving individual model performance. The optimization results demonstrate that, for all macroinvertebrate species, SVM and RF required fewer trials to reach optimal hyperparameter sets, leading to reduced computational cost compared with the other ML algorithms. The results of this study suggest that Bayesian optimization is an efficient method for hyperparameter optimization of machine learning algorithms.

Optimization of Classifier Performance at Local Operating Range: A Case Study in Fraud Detection

  • Park Lae-Jeong; Moon Jung-Ho
    • International Journal of Fuzzy Logic and Intelligent Systems / v.5 no.3 / pp.263-267 / 2005
  • Building classifiers for real-world financial classification problems is often plagued by severely overlapping and highly skewed class distributions. Performance measures such as the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) have recently been introduced for evaluating and building classifiers for these kinds of problems. They are, however, ineffective for evaluating a classifier's discrimination performance in a particular class of classification problems in which interest lies only in a local operating range of the classifier. In this paper, a new method is proposed that enables us to directly improve a classifier's discrimination performance at a desired local operating range by defining and optimizing a partial area under the ROC curve or a domain-specific curve, which is difficult to achieve with conventional learning methods based on classification accuracy. The effectiveness of the proposed approach is demonstrated in terms of fraud detection capability on a real-world fraud detection problem, in comparison with the MSE-based approach.
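
To make the notion of a partial area under the ROC curve concrete, the sketch below integrates the empirical ROC curve over a chosen false-positive-rate interval. It only evaluates the quantity; the optimization procedure proposed in the paper is not reproduced. The function names and the default range are assumptions chosen for illustration.

```python
def roc_points(scores, labels):
    """Empirical ROC points (fpr, tpr), sweeping the threshold from high to low.
    Labels are 1 for positives and 0 for negatives; tied scores move together."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative")
    order = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        s = order[i][0]
        while i < len(order) and order[i][0] == s:
            if order[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def partial_auc(scores, labels, fpr_lo=0.0, fpr_hi=0.1):
    """Area under the ROC curve restricted to fpr in [fpr_lo, fpr_hi]."""
    pts = roc_points(scores, labels)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        lo, hi = max(x0, fpr_lo), min(x1, fpr_hi)
        if hi <= lo:
            continue  # segment is vertical or lies outside the range
        slope = (y1 - y0) / (x1 - x0)
        y_lo = y0 + slope * (lo - x0)
        y_hi = y0 + slope * (hi - x0)
        area += 0.5 * (y_lo + y_hi) * (hi - lo)
    return area
```

Setting `fpr_lo=0.0` and `fpr_hi=1.0` recovers the ordinary AUC, while a narrow interval such as the default focuses the measure on the low false-alarm region that a fraud detection system would typically operate in.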

Exploring the Feature Selection Method for Effective Opinion Mining: Emphasis on Particle Swarm Optimization Algorithms

  • Eo, Kyun Sun; Lee, Kun Chang
    • Journal of the Korea Society of Computer and Information / v.25 no.11 / pp.41-50 / 2020
  • Sentiment analysis begins with the search for words that determine the sentiment inherent in data. Managers can understand market sentiment by analyzing the relevant sentiment words that consumers tend to use. In this study, we explore the performance of feature selection methods embedded with Particle Swarm Optimization Multi-Objective Evolutionary Algorithms. The performance of the feature selection methods was benchmarked with machine learning classifiers such as Decision Tree, Naive Bayesian Network, Support Vector Machine, Random Forest, Bagging, Random Subspace, and Rotation Forest. Our empirical opinion mining results revealed that the number of features was significantly reduced without degrading performance. Specifically, the Support Vector Machine showed the highest accuracy, and Random Subspace produced the best AUC results.

Predicting Forest Fires Using Machine Learning Considering Human Factors (인적요인을 고려한 머신러닝 활용 산림화재 예측)

  • Jin-Myeong Jang; Joo-Chan Kim; Hwa-Joong Kim; Kwang-Tae Kim
    • Journal of Korea Society of Industrial Information Systems / v.28 no.5 / pp.109-126 / 2023
  • Early detection of forest fires is essential in preventing large-scale forest fires. Predicting forest fires serves as a vital early detection method and has led to various related studies. However, many previous studies focused solely on climate and geographic factors, overlooking human factors, which contribute significantly to forest fires. This study aims to develop forest fire prediction models that take into account human, weather, and geographical factors. It conducted a comparative analysis of four machine learning models alongside a logistic regression model, using forest fire data from Gangwon-do spanning 2003 to 2020. The results indicate that the XGBoost model performed best (AUC=0.925), closely followed by Random Forest (AUC=0.920), both of which are machine learning techniques. Lastly, the study analyzed the relative importance of various factors through permutation feature importance analysis to derive operational insights. While meteorological factors showed a greater impact than human factors, various human factors were also found to be significant.
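
The permutation feature importance analysis mentioned in this abstract can be sketched roughly as follows: shuffle one feature column at a time and record how much the model's score drops. Using the AUC as the score, the number of repeats, and every name in the code are assumptions for illustration; the abstract does not say which metric or settings the authors used.

```python
import random


def auc(scores, labels):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for sp in pos:
        for sn in neg:
            wins += 1.0 if sp > sn else 0.5 if sp == sn else 0.0
    return wins / (len(pos) * len(neg))


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in AUC when each feature column is shuffled independently.

    predict -- any callable mapping one feature row to a score (a fitted model)
    X       -- list of feature rows
    y       -- list of 0/1 labels
    """
    rng = random.Random(seed)
    baseline = auc([predict(row) for row in X], y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = []
            for row, value in zip(X, column):
                new_row = list(row)
                new_row[j] = value  # replace only feature j with a shuffled value
                X_perm.append(new_row)
            drops.append(baseline - auc([predict(r) for r in X_perm], y))
        importances.append(sum(drops) / n_repeats)
    return importances
```

A feature the model never uses yields an importance of exactly zero, because shuffling it leaves every prediction unchanged; larger values indicate features whose disruption hurts ranking performance more.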

Pharmaceutical Studies on Ferroglycine Fumarate (II) -Studies on Optimization of Controlled Release Ferroglycine Fumarate Tablets- (푸마르산철글리신 복합체의 약제학적 연구 (제2보) -지속성 푸마르산철글리신 정제의 최적화에 관한 연구-)

  • Shin, Hyun-Jong; Lee, Wan-Ha
    • Journal of Pharmaceutical Investigation / v.17 no.3 / pp.101-110 / 1987
  • To reduce the gastric irritation caused by iron preparations, ferroglycine fumarate (FGF) granules coated with hydroxyethylcellulose were made with a matrix granulator, and the constrained optimization method, employing the Lagrange equation, was successfully applied to the manufacturing process design of controlled release tablets. The effects of stearic acid and dried corn starch on tablet hardness, friability, dissolution rate $t_{50\%}$, and tablet volume were found to be very significant. In the rabbit test, the pharmacokinetic parameters ($K_a$, $C_{max}$, and $AUC^{0-12}$) and urinary excretion rate ($K_e$) of the controlled release FGF tablets were higher than those of controlled release ferroglycine sulfate tablets manufactured under the same optimal conditions. Controlled release FGF tablets were more stable than controlled release ferroglycine sulfate tablets under accelerated storage conditions.

Aviation Convective Index for Deep Convective Area using the Global Unified Model of the Korean Meteorological Administration, Korea: Part 2. Seasonal Optimization and Case Studies (안전한 항공기 운항을 위한 현업 전지구예보모델 기반 깊은 대류 예측 지수: Part 2. 계절별 최적화 및 사례 분석)

  • Yi-June Park; Jung-Hoon Kim
    • Atmosphere / v.33 no.5 / pp.531-548 / 2023
  • We developed the Aviation Convective Index (ACI) for predicting deep convective areas using the operational global Numerical Weather Prediction model of the Korea Meteorological Administration. A seasonally optimized ACI (ACISnOpt) was developed to account for seasonal variability in deep convection in Korea. The yearly optimized ACI (ACIYrOpt) in Part 1 showed that the seasonally averaged values of the Area Under the ROC Curve (AUC) and True Skill Statistics (TSS) decreased by 0.420% and 5.797%, respectively, due to significant degradation in the winter season. In Part 2, we developed a new membership function (MF) and weight combination of input variables in the ACI algorithm, which were optimized for each season. The ACISnOpt showed better performance skill, with significant improvements in AUC and TSS of 0.983% and 25.641%, respectively, compared with those of the ACIYrOpt. To confirm the improvements in the new algorithm, we also conducted two case studies, in winter and spring, with Convectively-Induced Turbulence (CIT) events observed in aircraft data. In these cases, the ACISnOpt predicted the spatial distribution and intensity of deep convection better. Enhancements in the forecast fields from the ACIYrOpt to the ACISnOpt in the selected cases explained well the changes in the overall performance skill of the probability of detection for both "yes" and "no" occurrences of deep convection during the 1-yr data period. These results imply that the ACI forecast should be optimized seasonally to take into account the variability in the background conditions for deep convection in Korea.
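
For reference, the True Skill Statistic reported alongside the AUC can be computed from a 2x2 contingency table as the probability of detection minus the probability of false detection. The sketch below uses that standard definition; the function and argument names are hypothetical, and the abstract does not describe how the authors built their contingency tables.

```python
def true_skill_statistic(hits, misses, false_alarms, correct_negatives):
    """TSS = POD - POFD for a 2x2 forecast contingency table.

    POD  = hits / (hits + misses)                       # detection of "yes" events
    POFD = false_alarms / (false_alarms + correct_negatives)
    """
    if hits + misses == 0 or false_alarms + correct_negatives == 0:
        raise ValueError("both observed 'yes' and observed 'no' events are required")
    pod = hits / (hits + misses)
    pofd = false_alarms / (false_alarms + correct_negatives)
    return pod - pofd
```

The value ranges from -1 to 1, with 0 indicating no skill. Because 1 - POFD is the probability of detecting "no" events, the TSS can equally be written as the sum of the detection probabilities for "yes" and "no" occurrences minus one.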