• Title/Summary/Keyword: ROC AUC

An Experimental Study on AutoEncoder to Detect Botnet Traffic Using NetFlow-Timewindow Scheme: Revisited (넷플로우-타임윈도우 기반 봇넷 검출을 위한 오토엔코더 실험적 재고찰)

  • Koohong Kang
    • Journal of the Korea Institute of Information Security & Cryptology / v.33 no.4 / pp.687-697 / 2023
  • Botnets, whose attack patterns are becoming more sophisticated and diverse, are recognized as one of the most serious cybersecurity threats today. This paper revisits the experimental results of botnet detection using an autoencoder, a semi-supervised deep learning model, on the UGR and CTU-13 data sets. To prepare the input vectors of the autoencoder, we create data points by grouping the NetFlow records into sliding windows based on source IP address and aggregating them to form features. In particular, we discover a simple power law: the number of data points that have a given flow-degree is proportional to the number of NetFlow records aggregated in them. Moreover, we show that our power law fits the real data very well, resulting in correlation coefficients of 97% or higher. We also show that this power law has an impact on the learning of the autoencoder and, as a result, influences the performance of botnet detection. Furthermore, we evaluate the performance of the autoencoder using the area under the Receiver Operating Characteristic (ROC) curve.
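
The windowing step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the record layout `(timestamp, src_ip, n_bytes)`, the function name `window_datapoints`, the 60-second window, the 30-second stride, and the choice of features are all assumptions made for the example. The only idea taken from the abstract is that flows are grouped per source IP inside sliding windows and that the number of flows aggregated into a data point is its flow-degree.

```python
from collections import defaultdict

def window_datapoints(flows, window=60.0, stride=30.0):
    """Group (timestamp, src_ip, n_bytes) records into sliding windows per source IP.

    The record layout, window length, stride, and features are assumptions
    for illustration only.
    """
    flows = sorted(flows)
    if not flows:
        return []
    start, t_end = flows[0][0], flows[-1][0]
    points = []
    while start <= t_end:
        groups = defaultdict(list)
        for ts, src, n_bytes in flows:
            if start <= ts < start + window:
                groups[src].append(n_bytes)
        for src, sizes in sorted(groups.items()):
            points.append({
                "window_start": start,
                "src_ip": src,
                "flow_degree": len(sizes),   # number of flows aggregated
                "total_bytes": sum(sizes),   # one example aggregate feature
            })
        start += stride
    return points

# Hypothetical records, not taken from UGR or CTU-13.
example = [(0, "a", 10), (10, "a", 20), (40, "b", 5), (70, "a", 1)]
for point in window_datapoints(example):
    print(point)
```

Each dictionary corresponds to one data point; in practice more aggregate features would be computed per window before feeding the vectors to an autoencoder.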

Landslide Risk Assessment of Cropland and Man-made Infrastructures using Bayesian Predictive Model (베이지안 예측모델을 활용한 농업 및 인공 인프라의 산사태 재해 위험 평가)

  • Al, Mamun;Jang, Dong-Ho
    • Journal of The Geomorphological Association of Korea / v.27 no.3 / pp.87-103 / 2020
  • The purpose of this study is to evaluate the risk to cropland and man-made infrastructures in a landslide-prone area using a GIS-based method. To achieve this goal, a landslide inventory map was prepared based on aerial photograph analysis as well as field observations. A total of 550 landslides were counted in the entire study area. For model analysis and validation, the extracted landslides were randomly divided into two groups. Landslide causative factors such as slope, aspect, curvature, topographic wetness index, elevation, forest type, forest crown density, geology, land use, soil drainage, and soil texture were used in the analysis. Moreover, to identify the correlation between landslides and causative factors, pixels were divided into several classes and the frequency ratio was extracted. A landslide susceptibility map was constructed using a Bayesian predictive model (BPM) based on all events. In the cross-validation process, the landslide susceptibility map and the observation data were plotted as a receiver operating characteristic (ROC) curve, the area under the curve (AUC) was calculated, and we attempted to extract a success rate curve. The results showed that the BPM produced 85.8% accuracy. We believe that the model is acceptable for the landslide susceptibility analysis of the study area. In addition, for risk assessment, a monetary value (local) and a vulnerability scale were added for each social thematic data layer, and the values were then converted into US dollars considering the time of landslide occurrence. Moreover, the total number of pixels in the study area and the number of predicted landslide-affected pixels were used to build a probability table. To match the affected number, 5,000 landslide pixels were assumed for the final calculation. Based on the result, the estimated total risk was US$35.4 million for cropland and US$39.3 million for man-made infrastructure.
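
The frequency ratio mentioned in this abstract is not defined there. A common definition, assumed here, divides the share of landslide pixels falling in a factor class by the share of all pixels in that class, so values above 1 indicate classes where landslides are over-represented. The function name and the pixel counts below are hypothetical.

```python
def frequency_ratio(class_pixels, landslide_pixels):
    """Frequency ratio per class, using the common definition (an assumption):

    FR = (landslide pixels in class / all landslide pixels)
         / (pixels in class / all pixels)
    """
    total_pixels = sum(class_pixels.values())
    total_landslides = sum(landslide_pixels.values())
    ratios = {}
    for name, n_pixels in class_pixels.items():
        landslide_share = landslide_pixels.get(name, 0) / total_landslides
        area_share = n_pixels / total_pixels
        ratios[name] = landslide_share / area_share
    return ratios

# Hypothetical slope classes and counts, not from the study.
slope_area = {"gentle": 600, "steep": 400}
slope_slides = {"gentle": 20, "steep": 80}
print(frequency_ratio(slope_area, slope_slides))
```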

Nomogram Models for Distinguishing Intraductal Carcinoma of the Prostate From Prostatic Acinar Adenocarcinoma Based on Multiparametric Magnetic Resonance Imaging

  • Ling Yang;Xue-Ming Li;Meng-Ni Zhang;Jin Yao;Bin Song
    • Korean Journal of Radiology / v.24 no.7 / pp.668-680 / 2023
  • Objective: To compare multiparametric magnetic resonance imaging (MRI) features of intraductal carcinoma of the prostate (IDC-P) with those of prostatic acinar adenocarcinoma (PAC) and develop prediction models to distinguish IDC-P from PAC and IDC-P with a high proportion (IDC ≥ 10%, hpIDC-P) from IDC-P with a low proportion (IDC < 10%, lpIDC-P) and PAC. Materials and Methods: One hundred and six patients with hpIDC-P, 105 with lpIDC-P, and 168 with PAC who underwent pretreatment multiparametric MRI between January 2015 and December 2020 were included in this study. Imaging parameters, including invasiveness and metastasis, were evaluated and compared between the PAC and IDC-P groups as well as between the hpIDC-P and lpIDC-P subgroups. Nomograms for distinguishing IDC-P from PAC, and hpIDC-P from lpIDC-P and PAC, were made using multivariable logistic regression analysis. The discrimination performance of the models was assessed using the receiver operating characteristic area under the curve (ROC-AUC) in the sample from which the models were derived, without an independent validation sample. Results: The tumor diameter was larger and invasive and metastatic features were more common in the IDC-P than in the PAC group (P < 0.001). The distribution of extraprostatic extension (EPE) and pelvic lymphadenopathy was even greater, and the apparent diffusion coefficient (ADC) ratio was lower in the hpIDC-P than in the lpIDC-P group (P < 0.05). The ROC-AUCs of the stepwise models based solely on imaging features for distinguishing IDC-P from PAC and hpIDC-P from lpIDC-P and PAC were 0.797 (95% confidence interval, 0.750-0.843) and 0.777 (0.727-0.827), respectively. Conclusion: IDC-P was more likely to be larger, more invasive, and more metastatic, with obviously restricted diffusion. EPE, pelvic lymphadenopathy, and a lower ADC ratio were more likely to occur in hpIDC-P, and were also the most useful variables in both nomograms for predicting IDC-P and hpIDC-P.

Value of the Platelet to Lymphocyte Ratio in the Diagnosis of Ovarian Neoplasms in Adolescents

  • Ozaksit, Gulnur;Tokmak, Aytekin;Kalkan, Hatice;Yesilyurt, Huseyin
    • Asian Pacific Journal of Cancer Prevention / v.16 no.5 / pp.2037-2041 / 2015
  • Background: Relationships between poor prognosis of ovarian malignancies and changes in complete blood count parameters have been proposed previously. In this work, we aimed to evaluate clinicopathologic features in adolescents with adnexal masses and sought to establish any predictive value of the platelet to lymphocyte ratio (PLR) in diagnosis. Materials and Methods: This retrospective study was conducted on 196 adolescent females with adnexal masses. Three groups were constituted with respect to clinical or histopathology results: group 1, non-neoplastic patients (n=65); group 2, neoplastic patients (n=68); and group 3, expectantly managed patients (n=63). The main parameters recorded from the hospital database and patient files were age, body mass index (BMI), chief symptoms, diameter of the mass (DOM), tumor marker levels, complete blood count values (including absolute neutrophil, lymphocyte, and platelet counts, mean platelet volume, platelet distribution width, and plateletcrit), surgical features, and postoperative histopathology results. Results: The expectantly managed patients were younger than the other groups (p=0.007). The mean BMI was higher in the neoplastic group (p=0.016). Preoperative DOM, CA125, mean platelet volume, and PLR were statistically significantly different between the groups (p<0.05). ROC curve analysis demonstrated that increased PLR (AUC, 0.609; p=0.011) and BMI (AUC, 0.611; p=0.011) may be discriminative factors in predicting ovarian neoplasms in adolescents preoperatively. When the cut-off point for the PLR level was set to 140, the sensitivity and specificity were found to be 65.7% and 57.6%, respectively. Conclusions: We suggest that, besides a careful preoperative evaluation including clinical characteristics, ultrasonographic features, and tumor markers, PLR may predict ovarian neoplasms in adolescents.

Validation of Instruments to Classify the Frailty of the Elderly in Community (지역사회 거주 노인의 허약선별도구 타당도 평가)

  • Lee, In-Sook;Park, Young-Im;Park, Eun-Ok;Lee, Soon-Hee;Jeong, Ihn-Sook
    • Research in Community and Public Health Nursing / v.22 no.3 / pp.302-314 / 2011
  • Purpose: This study aimed to validate instruments to classify the frailty of Korean elderly people in the community. Methods: For this study, 632 elders were selected from community-based elderly houses and home visiting registries, and data on frailty were collected using three instruments during November 2008. The Korean Frail Scale (KFS) was composed of 10 domains with a maximum score of 20. The Edmonton Frail Scale (EFS) had 10 domains with a maximum score of 17. The 25_Japan Frail Scale (25_JFS) was composed of 6 domains with a maximum score of 25. Internal consistency was measured with Cronbach's ${\alpha}$. Sensitivity, specificity, and area under the ROC curve (AUC) were measured to assess validity, with long-term care insurance grade as the gold standard. Results: Cronbach's ${\alpha}$ was .72 for KFS, .55 for EFS, and .80 for 25_JFS. Sensitivity, specificity, and AUC were 70.0%, 83.2%, and .83, respectively, at a cut-off point of 10.5 for KFS; 50.0%, 80.9%, and .66, respectively, at 8.5 for EFS; and 80.0%, 85.9%, and .86, respectively, at 12.5 for 25_JFS. Conclusion: KFS and 25_JFS showed favorable internal consistency and predictive validity. Further longitudinal studies are recommended to confirm predictive validity.
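
For readers unfamiliar with the internal-consistency measure used in this abstract, Cronbach's α for k items is k/(k-1) × (1 - Σ item variances / variance of the total score). The sketch below applies that standard formula to made-up responses; the function name and the data are assumptions, and sample variances are used throughout.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses (4 respondents, 3 items), not data from the study.
responses = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
print(cronbach_alpha(responses))
```

In this toy data the items move together perfectly, so α comes out at its upper bound of 1; real scales such as those compared above yield lower values.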

A Survival Prediction Model of Rats in Uncontrolled Acute Hemorrhagic Shock Using the Random Forest Classifier (랜덤 포리스트를 이용한 비제어 급성 출혈성 쇼크의 흰쥐에서의 생존 예측)

  • Choi, J.Y.;Kim, S.K.;Koo, J.M.;Kim, D.W.
    • Journal of Biomedical Engineering Research / v.33 no.3 / pp.148-154 / 2012
  • Hemorrhagic shock is a primary cause of injury-related deaths worldwide. Although many studies have tried to accurately diagnose hemorrhagic shock at an early stage, such attempts have not been successful because of human compensatory mechanisms. The objective of this study was to construct a survival prediction model of rats in acute hemorrhagic shock using a random forest (RF) model. Heart rate (HR), mean arterial pressure (MAP), respiration rate (RR), lactate concentration (LC), and peripheral perfusion (PP) measured in rats were used as input variables for the RF model, and its performance was compared with that of a logistic regression (LR) model. Before constructing the models, we performed 5-fold cross validation for RF variable selection and forward stepwise variable selection for the LR model to examine which variables were important for the models. For the LR model, sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve (ROC-AUC) were 0.83, 0.95, 0.88, and 0.96, respectively. For the RF model, sensitivity, specificity, accuracy, and AUC were 0.97, 0.95, 0.96, and 0.99, respectively. In conclusion, the RF model was superior to the LR model for survival prediction in the rat model.

Co-amplification at Lower Denaturation-temperature PCR Combined with Unlabled-probe High-resolution Melting to Detect KRAS Codon 12 and 13 Mutations in Plasma-circulating DNA of Pancreatic Adenocarcinoma Cases

  • Wu, Jiong;Zhou, Yan;Zhang, Chun-Yan;Song, Bin-Bin;Wang, Bei-Li;Pan, Bai-Shen;Lou, Wen-Hui;Guo, Wei
    • Asian Pacific Journal of Cancer Prevention / v.15 no.24 / pp.10647-10652 / 2015
  • Background: The aim of our study was to establish COLD-PCR combined with an unlabeled-probe HRM approach for detecting KRAS codon 12 and 13 mutations in plasma-circulating DNA of pancreatic adenocarcinoma (PA) cases as a novel and effective diagnostic technique. Materials and Methods: We tested the sensitivity and specificity of this approach with dilutions of known mutated cell lines. We screened 36 plasma-circulating DNA samples from PA cases, 24 from a disease control group, and 25 from a healthy group, and subsequently sequenced them to confirm mutations. Simultaneously, we tested the specimens using conventional PCR followed by HRM and then used target-DNA cloning and sequencing for verification. The ROC curves and respective AUCs were calculated for KRAS mutations and/or serum CA 19-9. Results: The sensitivity of Sanger sequencing reached 0.5% with COLD-PCR, compared with 20% after conventional PCR; the sensitivity of COLD-PCR with unlabeled-probe HRM was 0.1%. KRAS mutations were identified in 26 of 36 PA cases (72.2%), while none were detected in the disease control or healthy groups. KRAS mutations were identified in both the tissues and the plasma samples of 26 PA cases. The AUC of COLD-PCR with unlabeled-probe HRM was 0.861, which increased to 0.934 when combined with CA 19-9. Conclusions: COLD-PCR with unlabeled-probe HRM can be a sensitive and accurate screening technique to detect KRAS codon 12 and 13 mutations in plasma-circulating DNA for diagnosing and treating PA.

A Comparison of Different Depression Instruments for Stroke Patients (뇌졸중 환자의 우울증 평가도구 비교)

  • Lee, Dong-Jin;Shim, Jae-Kwang;An, Seung-Heon
    • The Journal of Korean Physical Therapy / v.23 no.2 / pp.69-76 / 2011
  • Purpose: The aim of this study was to investigate the prevalence of depressive symptoms in stroke patients and to compare the characteristics of different rating scales, namely the Hamilton Depression Rating Scale (HDRS), the Beck Depression Inventory (BDI), and the Hospital Anxiety and Depression Scale-Depression (HAD-D), with regard to diagnosis and severity assessment for post-stroke depression. Methods: Participants included 44 stroke patients who could communicate. At admission, all study participants received a semi-structured interview using the HDRS and completed self-report questionnaires using the BDI and the HAD-D. Pearson's correlation method was used to examine associations among the three depression scales. The BDI and HAD-D were compared based on HDRS criteria, and the sensitivity and specificity using cut-off values were analyzed. Results: The HDRS showed that 52.30% of stroke patients had depressive symptoms; on the BDI and HAD-D the figure was 59.10%. The HDRS correlated significantly with the BDI (r=0.81, p<0.01) and HAD-D (r=0.55, p<0.01). The BDI correlated significantly with the HAD-D (r=0.50, p<0.01). When the area under the ROC curve was calculated against the HDRS criteria, the BDI (AUC=0.91, 95% CI: 0.83-0.99) showed a significantly larger area than the HAD-D (AUC=0.82, 95% CI: 0.69-0.94). The cut-off value of the BDI was 12.50 points, with a sensitivity of 81.00% and a specificity of 76.20%. Conclusion: These findings show that the BDI is a useful screening test for depression that most closely predicts the HDRS score.

Median Filtering Detection of Digital Images Using Pixel Gradients

  • RHEE, Kang Hyeon
    • IEIE Transactions on Smart Processing and Computing / v.4 no.4 / pp.195-201 / 2015
  • For median filtering (MF) detection in altered digital images, this paper presents a new feature vector that is formed from autoregressive (AR) coefficients via an AR model of the gradients between the neighboring row and column lines in an image. Subsequently, the defined 10-D feature vector is trained in a support vector machine (SVM) for MF detection among forged images. The MF classification is compared to the median filter residual (MFR) scheme, which has the same 10-D feature vector size. In the experiment, the three test items are the area under the receiver operating characteristic (ROC) curve (AUC), the classification ratio, and the minimal average decision error. The performance is excellent for unaltered (ORI) or once-altered images, such as $3{\times}3$ average filtering (AVE3), QF=90 JPEG (JPG90), and 90% downscaled and 110% upscaled (DN0.9 and UP1.1) images, versus $3{\times}3$ and $5{\times}5$ median filtering (MF3 and MF5, respectively) and MF3 and MF5 composite images (MF35). When the forged image was post-altered with AVE3, DN0.9, UP1.1, and JPG70 after MF3, MF5, and MF35, the performance of the proposed scheme was lower than that of the MFR scheme. In particular, the feature vector in this paper has a superior classification ratio compared to AVE3. However, in the measured performances with unaltered, once-altered, and post-altered images versus MF3, MF5, and MF35, the resultant AUC by 'sensitivity' (TP: true positive rate) and '1-specificity' (FP: false positive rate) approaches 1. Thus, it is confirmed that the grade evaluation of the proposed scheme can be rated as 'Excellent (A)'.
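
The AUC referred to in this abstract can be computed directly from classifier scores without drawing the curve. The sketch below uses the standard rank-based identity: the AUC equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The function name, labels, and scores are illustrative assumptions and have no connection to the paper's SVM outputs.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for positive (e.g. median-filtered), 0 for negative.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical decision scores for four images.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise loop is quadratic in the number of samples, which is fine for small test sets; a sort-based rank sum gives the same value for larger ones.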

The prognostic value of median nerve thickness in diagnosing carpal tunnel syndrome using magnetic resonance imaging: a pilot study

  • Lee, Sooho;Cho, Hyung Rae;Yoo, Jun Sung;Kim, Young Uk
    • The Korean Journal of Pain / v.33 no.1 / pp.54-59 / 2020
  • Background: The median nerve cross-sectional area (MNCSA) is a useful morphological parameter for the evaluation of carpal tunnel syndrome (CTS). However, there have been limited studies investigating the anatomical basis of median nerve flattening. Thus, to evaluate the connection between median nerve flattening and CTS, we measured the median nerve thickness (MNT). Methods: Both MNCSA and MNT measurements were collected from 20 patients with CTS and from 20 control individuals who underwent carpal tunnel magnetic resonance imaging (CTMRI). We measured the MNCSA and MNT at the level of the hook of hamate on CTMRI. The MNCSA was measured on the transverse angled sections through the whole area. The MNT was measured at the most compressed point of the nerve. Results: The mean MNCSA was 9.01 ± 1.94 mm² in the control group and 6.58 ± 1.75 mm² in the CTS group. The mean MNT was 2.18 ± 0.39 mm in the control group and 1.43 ± 0.28 mm in the CTS group. Receiver operating characteristic curve analysis demonstrated that the optimal cut-off value for the MNCSA was 7.72 mm², with 75.0% sensitivity, 75.0% specificity, and an area under the curve (AUC) of 0.82 (95% confidence interval [CI], 0.69-0.95). The best cut-off threshold of the MNT was 1.76 mm, with 85% sensitivity, 85% specificity, and an AUC of 0.94 (95% CI, 0.87-1.00). Conclusions: Even though both MNCSA and MNT were significantly associated with CTS, MNT was identified as a more suitable measurement parameter.
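
The abstract reports optimal cut-off values but does not say how they were chosen. One common criterion, assumed here, is Youden's J (sensitivity + specificity - 1) maximized over candidate thresholds. Because the CTS group has thinner nerves, a measurement at or below the threshold is treated as a positive result. The function name and the thickness values are hypothetical.

```python
def best_cutoff(cases, controls):
    """Return (cutoff, sensitivity, specificity, J) maximizing Youden's J.

    A value <= cutoff is called positive, since lower values indicate disease
    in this setting. Youden's J as the selection rule is an assumption.
    """
    best = None
    for cut in sorted(set(cases) | set(controls)):
        sens = sum(v <= cut for v in cases) / len(cases)
        spec = sum(v > cut for v in controls) / len(controls)
        j = sens + spec - 1
        if best is None or j > best[3]:
            best = (cut, sens, spec, j)
    return best

# Hypothetical thickness values in mm, not measurements from the study.
cts_group = [1.2, 1.4, 1.5, 1.7]
control_group = [1.8, 2.0, 2.2, 2.4]
print(best_cutoff(cts_group, control_group))
```

When several thresholds tie on J, this sketch keeps the smallest one; other tie-breaking rules are equally defensible.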