• Title/Summary/Keyword: Classification Accuracy Test

Accuracy of Phishing Websites Detection Algorithms by Using Three Ranking Techniques

  • Mohammed, Badiea Abdulkarem;Al-Mekhlafi, Zeyad Ghaleb
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.2
    • /
    • pp.272-282
    • /
    • 2022
  • Between 2014 and 2019, the US lost more than 2.1 billion USD to phishing attacks, according to the FBI's Internet Crime Complaint Center, and COVID-19 scam complaints totaled more than 1,200. These figures reflect the damage that phishing causes. Phishing website (PW) detection has been studied in the literature. Earlier methods maintained a centralized, manually updated blacklist, which cannot detect newly created pseudonyms. Several recent studies have applied supervised machine learning (SML) algorithms and schemes, including URL extraction-based ones, to the PW detection problem. These studies show that different classification algorithms are more effective on different data sets; however, no classifier has become widely established for the phishing site detection problem. This study aims to identify the SML features and schemes that work best against PWs across all publicly available phishing data sets. Eight widely used classification algorithms from the Scikit Learn library were configured and tested on the public phishing datasets. Their accuracy on three different datasets was then compared for statistically significant differences using the Welch t-test. Ensembles and neural networks outperformed classical algorithms in this study. The results were calculated in terms of classification accuracy and classifier ranking, as shown in Tables 4 and 8. On severely unbalanced datasets, some classifiers obtained higher than 99.0 percent classification accuracy. Finally, the results show that this approach could also be adapted and outperforms conventional techniques with good precision.
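
The Welch t-test mentioned in this abstract compares two sets of accuracy scores without assuming equal variances. The following is a minimal sketch using only the Python standard library; the function name `welch_t` and the accuracy values are illustrative assumptions and are not taken from the paper.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    # Squared standard error of each sample mean (sample variance / n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical per-fold accuracies for two classifiers (NOT results from the paper).
ensemble_acc = [0.97, 0.96, 0.98, 0.97, 0.96]
classical_acc = [0.93, 0.95, 0.92, 0.94, 0.93]

t, df = welch_t(ensemble_acc, classical_acc)
print(f"t = {t:.3f}, df = {df:.2f}")
```

A p-value would then be read from a Student t distribution with `df` degrees of freedom; that step is left out here because the standard library has no t-distribution CDF.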

Comparison of Machine Learning Classification Models for the Development of Simulators for General X-ray Examination Education (일반엑스선검사 교육용 시뮬레이터 개발을 위한 기계학습 분류모델 비교)

  • Lee, In-Ja;Park, Chae-Yeon;Lee, Jun-Ho
    • Journal of radiological science and technology
    • /
    • v.45 no.2
    • /
    • pp.111-116
    • /
    • 2022
  • This study evaluates the applicability of machine learning to the development of a simulator for general X-ray examination education. To this end, k-nearest neighbor (kNN), support vector machine (SVM), and neural network (NN) classification models are compared to identify the most suitable model. Image data were obtained by taking 100 photos each of the posterior-anterior (PA), posterior-anterior oblique (Obl), lateral (Lat), and fan lateral (Fan lat) views. Of the 400 images acquired, 70% were used as the training set and 30% as the test set, and a prediction model was constructed to classify right-hand PA, Obl, Lat, and Fan lat images. After classification models were built from this data set using kNN, SVM, and NN, the models were compared through an error matrix. kNN achieved an accuracy of 0.967 and an area under the curve (AUC) of 0.993; SVM achieved an accuracy of 0.992 and an AUC of 1.000; and NN achieved an accuracy of 0.992 and an AUC of 0.999. kNN was slightly lower, but all three models recorded high accuracy and AUC. SVM and NN had the same accuracy (0.992) and similar AUCs (1.000 and 0.999), indicating that both models have high predictive power and are applicable to educational simulators.
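
The 70%/30% division of the 400 images can be illustrated with a short sketch. The abstract does not say whether the split was stratified by view, so the per-class split below is an assumption, as are the function name `stratified_split` and the fixed seed.

```python
import random

def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx) with the same class proportions in each set."""
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # random assignment within each class
        cut = round(len(indices) * train_fraction)
        train_idx += indices[:cut]
        test_idx += indices[cut:]
    return train_idx, test_idx

# 100 images per view, as described in the abstract (labels only, no pixel data).
labels = [view for view in ("PA", "Obl", "Lat", "Fan lat") for _ in range(100)]
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

With 100 images per view, this yields 280 training and 120 test images, with 30 test images per view.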

Mastitis Detection by Near-infrared Spectra of Cows Milk and SIMCA Classification Method

  • Tsenkova, R.;Atanassova, S.
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference
    • /
    • 2001.06a
    • /
    • pp.1248-1248
    • /
    • 2001
  • Mastitis is a major problem for the global dairy industry and causes substantial economic losses through decreased milk production and considerable compositional changes in milk that reduce milk quality. This study investigated the potential of near-infrared (NIR) spectroscopy in the 1100 to 2500 nm region, combined with a chemometric classification method, to detect milk from mastitic cows. A total of 189 milk samples were collected from 7 Holstein cows over 27 consecutive days and analyzed for somatic cell count (SCC). Three of the cows were healthy, and the rest had periods of mastitis during the experiment. NIR transflectance spectra of the milk were obtained with an InfraAlyzer 500 spectrophotometer in the spectral range from 1100 to 2500 nm. All samples were divided into a calibration set and a test set. A class variable was assigned to each sample based on milk SCC content: healthy (class 1) or mastitic (class 2). The samples were classified using soft independent modeling of class analogy (SIMCA) with different spectral data pretreatments. Two SCC concentrations, 200 000 cells/ml and 300 000 cells/ml, were used as thresholds for separating healthy and mastitic cows. The best detection accuracy was obtained with models using the 200 000 cells/ml threshold and smoothed absorbance data: 98.41% of the samples in the calibration set and 87.30% of the samples in the independent test set were correctly classified. SIMCA results for classes based on the 300 000 cells/ml threshold showed slightly lower classification accuracy. Analysis of the changes in the loading of the first PC factor for the healthy and mastitic milk groups showed that the separation between classes was indirect and based on the influence of mastitis on the milk components. The accuracy of mastitis detection by the SIMCA method based on NIR spectra of milk would allow health screening of cows and differentiation between healthy and mastitic milk samples. With SIMCA models in place, mastitis detection would be possible using only NIR spectra of milk, without any other analyses.
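
The class assignment step described above (class 1 for healthy, class 2 for mastitic, based on an SCC threshold) might look like the sketch below. The function name `label_by_scc` and the sample values are hypothetical, and treating a count exactly at the threshold as healthy is an assumption, since the abstract does not state how boundary values were handled. The SIMCA modeling itself is not shown.

```python
def label_by_scc(scc_cells_per_ml, threshold=200_000):
    """Return class 1 (healthy) or class 2 (mastitic) from a somatic cell count."""
    # Assumption: counts strictly above the threshold are labeled mastitic.
    return 2 if scc_cells_per_ml > threshold else 1

# Hypothetical SCC values in cells/ml (NOT measurements from the study).
samples = [85_000, 240_000, 410_000]
labels_200k = [label_by_scc(s) for s in samples]
labels_300k = [label_by_scc(s, threshold=300_000) for s in samples]
print(labels_200k, labels_300k)
```

The middle sample changes class between the two thresholds, which is why the choice of threshold affects the reported classification accuracy.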

Classification of Mouse Lung Metastatic Tumor with Deep Learning

  • Lee, Ha Neul;Seo, Hong-Deok;Kim, Eui-Myoung;Han, Beom Seok;Kang, Jin Seok
    • Biomolecules & Therapeutics
    • /
    • v.30 no.2
    • /
    • pp.179-183
    • /
    • 2022
  • Traditionally, pathologists microscopically examine tissue sections to detect pathological lesions; the many slides that must be evaluated impose severe work burdens. Also, diagnostic accuracy varies by pathologist training and experience; better diagnostic tools are required. Given the rapid development of computer vision, automated deep learning is now used to classify microscopic images, including medical images. Here, we used an Inception-v3 deep learning model to detect mouse lung metastatic tumors via whole slide imaging (WSI); we cropped the images to 151 by 151 pixels. The images were divided into training (53.8%) and test (46.2%) sets (21,017 and 18,016 images, respectively). When images from lung tissue containing tumor tissues were evaluated, the model accuracy was 98.76%. When images from normal lung tissue were evaluated, the model accuracy ("no tumor") was 99.87%. Thus, the deep learning model distinguished metastatic lesions from normal lung tissue. Our approach will allow the rapid and accurate analysis of various tissues.

Development of a Metabolic Syndrome Classification and Prediction Model for Koreans Using Deep Learning Technology: The Korea National Health and Nutrition Examination Survey (KNHANES) (2013-2018)

  • Hyerim Kim;Ji Hye Heo;Dong Hoon Lim;Yoona Kim
    • Clinical Nutrition Research
    • /
    • v.12 no.2
    • /
    • pp.138-153
    • /
    • 2023
  • The prevalence of metabolic syndrome (MetS) and its cost are increasing due to lifestyle changes and aging. This study aimed to develop a deep neural network model for the prediction and classification of MetS according to nutrient intake and other MetS-related factors. It included 17,848 individuals aged 40-69 years from the Korea National Health and Nutrition Examination Survey (2013-2018). MetS (3-5 risk factors present) was set as the dependent variable, and 52 MetS-related factors and nutrient intake variables were set as independent variables in a regression analysis. The analysis compared the accuracy, precision, and recall of conventional logistic regression, machine learning-based logistic regression, and deep learning. In the MetS classification and prediction model developed in this study, the accuracy was 81.2089 on the training data and 81.1485 on the test data. These accuracies were higher than those obtained by conventional logistic regression or machine learning-based logistic regression. Precision, recall, and F1-score were also high in the deep learning model. Blood alanine aminotransferase level (β = 12.2035) showed the highest regression coefficient, followed by blood aspartate aminotransferase level (β = 11.771), waist circumference (β = 10.8555), body mass index (β = 10.3842), and blood glycated hemoglobin level (β = 10.1802). Among nutrient intakes, fats (cholesterol [β = -2.0545] and saturated fatty acid [β = -2.0483]) showed high regression coefficients. Overall, the deep learning model for MetS classification and prediction showed higher accuracy than conventional logistic regression or machine learning-based logistic regression.
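
For readers unfamiliar with the metrics named in this abstract, precision, recall, and F1-score for a binary classifier can be computed as in the sketch below. The function name and the label vectors are hypothetical and unrelated to the KNHANES data.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1-score for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1

# Hypothetical labels: 1 = MetS present, 0 = MetS absent (NOT study data).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

Here three true positives, one false positive, and one false negative give a precision and a recall of 0.75 each.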

A Study on Performance Comparison of Machine Learning Algorithm for Scaffold Defect Classification (인공지지체 불량 분류를 위한 기계 학습 알고리즘 성능 비교에 관한 연구)

  • Lee, Song-Yeon;Huh, Yong Jeong
    • Journal of the Semiconductor & Display Technology
    • /
    • v.19 no.3
    • /
    • pp.77-81
    • /
    • 2020
  • In this paper, we build scaffold defect classification models using machine learning. We extracted features from external images of scaffolds collected with a USB camera. The SVM, KNN, and MLP machine learning algorithms were applied to the extracted features. The three types of classification models were trained on the training dataset and evaluated on the test dataset. We quantified the performance of the defect classification models and confirmed that the SVM accuracy was 95%, making SVM the best-performing model.

Classification of Mental States Based on Spatiospectral Patterns of Brain Electrical Activity

  • Hwang, Han-Jeong;Lim, Jeong-Hwan;Im, Chang-Hwan
    • Journal of Biomedical Engineering Research
    • /
    • v.33 no.1
    • /
    • pp.15-24
    • /
    • 2012
  • Classification of human thought is an emerging research field that may allow us to understand human brain functions and further develop advanced brain-computer interface (BCI) systems. In the present study, we introduce a new approach to classify various mental states from noninvasive electrophysiological recordings of human brain activity. We utilized the full spatial and spectral information contained in the electroencephalography (EEG) signals recorded while a subject is performing a specific mental task. For this, the EEG data were converted into a 2D spatiospectral pattern map, of which each element was filled with 1, 0, and -1 reflecting the degrees of event-related synchronization (ERS) and event-related desynchronization (ERD). We evaluated the similarity between a current (input) 2D pattern map and the template pattern maps (database), by taking the inner-product of pattern matrices. Then, the current 2D pattern map was assigned to a class that demonstrated the highest similarity value. For the verification of our approach, eight participants took part in the present study; their EEG data were recorded while they performed four different cognitive imagery tasks. Consistent ERS/ERD patterns were observed more frequently between trials in the same class than those in different classes, indicating that these spatiospectral pattern maps could be used to classify different mental states. The classification accuracy was evaluated for each participant from both the proposed approach and a conventional mental state classification method based on the inter-hemispheric spectral power asymmetry, using the leave-one-out cross-validation (LOOCV). 
An average accuracy of 68.13% (±9.64%) was attained for the proposed method, whereas an average accuracy of 57% (±5.68%) was attained for the conventional method (significance was assessed by the one-tailed paired t-test, p < 0.01), showing that the proposed simple classification approach might be one of the promising methods for discriminating various mental states.
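
The matching step described in this abstract (taking the inner product of an input pattern map with each template map and choosing the class with the highest similarity) can be sketched as follows. The map size, class names, and values are hypothetical; how the authors built the template maps is not shown here.

```python
def inner_product(a, b):
    """Element-wise (Frobenius) inner product of two equally sized 2D maps."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

def classify(pattern, templates):
    """Return the label of the template most similar to the input pattern."""
    return max(templates, key=lambda label: inner_product(pattern, templates[label]))

# Hypothetical 2x3 maps with entries in {1, 0, -1}
# (rows: channels, columns: frequency bands).
templates = {
    "task_a": [[1, 0, -1], [1, 1, 0]],
    "task_b": [[-1, 0, 1], [0, -1, 1]],
}
trial = [[1, 0, -1], [0, 1, 0]]
print(classify(trial, templates))
```

Because the entries are restricted to 1, 0, and -1, the inner product simply counts agreeing non-zero elements minus disagreeing ones.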

A Study on the Performance of Deep learning-based Automatic Classification of Forest Plants: A Comparison of Data Collection Methods (데이터 수집방법에 따른 딥러닝 기반 산림수종 자동분류 정확도 변화에 관한 연구)

  • Kim, Bomi;Woo, Heesung;Park, Joowon
    • Journal of Korean Society of Forest Science
    • /
    • v.109 no.1
    • /
    • pp.23-30
    • /
    • 2020
  • The use of increased computing power, machine learning, and deep learning techniques has grown dramatically in various sectors. In particular, image detection algorithms are broadly used in forestry and remote sensing to identify forest types and tree species. However, in South Korea, machine learning has rarely, if ever, been applied to forestry image detection, especially to classify tree species. This study integrates machine learning and forest image detection; specifically, we compared two data collection methods for machine learning, namely image data captured by forest experts (D1) and image data gathered by web crawling (D2), for automating the classification of five tree species. In addition, two methods of characterization to train/test the system were investigated. The results indicated a significant difference in classification accuracy between D1 and D2: the classification accuracy of D1 was higher than that of D2. To increase the classification accuracy of D2, additional data filtering techniques were required to reduce the noise in the uncensored image data.

KOMPSAT-3A Urban Classification Using Machine Learning Algorithm - Focusing on Yang-jae in Seoul - (기계학습 기법에 따른 KOMPSAT-3A 시가화 영상 분류 - 서울시 양재 지역을 중심으로 -)

  • Youn, Hyoungjin;Jeong, Jongchul
    • Korean Journal of Remote Sensing
    • /
    • v.36 no.6_2
    • /
    • pp.1567-1577
    • /
    • 2020
  • Urban land cover classification plays a role in urban planning and management, so it is important to improve classification accuracy in urban areas. In this paper, two machine learning models, Support Vector Machine (SVM) and Artificial Neural Network (ANN), are proposed for urban land cover classification based on high-resolution satellite imagery (KOMPSAT-3A). Training data were created from the satellite image on a 25 m rectangular grid, and the trained models were used to classify the test area. During validation, we presented a confusion matrix for each result using 250 Ground Truth Points (GTP). Of the four SVM kernels and the two ANN activation functions, the SVM polynomial kernel model had the highest accuracy, at 86%. In the comparison of SVM and ANN using the GTP, the SVM model was more effective than the ANN model for KOMPSAT-3A classification. Among the four classes (building, road, vegetation, and bare soil), the building class showed the lowest classification accuracy because of shadows cast by high-rise buildings.
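
Overall and per-class accuracy can be read off a confusion matrix as in the sketch below. The counts are invented for illustration (20 points rather than the study's 250 GTP) and do not reproduce the paper's results; rows are taken to be reference classes and columns predicted classes, which is an assumed convention.

```python
def overall_accuracy(matrix):
    """Fraction of points on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def per_class_accuracy(matrix):
    """For each reference class (row), the fraction predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]

classes = ["building", "road", "vegetation", "bare soil"]
# Hypothetical counts (NOT from the paper): rows = reference, columns = predicted.
matrix = [
    [3, 1, 0, 1],
    [0, 5, 0, 0],
    [0, 0, 5, 0],
    [1, 0, 0, 4],
]
print(overall_accuracy(matrix))
print(dict(zip(classes, per_class_accuracy(matrix))))
```

Per-class values make it possible to see which class pulls the overall figure down, as the building class does in the abstract.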

An Automatic Document Classification with Bayesian Learning (베이지안 학습을 이용한 문서의 자동분류)

  • Kim, Jin-Sang;Shin, Yang-Kyu
    • Journal of the Korean Data and Information Science Society
    • /
    • v.11 no.1
    • /
    • pp.19-30
    • /
    • 2000
  • As the number of online documents increases enormously with the expansion of information technology, automatic document classification becomes increasingly important. In this paper, an automatic document classification method is investigated and applied to UseNet 20 newsgroup articles to test its efficacy. The classification system uses the Naive Bayes classification algorithm, and the experimental results show that a randomly selected newsgroup article can be classified into its own category with over 77% accuracy.
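
A Naive Bayes text classifier of the kind named in this abstract can be sketched in a few lines. The multinomial word-count model and add-one (Laplace) smoothing used below are assumptions, since the abstract does not specify the variant; the training documents are toy examples, not newsgroup articles.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(docs):
    """docs: list of (label, tokens). Returns (class_counts, word_counts, vocab)."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, tokens in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict(model, tokens):
    """Return the label with the highest log-posterior score."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for tok in tokens:
            if tok in vocab:  # words unseen in training are ignored
                # Add-one smoothing avoids zero probabilities.
                score += math.log((word_counts[label][tok] + 1)
                                  / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training documents (hypothetical tokens).
docs = [
    ("rec.autos", ["engine", "car", "brake"]),
    ("rec.autos", ["car", "tire"]),
    ("sci.space", ["orbit", "rocket", "launch"]),
    ("sci.space", ["rocket", "moon"]),
]
model = train_naive_bayes(docs)
print(predict(model, "the rocket launch".split()))
```

Working in log space keeps the product of many small word probabilities from underflowing on longer documents.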
