• Title/Summary/Keyword: Confusion Matrix

Performance of Support Vector Machine for Classifying Land Cover in Optical Satellite Images: A Case Study in Delaware River Port Area

  • Ramayanti, Suci;Kim, Bong Chan;Park, Sungjae;Lee, Chang-Wook
    • Korean Journal of Remote Sensing / v.38 no.6_4 / pp.1911-1923 / 2022
  • The availability of high-resolution satellite images provides precise information without direct observation of the research target. The Korea Multi-Purpose Satellite (KOMPSAT), also known as the Arirang satellite, has been developed and used for earth observation. Machine learning models have repeatedly proven to be good classifiers of remotely sensed images. This study aimed to compare the performance of the support vector machine (SVM) model in classifying the land cover of the Delaware River port area on high- and medium-resolution images. Three optical images, from KOMPSAT-2, KOMPSAT-3A, and Sentinel-2B, were classified into six land cover classes: water, road, vegetation, building, vacant, and shadow. The KOMPSAT images were provided by the Korea Aerospace Research Institute (KARI), and the Sentinel-2B image was provided by the European Space Agency (ESA). The training samples were manually digitized for each land cover class and used as the reference image. The predicted images were compared to the reference data, and accuracy was assessed using a confusion matrix analysis. In addition, the time taken for training and classification was recorded to evaluate the model performance. The results showed that the KOMPSAT-3A image had the highest overall accuracy, followed by the KOMPSAT-2 and Sentinel-2B images. On the other hand, the model took longer to classify the higher-resolution images than the lower-resolution image. We therefore conclude that the SVM model performed better on the higher-resolution images, at the cost of longer training and classification times. This finding may help researchers select satellite imagery for effective and accurate image classification.
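
The accuracy assessment described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the eight sample labels are hypothetical, and only three of the six land cover classes are used to keep the matrix small.

```python
from collections import Counter

def confusion_matrix(actual, predicted, classes):
    """Rows are reference (actual) classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]

def overall_accuracy(matrix):
    """Share of samples on the diagonal, i.e. classified correctly."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0

# Hypothetical reference and predicted labels for eight pixels.
classes = ["water", "road", "vegetation"]
actual = ["water", "water", "road", "road",
          "vegetation", "vegetation", "vegetation", "water"]
predicted = ["water", "road", "road", "road",
             "vegetation", "water", "vegetation", "water"]

cm = confusion_matrix(actual, predicted, classes)
print(cm)                    # [[2, 1, 0], [0, 2, 0], [1, 0, 2]]
print(overall_accuracy(cm))  # 0.75
```

Six of the eight hypothetical pixels fall on the diagonal, which gives the overall accuracy of 0.75.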

Predicting defects of EBM-based additive manufacturing through XGBoost (XGBoost를 활용한 EBM 3D 프린터의 결함 예측)

  • Jeong, Jahoon
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.5 / pp.641-648 / 2022
  • This paper uses data analysis to identify the factors affecting defects that occur during Electron Beam Melting (EBM), one of the 3D printing methods. With reference to factors identified as major causes of defects in previous studies, log files generated between processes were analyzed and related variables were extracted. In addition, because the data are time series, the concept of a window was introduced to compose variables that include data from all three layers. The dependent variable is binary (the presence or absence of a defect), and because the proportion of defective layers is low (about 4%), balanced training data were created with the SMOTE technique. For the analysis, I use XGBoost with GridSearchCV and evaluate the classification performance based on the confusion matrix. I conclude the study by analyzing the importance of variables through SHAP values.
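
One way to read the window idea in this abstract is a sliding window that concatenates the measurements of three consecutive layers into one feature row. The sketch below makes that assumption; the function name, the window layout, and the two per-layer measurements are hypothetical and are not taken from the paper.

```python
def window_features(layers, size=3):
    """Concatenate the features of each run of `size` consecutive layers."""
    rows = []
    for end in range(size, len(layers) + 1):
        window = layers[end - size:end]
        # Flatten the window so one row carries all layers in it.
        rows.append([value for layer in window for value in layer])
    return rows

# Hypothetical per-layer measurements: [beam_current, temperature].
layers = [[1.0, 700], [1.1, 705], [0.9, 698], [1.2, 710]]

for row in window_features(layers):
    print(row)
# [1.0, 700, 1.1, 705, 0.9, 698]
# [1.1, 705, 0.9, 698, 1.2, 710]
```

Four layers yield two rows here, because a window of three only becomes complete from the third layer onward.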

A Detecting Technique for the Climatic Factors that Aided the Spread of COVID-19 using Deep and Machine Learning Algorithms

  • Al-Sharari, Waad;Mahmood, Mahmood A.;Abd El-Aziz, A.A.;Azim, Nesrine A.
    • International Journal of Computer Science & Network Security / v.22 no.6 / pp.131-138 / 2022
  • The novel coronavirus disease (COVID-19) is regarded as one of the main public health threats worldwide. Because of the sudden nature of the outbreak and the infectiousness of the virus, it causes people anxiety, depression, and other stress responses. The prevention and control of novel coronavirus pneumonia have entered a critical stage. Early prediction and forecasting of the outbreak are essential during this difficult time to control its morbidity and mortality. The entire world is investing enormous effort to fight the spread of this lethal infection. In this paper, we used machine learning and deep learning techniques to analyze the situation using data shared by countries and to detect the climate factors that affect the spread of COVID-19, such as humidity, sunny hours, temperature, and wind speed, in order to understand its behavior and to forecast its future reach around the world. We used data collected and produced by Kaggle and the Johns Hopkins Center for Systems Science. The dataset has 25 attributes and 9566 objects. Our experiment consists of two phases. In phase one, we preprocessed the dataset for the DL model and reduced the features to four (humidity, sunny hours, temperature, and wind speed) using the Pearson correlation coefficient technique (correlation attribute feature selection). In phase two, we used six well-known traditional machine learning techniques for numerical datasets and the Dense Net deep learning model to predict and detect the climatic factors that aid the disease outbreak. We validated the model using a confusion matrix (CM) and measured the performance with four different metrics: accuracy, f-measure, recall, and precision.
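
The correlation-based feature selection mentioned in phase one can be illustrated with a small sketch. It assumes that features are ranked by the absolute Pearson correlation with the target; the helper names and the toy values are hypothetical, and the paper's actual selection procedure may differ.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)

def rank_features(features, target):
    """Feature names sorted by absolute correlation, strongest first."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    return sorted(scores, key=lambda name: abs(scores[name]), reverse=True)

# Hypothetical toy data, not values from the dataset.
features = {
    "humidity": [1, 2, 3, 4],
    "wind_speed": [2, 1, 4, 3],
    "constant": [5, 5, 5, 5],
}
target = [2, 4, 6, 8]

print(rank_features(features, target))
# ['humidity', 'wind_speed', 'constant']
```

Keeping only the top-ranked names would then reduce the attribute set, in the same spirit as the reduction from 25 attributes to four.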

Particulate Matter Rating Map based on Machine Learning with Adaboost Algorithm (기계학습 Adaboost에 기초한 미세먼지 등급 지도)

  • Jeong, Jong-Chul
    • Journal of Cadastre & Land InformatiX / v.51 no.2 / pp.141-150 / 2021
  • Fine dust is a substance that greatly affects human health, and various studies have been conducted on it. Because of the effect of particulate matter on people, studies are being conducted to predict the particulate matter grade using past data measured by the monitoring network of Seoul. In this paper, the predictive model focuses on the particulate matter concentration in Seoul in May 2019. Air pollutant variables such as SO2, CO, NO2, and O3 were used for training. The predictive model is based on Adaboost, and separate models were trained for PM10 and PM2.5. A comparison of prediction performance through the confusion matrix showed that the Adaboost model was more suitable for predicting the particulate matter concentration grade. Although the air pollutant variables have a higher correlation with PM2.5, the model needs to be trained on more data and to use additional variables such as traffic volume to predict the PM10 and PM2.5 grade distribution more effectively.

Personalized Diabetes Risk Assessment Through Multifaceted Analysis (PD-RAMA): A Novel Machine Learning Approach to Early Detection and Management of Type 2 Diabetes

  • Gharbi Alshammari
    • International Journal of Computer Science & Network Security / v.23 no.8 / pp.17-25 / 2023
  • The alarming global prevalence of Type 2 Diabetes Mellitus (T2DM) has catalyzed an urgent need for robust, early diagnostic methodologies. This study unveils a pioneering approach to predicting T2DM, employing the Extreme Gradient Boosting (XGBoost) algorithm, renowned for its predictive accuracy and computational efficiency. The investigation harnesses a meticulously curated dataset of 4303 samples, extracted from a comprehensive Chinese research study, scrupulously aligned with the World Health Organization's indicators and standards. The dataset encapsulates a multifaceted spectrum of clinical, demographic, and lifestyle attributes. Through an intricate process of hyperparameter optimization, the XGBoost model exhibited an unparalleled best score, elucidating a distinctive combination of parameters such as a learning rate of 0.1, max depth of 3, 150 estimators, and specific colsample strategies. The model's validation accuracy of 0.957, coupled with a sensitivity of 0.9898 and specificity of 0.8897, underlines its robustness in classifying T2DM. A detailed analysis of the confusion matrix further substantiated the model's diagnostic prowess, with an F1-score of 0.9308, illustrating its balanced performance in true positive and negative classifications. The precision and recall metrics provided nuanced insights into the model's ability to minimize false predictions, thereby enhancing its clinical applicability. The research findings not only underline the remarkable efficacy of XGBoost in T2DM prediction but also contribute to the burgeoning field of machine learning applications in personalized healthcare. By elucidating a novel paradigm that accentuates the synergistic integration of multifaceted clinical parameters, this study fosters a promising avenue for precise early detection, risk stratification, and patient-centric intervention in diabetes care. The research serves as a beacon, inspiring further exploration and innovation in leveraging advanced analytical techniques for transformative impacts on predictive diagnostics and chronic disease management.
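
Sensitivity, specificity, and F1, as reported in this abstract, all derive from the four cells of a binary confusion matrix. The sketch below shows the standard formulas; the function name and the counts are hypothetical and do not reproduce the study's confusion matrix, which the abstract does not give.

```python
def ratio(numerator, denominator):
    """Safe division that returns 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0

def binary_metrics(tp, fn, fp, tn):
    """Standard metrics from true/false positive and negative counts."""
    sensitivity = ratio(tp, tp + fn)          # recall on the positive class
    specificity = ratio(tn, tn + fp)          # recall on the negative class
    precision = ratio(tp, tp + fp)
    accuracy = ratio(tp + tn, tp + fn + fp + tn)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }

# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=90, fn=10, fp=20, tn=80)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts, sensitivity is 0.9 and specificity is 0.8, which shows how a model can find most positive cases while still raising a noticeable share of false alarms.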

Deep learning method for compressive strength prediction for lightweight concrete

  • Yaser A. Nanehkaran;Mohammad Azarafza;Tolga Pusatli;Masoud Hajialilue Bonab;Arash Esmatkhah Irani;Mehdi Kouhdarag;Junde Chen;Reza Derakhshani
    • Computers and Concrete / v.32 no.3 / pp.327-337 / 2023
  • Concrete is the most widely used building material, with various types including high- and ultra-high-strength, reinforced, normal, and lightweight concretes. However, accurately predicting concrete properties is challenging due to the geotechnical design code's requirement for specific characteristics. To overcome this issue, researchers have turned to new technologies like machine learning to develop proper methodologies for concrete specification. In this study, we propose a highly accurate deep learning-based predictive model to investigate the compressive strength (UCS) of lightweight concrete with natural aggregates (pumice). Our model was implemented on a database containing 249 experimental records and revealed that water, cement, water-cement ratio, fine-coarse aggregate, aggregate substitution rate, fine aggregate replacement, and superplasticizer are the most influential covariates on UCS. To validate our model, we trained and tested it on random subsets of the database, and its performance was evaluated using a confusion matrix and receiver operating characteristic (ROC) overall accuracy. The proposed model was compared with widely known machine learning methods such as MLP, SVM, and DT classifiers to assess its capability. In addition, the model was tested on 25 laboratory UCS tests to evaluate its predictability. Our findings showed that the proposed model achieved the highest accuracy (accuracy=0.97, precision=0.97) and the lowest error rate with a high learning rate (R2=0.914), as confirmed by ROC (AUC=0.971), which is higher than other classifiers. Therefore, the proposed method demonstrates a high level of performance and capability for UCS predictions.
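
The area under the ROC curve cited in this abstract can be computed without drawing the curve, as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. The sketch below uses that pairwise definition; the function name, labels, and scores are hypothetical and unrelated to the paper's data.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical classifier scores for six samples.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]

print(round(roc_auc(labels, scores), 3))  # 0.889
```

Eight of the nine positive/negative pairs are ordered correctly here, so the AUC is 8/9.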

Prediction of Stunting Among Under-5 Children in Rwanda Using Machine Learning Techniques

  • Similien Ndagijimana;Ignace Habimana Kabano;Emmanuel Masabo;Jean Marie Ntaganda
    • Journal of Preventive Medicine and Public Health / v.56 no.1 / pp.41-49 / 2023
  • Objectives: Rwanda reported a stunting rate of 33% in 2020, decreasing from 38% in 2015; however, stunting remains an issue. Globally, child deaths from malnutrition stand at 45%. The best options for the early detection and treatment of stunting should be made a community policy priority, and health services remain an issue. Hence, this research aimed to develop a model for predicting stunting in Rwandan children. Methods: The Rwanda Demographic and Health Survey 2019-2020 was used as secondary data. Stratified 10-fold cross-validation was used, and different machine learning classifiers were trained to predict stunting status. The prediction models were compared using different metrics, and the best model was chosen. Results: The best model was developed with the gradient boosting classifier algorithm, with a training accuracy of 80.49% based on the performance indicators of several models. Based on a confusion matrix, the test accuracy, sensitivity, specificity, and F1 were calculated, yielding the model's ability to classify stunting cases correctly at 79.33%, identify stunted children accurately at 72.51%, and categorize non-stunted children correctly at 94.49%, with an area under the curve of 0.89. The model found that the mother's height, television, the child's age, province, mother's education, birth weight, and childbirth size were the most important predictors of stunting status. Conclusions: Therefore, machine-learning techniques may be used in Rwanda to construct an accurate model that can detect the early stages of stunting and offer the best predictive attributes to help prevent and control stunting in under five Rwandan children.
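
Stratified 10-fold cross-validation, as named in the Methods, splits the data so that each fold keeps roughly the same class proportions as the whole sample. The sketch below is one simple way to build such folds; the helper name and the 30/70 label split are hypothetical, and the study may have relied on a library routine instead.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds, class by class, in round-robin order."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Dealing indices out one at a time keeps class shares even.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds

# Hypothetical sample: 30 stunted and 70 non-stunted children.
labels = ["stunted"] * 30 + ["not_stunted"] * 70
folds = stratified_folds(labels, k=10)

for fold in folds[:2]:
    stunted = sum(1 for i in fold if labels[i] == "stunted")
    print(len(fold), stunted)  # each fold: 10 samples, 3 of them stunted
```

Each fold then serves once as the test set while the remaining nine are used for training.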

Image Clustering Using Machine Learning : Study of InceptionV3 with K-means Methods. (머신 러닝을 사용한 이미지 클러스터링: K-means 방법을 사용한 InceptionV3 연구)

  • Nindam, Somsawut;Lee, Hyo Jong
    • Proceedings of the Korea Information Processing Society Conference / 2021.11a / pp.681-684 / 2021
  • In this paper, we study image clustering without labeling using machine learning techniques. We propose an unsupervised machine learning technique to design an image clustering model that automatically categorizes images into groups. Our experiment focuses on the Inception convolutional neural network (InceptionV3) with the k-means method to cluster images. For this, we collected public datasets containing Food-K5, Flowers, Handwritten Digit, and Cats-dogs, our own dataset Rice Germination, and the owner dataset Palm print. The experiment has three parts. First, all images are stripped of their labels and moved into a whole dataset. Second, the dataset is loaded into InceptionV3 to extract image features, which are passed to k-means clustering with six classes. Lastly, modeling accuracy is evaluated using the confusion matrix, based on precision, recall, and F1. With our method, the results are 1) Handwritten Digit (precision = 1.000, recall = 1.000, F1 = 1.00), 2) Food-K5 (precision = 0.975, recall = 0.945, F1 = 0.96), 3) Palm print (precision = 1.000, recall = 0.999, F1 = 1.00), 4) Cats-dogs (precision = 0.997, recall = 0.475, F1 = 0.64), 5) Flowers (precision = 0.610, recall = 0.982, F1 = 0.75), and our dataset 6) Rice Germination (precision = 0.997, recall = 0.943, F1 = 0.97). Our experiment showed that the model achieved an accuracy rate of 0.8908; the outcomes indicate that the proposed model is strong enough to differentiate the images and group them into clusters.
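
The F1 values listed in this abstract follow from the reported precision and recall, since F1 is their harmonic mean. The short check below recomputes them from the figures given above; only the function and variable names are introduced here.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# (precision, recall, reported F1) as listed in the abstract.
reported = {
    "Handwritten Digit": (1.000, 1.000, 1.00),
    "Food-K5": (0.975, 0.945, 0.96),
    "Palm print": (1.000, 0.999, 1.00),
    "Cats-dogs": (0.997, 0.475, 0.64),
    "Flowers": (0.610, 0.982, 0.75),
    "Rice Germination": (0.997, 0.943, 0.97),
}

for name, (precision, recall, f1) in reported.items():
    recomputed = f1_score(precision, recall)
    print(f"{name}: recomputed F1 = {recomputed:.2f}, reported = {f1:.2f}")
```

All six recomputed values match the reported ones to two decimals. The Cats-dogs row also shows how a low recall (0.475) pulls F1 down even when precision is close to 1.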

Human Activity Classification Using Deep Transfer Learning (딥 전이 학습을 이용한 인간 행동 분류)

  • Nindam, Somsawut;Manmai, Thong-oon;Sung, Thaileang;Wu, Jiahua;Lee, Hyo Jong
    • Proceedings of the Korea Information Processing Society Conference / 2022.11a / pp.478-480 / 2022
  • This paper studies human activity image classification using deep transfer learning techniques focused on the Inception convolutional neural network (InceptionV3) model. For this, we used UCF-101 public datasets containing a group of students' behaviors in mathematics classrooms at a school in Thailand. The video dataset contains the Play Sitar, Tai Chi, Walking with Dog, and Student Study (our dataset) classes. The experiment was conducted in three phases. First, image frames are extracted from the video, and each frame is labeled with a tag. Second, the dataset is loaded into InceptionV3 with transfer learning for image classification into four classes. Lastly, we evaluate the model's accuracy using precision, recall, F1-score, and the confusion matrix. The outcomes of the classifications for the public dataset and our dataset are 1) Play Sitar (precision = 1.0, recall = 1.0, F1 = 1.0), 2) Tai Chi (precision = 1.0, recall = 1.0, F1 = 1.0), 3) Walking with Dog (precision = 1.0, recall = 1.0, F1 = 1.0), and 4) Student Study (precision = 1.0, recall = 1.0, F1 = 1.0), respectively. The results show that the overall classification accuracy is 100%, which indicates that the model is powerful for learning UCF-101 and our dataset with high accuracy.
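
Per-class precision and recall of 1.0 for every class correspond to a confusion matrix with no off-diagonal entries. The sketch below derives the per-class values from a four-class matrix; the function name and all counts are hypothetical, since the abstract does not report the matrix itself.

```python
def per_class_precision_recall(matrix, classes):
    """Rows are actual classes, columns are predicted classes."""
    report = {}
    for i, name in enumerate(classes):
        true_positive = matrix[i][i]
        predicted_total = sum(row[i] for row in matrix)  # column sum
        actual_total = sum(matrix[i])                    # row sum
        precision = true_positive / predicted_total if predicted_total else 0.0
        recall = true_positive / actual_total if actual_total else 0.0
        report[name] = (precision, recall)
    return report

classes = ["Play Sitar", "Tai Chi", "Walking with Dog", "Student Study"]

# Hypothetical counts: a purely diagonal matrix gives 1.0 everywhere.
perfect = [[25, 0, 0, 0], [0, 25, 0, 0], [0, 0, 25, 0], [0, 0, 0, 25]]
# Hypothetical counts: one Play Sitar frame mistaken for Tai Chi.
one_error = [[24, 1, 0, 0], [0, 25, 0, 0], [0, 0, 25, 0], [0, 0, 0, 25]]

print(per_class_precision_recall(perfect, classes))
print(per_class_precision_recall(one_error, classes))
```

In the second matrix a single misclassified frame lowers the recall of Play Sitar to 0.96 and the precision of Tai Chi to 25/26, while the other two classes stay at 1.0.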

Development of Deep Learning AI Model and RGB Imagery Analysis Using Pre-sieved Soil (입경 분류된 토양의 RGB 영상 분석 및 딥러닝 기법을 활용한 AI 모델 개발)

  • Kim, Dongseok;Song, Jisu;Jeong, Eunji;Hwang, Hyunjung;Park, Jaesung
    • Journal of The Korean Society of Agricultural Engineers / v.66 no.4 / pp.27-39 / 2024
  • Soil texture is determined by the proportions of sand, silt, and clay within the soil, which influence characteristics such as porosity, water retention capacity, electrical conductivity (EC), and pH. Traditional classification of soil texture requires significant sample preparation including oven drying to remove organic matter and moisture, a process that is both time-consuming and costly. This study aims to explore an alternative method by developing an AI model capable of predicting soil texture from images of pre-sorted soil samples using computer vision and deep learning technologies. Soil samples collected from agricultural fields were pre-processed using sieve analysis and the images of each sample were acquired in a controlled studio environment using a smartphone camera. Color distribution ratios based on RGB values of the images were analyzed using the OpenCV library in Python. A convolutional neural network (CNN) model, built on PyTorch, was enhanced using Digital Image Processing (DIP) techniques and then trained across nine distinct conditions to evaluate its robustness and accuracy. The model has achieved an accuracy of over 80% in classifying the images of pre-sorted soil samples, as validated by the components of the confusion matrix and measurements of the F1 score, demonstrating its potential to replace traditional experimental methods for soil texture classification. By utilizing an easily accessible tool, significant time and cost savings can be expected compared to traditional methods.
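
A color distribution ratio based on RGB values can be read as each channel's share of the total intensity in an image. The sketch below assumes that reading and works on a plain list of pixel tuples rather than on OpenCV arrays; the function name and the three pixels are hypothetical, and the study's exact definition of the ratio may differ.

```python
def rgb_ratios(pixels):
    """Return the (R, G, B) shares of the summed channel intensities."""
    totals = [0, 0, 0]
    for red, green, blue in pixels:
        totals[0] += red
        totals[1] += green
        totals[2] += blue
    grand_total = sum(totals)
    if grand_total == 0:
        return (0.0, 0.0, 0.0)  # an all-black image has no color share
    return tuple(total / grand_total for total in totals)

# Hypothetical pixels from a brownish soil image.
pixels = [(120, 80, 40), (100, 90, 50), (140, 70, 30)]

ratios = rgb_ratios(pixels)
print([round(value, 3) for value in ratios])  # [0.5, 0.333, 0.167]
```

For these pixels the red channel accounts for half of the total intensity, which is the kind of summary that could be compared across soil samples.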