• Title/Summary/Keyword: Random Forest, RF

Research on the Lesion Classification by Radiomics in Laryngoscopy Image

  • Park, Jun Ha;Kim, Young Jae;Woo, Joo Hyun;Kim, Kwang Gi
    • Journal of Biomedical Engineering Research
    • /
    • v.43 no.5
    • /
    • pp.353-360
    • /
    • 2022
  • Laryngeal disease harms quality of life, and laryngoscopy is critical in identifying causative lesions. This study uses radiomics to extract and analyze quantitative features from lesions in laryngoscopy images, and fits and validates classifiers to find meaningful features. The region of interest is located for lesions not classified by the YOLOv5 model, and features are extracted from it with radiomics. The extracted features are selected through combinations of three feature selectors and three estimator models. Two classification models, Random Forest and Gradient Boosting, were trained and validated on the selected features to identify the meaningful ones. The combination of SFS, LASSO, and RF shows the highest performance, with an accuracy of 0.90 and an AUROC of 0.96. Models using features selected by SFM or RIDGE performed worse than the others. Classification of larynx lesions through radiomics appears effective. However, various feature selection methods should be used, and data loss, such as the loss of color information, should be minimized.
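
As a rough illustration of the two metrics reported above, accuracy and AUROC can be computed from true labels and classifier scores as in the following minimal sketch. The function names, the 0.5 decision threshold, and the toy data are assumptions made for illustration and are not taken from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auroc(y_true, scores):
    """AUROC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = lesion class of interest, 0 = other
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(accuracy(y_true, y_pred))  # 0.75
print(auroc(y_true, scores))     # 0.75
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch; a rank-based formulation would be preferable for large datasets.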

Sentiment Analysis of COVID-19 Vaccination in Saudi Arabia

  • Sawsan Alowa;Lama Alzahrani;Noura Alhakbani;Hend Alrasheed
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.2
    • /
    • pp.13-30
    • /
    • 2023
  • Since the COVID-19 vaccine became available, people have been sharing their opinions on social media about getting vaccinated, causing discussions of the vaccine to trend on Twitter alongside certain events, making the website a rich data source. This paper explores people's perceptions regarding the COVID-19 vaccine during certain events and how these events influenced public opinion about the vaccine. The data consisted of tweets sent during seven important events that were gathered within 14 days of the first announcement of each event. These data represent people's reactions to these events without including irrelevant tweets. The study targeted tweets sent in Arabic from users located in Saudi Arabia. The data were classified as positive, negative, or neutral in tone. Four classifiers were used: support vector machine (SVM), naïve Bayes (NB), logistic regression (LOGR), and random forest (RF), in addition to a deep learning model using BiLSTM. The results showed that the SVM achieved the highest accuracy, at 91%. Overall perceptions about the COVID-19 vaccine were 54% negative, 36% neutral, and 10% positive.

Form-finding of lifting self-forming GFRP elastic gridshells based on machine learning interpretability methods

  • Soheila, Kookalani;Sandy, Nyunn;Sheng, Xiang
    • Structural Engineering and Mechanics
    • /
    • v.84 no.5
    • /
    • pp.605-618
    • /
    • 2022
  • Glass fiber reinforced polymer (GFRP) elastic gridshells consist of long continuous GFRP tubes that form elastic deformations. In this paper, a method for the form-finding of gridshell structures is presented based on interpretable machine learning (ML) approaches. A comparative study is conducted on several ML algorithms, including support vector regression (SVR), K-nearest neighbors (KNN), decision tree (DT), random forest (RF), AdaBoost, XGBoost, category boosting (CatBoost), and light gradient boosting machine (LightGBM). A numerical example is presented using a standard double-hump gridshell considering two characteristics of deformation as objective functions. The combination of the grid search approach and k-fold cross-validation (CV) is implemented for fine-tuning the parameters of ML models. The results of the comparative study indicate that the LightGBM model presents the highest prediction accuracy. Finally, interpretable ML approaches, including Shapley additive explanations (SHAP), partial dependence plot (PDP), and accumulated local effects (ALE), are applied to explain the predictions of the ML model, since it is essential to understand the effect of various values of input parameters on objective functions. As a result of interpretability approaches, an optimum gridshell structure is obtained and new opportunities are verified for form-finding investigation of GFRP elastic gridshells during lifting construction.
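
The grid search with k-fold cross-validation mentioned above can be sketched as follows. This is a minimal illustration only: it uses a one-dimensional k-nearest-neighbors regressor (one of the compared algorithm families) as a stand-in, and the function names, the candidate grid, and the toy data are all assumptions rather than the authors' setup.

```python
def kfold_indices(n_samples, k):
    """Split range(n_samples) into k interleaved folds (no shuffling)."""
    return [list(range(i, n_samples, k)) for i in range(k)]


def knn_predict(train_x, train_y, x, n_neighbors):
    """Predict y at x as the mean target of the nearest training points."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    nearest = order[:n_neighbors]
    return sum(train_y[i] for i in nearest) / len(nearest)


def cv_mse(xs, ys, n_neighbors, k):
    """Mean squared error of k-NN regression, averaged over k folds."""
    fold_errors = []
    for test_idx in kfold_indices(len(xs), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        errors = [
            (knn_predict(train_x, train_y, xs[i], n_neighbors) - ys[i]) ** 2
            for i in test_idx
        ]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / len(fold_errors)


def grid_search(xs, ys, grid, k=5):
    """Return the grid value with the lowest cross-validated MSE, plus all scores."""
    scores = {n: cv_mse(xs, ys, n, k) for n in grid}
    best = min(scores, key=scores.get)
    return best, scores


# Toy data (hypothetical): a noiseless quadratic response
xs = [i / 10 for i in range(30)]
ys = [x * x for x in xs]
best_n, scores = grid_search(xs, ys, grid=[1, 2, 4, 8], k=5)
print(best_n, scores[best_n])
```

Each candidate value is scored only on folds held out from fitting, so the selected value reflects out-of-sample error rather than training fit.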

Development and Evaluation of Machine Learning-based Prediction Models for Wastewater Treatment Plant

  • Kyu Dae Shim;Hyo Sang Kim;Geun Soo Chang;Dong Kyun Kim;Young Mo Kim
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2023.05a
    • /
    • pp.499-499
    • /
    • 2023
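  • With recent improvements in computer performance and the development of new machine learning algorithms, researchers in many fields are carrying out a wide range of studies that use them, and for wastewater treatment facilities, research using machine learning is accelerating as vast amounts of operational data accumulate. Existing physical models of wastewater treatment plants have tended to be inaccurate because various assumptions are made about the influencing factors applied. To address this problem, prior studies have sought to improve prediction accuracy by using operational data collected at the plants together with machine learning-based prediction models. Using operational data that sensors installed on the site of wastewater treatment plant A store in real time on the central control room server, we applied various machine learning models such as NN (Neural Network), SVM (Support Vector Machine), and RF (Random Forest), and sought insight into which model performs best on wastewater treatment plant operational data. In this study, several machine learning-based prediction models were developed for plant A, and the machine learning models could be optimized by comparing the prediction accuracy of each model. The significance of the study is that, if its results are used to optimize wastewater treatment plant prediction models, machine learning-based prediction models for wastewater treatment plants can be developed in a relatively short time in the future.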

Automated detection of panic disorder based on multimodal physiological signals using machine learning

  • Eun Hye Jang;Kwan Woo Choi;Ah Young Kim;Han Young Yu;Hong Jin Jeon;Sangwon Byun
    • ETRI Journal
    • /
    • v.45 no.1
    • /
    • pp.105-118
    • /
    • 2023
  • We tested the feasibility of automated discrimination of patients with panic disorder (PD) from healthy controls (HCs) based on multimodal physiological responses using machine learning. Electrocardiogram (ECG), electrodermal activity (EDA), respiration (RESP), and peripheral temperature (PT) of the participants were measured during three experimental phases: rest, stress, and recovery. Eleven physiological features were extracted from each phase and used as input data. Logistic regression (LoR), k-nearest neighbor (KNN), support vector machine (SVM), random forest (RF), and multilayer perceptron (MLP) algorithms were implemented with nested cross-validation. Linear regression analysis showed that ECG and PT features obtained in the stress and recovery phases were significant predictors of PD. We achieved the highest accuracy (75.61%) with MLP using all 33 features. With the exception of MLP, applying the significant predictors led to a higher accuracy than using 24 ECG features. These results suggest that combining multimodal physiological signals measured during various states of autonomic arousal has the potential to differentiate patients with PD from HCs.
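
The nested cross-validation mentioned above separates hyperparameter tuning (inner loop) from performance estimation (outer loop). The sketch below shows only the index bookkeeping; the function names and the sample and fold counts are assumptions for illustration and do not come from the paper.

```python
def kfold(indices, k):
    """Yield (train, test) index lists for k interleaved folds of `indices`."""
    for i in range(k):
        test = indices[i::k]
        held_out = set(test)
        train = [j for j in indices if j not in held_out]
        yield train, test


def nested_cv(n_samples, outer_k, inner_k):
    """Yield (outer_train, outer_test, inner_splits) for nested cross-validation.

    Inner splits are drawn only from the outer training set, so the outer
    test fold is never seen during hyperparameter tuning.
    """
    all_idx = list(range(n_samples))
    for outer_train, outer_test in kfold(all_idx, outer_k):
        inner_splits = list(kfold(outer_train, inner_k))
        yield outer_train, outer_test, inner_splits


# Hypothetical sizes: 20 participants, 5 outer folds, 4 inner folds
for outer_train, outer_test, inner_splits in nested_cv(20, outer_k=5, inner_k=4):
    # Tune hyperparameters on inner_splits, refit on outer_train,
    # then score once on outer_test.
    print(len(outer_train), len(outer_test), len(inner_splits))
```

Because tuning never touches the outer test fold, the averaged outer score is a less optimistic estimate than one obtained by tuning and evaluating on the same folds.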

Predicting idiopathic pulmonary fibrosis (IPF) disease in patients using machine approaches

  • Ali, Sikandar;Hussain, Ali;Kim, Hee-Cheol
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2021.05a
    • /
    • pp.144-146
    • /
    • 2021
  • Idiopathic pulmonary fibrosis (IPF) is one of the most dreadful lung diseases and affects lung function unpredictably. No reliable natural history of the disease has been established yet, and it has been very difficult for physicians to diagnose. With the advent of artificial intelligence and its related technologies, this task has become somewhat easier. The aim of this paper is to develop and explore machine learning models for the prediction and diagnosis of this disease. For our study, we obtained an IPF dataset from Haeundae Paik hospital consisting of 2425 patients and 502 features. We applied different data preprocessing techniques to clean the data and make it fit for machine learning. After preprocessing, 18 features were selected for the experiment. In our experiment, we used different machine learning classifiers, namely multilayer perceptron (MLP), support vector machine (SVM), and random forest (RF), and compared the performance of each classifier. The experimental results showed that MLP outperformed all other compared models with 91.24% accuracy.

Identification of Pb-Zn ore under the condition of low count rate detection of slim hole based on PGNAA technology

  • Haolong Huang;Pingkun Cai;Wenbao Jia;Yan Zhang
    • Nuclear Engineering and Technology
    • /
    • v.55 no.5
    • /
    • pp.1708-1717
    • /
    • 2023
  • The grade analysis of lead-zinc ore is the basis for the optimal development and utilization of deposits. In this study, a method combining Prompt Gamma Neutron Activation Analysis (PGNAA) technology and machine learning is proposed for lead-zinc mine borehole logging, which can identify lead-zinc ores of different grades and gangue in the formation, providing real-time grade information qualitatively and semi-quantitatively. Firstly, Monte Carlo simulation is used to obtain a gamma-ray spectrum data set for training and testing machine learning classification algorithms. These spectra are broadened, normalized and separated into inelastic scattering and capture spectra, and then used to fit different classifier models. When the comprehensive grade boundary of high- and low-grade ores is set to 5%, the evaluation metrics calculated by the 5-fold cross-validation show that the SVM (Support Vector Machine), KNN (K-Nearest Neighbor), GNB (Gaussian Naive Bayes) and RF (Random Forest) models can effectively distinguish lead-zinc ore from gangue. At the same time, the GNB model has achieved the optimal accuracy of 91.45% when identifying high- and low-grade ores, and the F1 score for both types of ores is greater than 0.9.
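
The per-class F1 score cited above can be computed as in this minimal sketch, which treats each label in turn as the positive class. The function name and the toy labels are hypothetical and are not data from the study.

```python
def f1_per_class(y_true, y_pred):
    """Return {label: F1} with each label treated in turn as the positive class."""
    result = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the label never occurs
        result[label] = 2 * tp / denom if denom else 0.0
    return result


# Toy labels (hypothetical): "high" and "low" grade ore
y_true = ["high", "high", "high", "low", "low", "low"]
y_pred = ["high", "high", "low", "low", "low", "low"]
print(f1_per_class(y_true, y_pred))
```

Reporting F1 for both classes, as the abstract does, guards against a model that scores well on accuracy by favoring one grade.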

A robust approach in prediction of RCFST columns using machine learning algorithm

  • Van-Thanh Pham;Seung-Eock Kim
    • Steel and Composite Structures
    • /
    • v.46 no.2
    • /
    • pp.153-173
    • /
    • 2023
  • Rectangular concrete-filled steel tubular (RCFST) columns, a type of concrete-filled steel tubular (CFST) column, are widely used as compression members of structures because of their advantages. This paper proposes a robust machine learning-based framework for predicting the ultimate compressive strength of RCFST columns under both concentric and eccentric loading. The gradient boosting neural network (GBNN), an efficient and up-to-date ML algorithm, is utilized for developing a predictive model in the proposed framework. A total of 890 experimental data points for RCFST columns, categorized into two datasets for concentric and eccentric compression, were carefully collected for training and testing. The accuracy of the proposed model is demonstrated by comparing its performance with seven state-of-the-art machine learning methods including decision tree (DT), random forest (RF), support vector machines (SVM), deep learning (DL), adaptive boosting (AdaBoost), extreme gradient boosting (XGBoost), and categorical gradient boosting (CatBoost). Four available design codes, namely the European (EC4), American Concrete Institute (ACI), American Institute of Steel Construction (AISC), and Australian/New Zealand (AS/NZS) codes, are referred to in a further comparison. The results demonstrate that the proposed GBNN method is a robust and powerful approach to obtain the ultimate strength of RCFST columns.

A novel method for vehicle load detection in cable-stayed bridge using graph neural network

  • Van-Thanh Pham;Hye-Sook Son;Cheol-Ho Kim;Yun Jang;Seung-Eock Kim
    • Steel and Composite Structures
    • /
    • v.46 no.6
    • /
    • pp.731-744
    • /
    • 2023
  • Vehicle load information plays an important role in operating cable-stayed bridges and ensuring their structural health. In this regard, an efficient and economic method is proposed for vehicle load detection based on the observed cable tension and vehicle position using a graph neural network (GNN). Datasets are first generated using the practical advanced analysis program (PAAP), a robust program for modeling and considering both geometric and material nonlinearities of bridge structures subjected to vehicle load with low computational costs. With the superiority of GNN, the proposed model is demonstrated to precisely capture complex nonlinear correlations between the input features and vehicle load in the output. Four popular machine learning methods including artificial neural network (ANN), decision tree (DT), random forest (RF), and support vector machines (SVM) are referred to in a comparison. A case study of a cable-stayed bridge with the typical truck is considered to evaluate the model's performance. The results demonstrate that the GNN-based model provides high accuracy and efficiency in prediction, with satisfactory correlation coefficients, coefficients of determination, and very small errors, and is a novel approach for vehicle load detection with the input data of the existing monitoring system.
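
The regression metrics named above (correlation coefficient, coefficient of determination, and prediction error) can be computed as in the sketch below. The function name, the choice of RMSE as the error measure, and the toy values are assumptions for illustration; the abstract does not state which error measures were used.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return Pearson r, coefficient of determination (R^2), and RMSE.

    Assumes both sequences have non-zero variance.
    """
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)             # total sum of squares
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return {
        "r": cov / math.sqrt(ss_tot * var_p),
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
    }


# Toy vehicle loads (hypothetical, arbitrary units)
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 39.0]
print(regression_metrics(y_true, y_pred))
```

Note that r measures linear association only, while R^2 as defined here also penalizes systematic bias, so the two can differ for the same predictions.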

Study of machine learning model for predicting non-small cell lung cancer metastasis using image texture feature

  • Hye Min Ju;Sang-Keun Woo
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2023.07a
    • /
    • pp.313-315
    • /
    • 2023
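  • In this paper, we built machine learning models that predict metastasis of non-small cell lung cancer using image features extracted from 18F-FDG PET and CT. 18F-FDG is taken up during tumor glucose metabolism, and tracking it is one of the medical imaging techniques used to diagnose cancer cells in patients. Image features extracted from PET and CT images reflect the biological characteristics of the tumor and are quantified values computed from the corresponding ROI. In this study, AUC was calculated and univariate analysis was performed to check whether image texture features from patients' medical images are significant factors in predicting lymph node metastasis. Four image texture features from PET (GLRLM_GLNU, SHAPE_Compacity only for 3D ROI, SHAPE_Volume_vx, SHAPE_Volume_mL) and two from CT (NGLDM_Busyness, TLG_ml) were used to build the models. Accuracy and AUC were calculated to evaluate the performance of each model, and the random forest (RF) model had the highest prediction accuracy. Training the model on the extracted PET and CT image texture features together improved predictive performance compared with using each set separately. The results confirm the potential of the extracted image features as biomarkers of lymph node metastasis, and based on these findings it is expected that treatment strategies for non-small cell lung cancer can be established from individual patients' medical images.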
