• Title/Summary/Keyword: RandomForest


Comparison of Solar Power Generation Forecasting Performance in Daejeon and Busan Based on Preprocessing Methods and Artificial Intelligence Techniques: Using Meteorological Observation and Forecast Data (전처리 방법과 인공지능 모델 차이에 따른 대전과 부산의 태양광 발전량 예측성능 비교: 기상관측자료와 예보자료를 이용하여)

  • Chae-Yeon Shim;Gyeong-Min Baek;Hyun-Su Park;Jong-Yeon Park
    • Atmosphere / v.34 no.2 / pp.177-185 / 2024
  • As global interest in renewable energy grows amid the ongoing climate crisis, there is an increasing need for efficient technologies to manage such resources. This study focuses on the predictive skill of daily solar power generation forecasts using weather observation and forecast data. Meteorological data from the Korea Meteorological Administration and solar power generation data from the Korea Power Exchange were used for the period from January 2017 to May 2023, covering both an inland region (Daejeon) and a coastal region (Busan). Temperature, wind speed, relative humidity, and precipitation were selected as the meteorological variables relevant to solar power prediction. All data were preprocessed by removing their systematic components so that only the residuals were used, and the residuals of the solar data were further processed with weighted adjustments for homoscedasticity. Four models, MLR (Multiple Linear Regression), RF (Random Forest), DNN (Deep Neural Network), and RNN (Recurrent Neural Network), were employed for solar power prediction, and their performance was evaluated using predictions based on observed meteorological data (used as a reference), 1-day-ahead forecast data (fore1), and 2-day-ahead forecast data (fore2). The DNN-based prediction model exhibits the best performance in both regions, and the RNN performs worst. However, MLR and RF perform comparably to the DNN. The differences in performance among the four models are smaller than anticipated, underscoring the pivotal role of fitting models to residuals. This suggests that the preprocessing approach used here, specifically the use of residuals, is likely to play a crucial role in future solar power generation forecasting.
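
The residual-based preprocessing described in this abstract can be illustrated with a small sketch. The abstract does not say how the systematic component or the weights are defined, so the code below assumes, purely for illustration, that the systematic component is the calendar-month mean and that the homoscedasticity adjustment divides each residual by its month's standard deviation. The function names and sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def seasonal_residuals(months, values):
    """Subtract each calendar month's mean (an assumed 'systematic component')."""
    by_month = defaultdict(list)
    for m, v in zip(months, values):
        by_month[m].append(v)
    climatology = {m: mean(vs) for m, vs in by_month.items()}
    return [v - climatology[m] for m, v in zip(months, values)]


def standardize_by_month(months, residuals):
    """Divide residuals by their month's standard deviation (an assumed weighting)."""
    by_month = defaultdict(list)
    for m, r in zip(months, residuals):
        by_month[m].append(r)
    # Fall back to 1.0 when a month has zero spread, to avoid dividing by zero.
    scale = {m: pstdev(rs) or 1.0 for m, rs in by_month.items()}
    return [r / scale[m] for m, r in zip(months, residuals)]


# Hypothetical daily generation values for a winter and a summer month.
months = [1, 1, 1, 7, 7, 7]
power = [10.0, 12.0, 14.0, 40.0, 50.0, 60.0]

res = seasonal_residuals(months, power)
weighted = standardize_by_month(months, res)
print(res)
print(weighted)
```

After the second step, the winter and summer residuals have the same spread, which is the property a regression model fitted on residuals would rely on under these assumptions.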

Gut microbiota derived from fecal microbiota transplantation enhances body weight of Mimas squabs

  • Jing Ren;Yumei Li;Hongyu Ni;Yan Zhang;Puze Zhao;Qingxing Xiao;Xiaoqing Hong;Ziyi Zhang;Yijing Yin;Xiaohui Li;Yonghong Zhang;Yuwei Yang
    • Animal Bioscience / v.37 no.8 / pp.1428-1439 / 2024
  • Objective: Compared to Mimas pigeons, Shiqi pigeons exhibit greater tolerance to coarse feeding because of their abundant gut microbiota. Here, to investigate the potential of utilizing intestinal flora derived from Shiqi pigeons, the intestinal flora and body indices of Mimas squabs were evaluated after fecal microbiota transplantation (FMT) from donors. Methods: A total of 90 one-day-old squabs were randomly divided into a control group (CON), a low-concentration group (LC), and a high-concentration group (HC), which were gavaged with 200 μL of bacterial solution at concentrations of 0, 0.1, and 0.2 g/15 mL, respectively. Results: FMT improved the body weight of Mimas squabs in the HC and LC groups (p<0.01), and 0.1 g/15 mL was the optimal dose. After 16S rRNA sequencing, the abundance of microflora, especially Lactobacillus, Muribaculaceae, and Megasphaera (p<0.05), was markedly greater in the FMT-treated groups than in the CON group. Random forest analysis indicated that the main functions of key microbes involve pathways associated with metabolism, further illustrating their important role in the host body. Conclusion: FMT has been determined to be a viable method for augmenting the weight and intestinal microbiota of squabs, representing a unique avenue for enhancing the economic feasibility of squab breeding.

A Hybrid Multi-Level Feature Selection Framework for prediction of Chronic Disease

  • G.S. Raghavendra;Shanthi Mahesh;M.V.P. Chandrasekhara Rao
    • International Journal of Computer Science & Network Security / v.23 no.12 / pp.101-106 / 2023
  • Chronic illnesses are among the most common serious problems affecting human health. Early diagnosis of chronic diseases can help avoid or mitigate their consequences, potentially decreasing mortality rates. Using machine learning algorithms to identify risk factors is a promising strategy. The issue with existing feature selection approaches is that each method yields a different set of features, which affects model accuracy, and current methods do not perform well on large multidimensional datasets. We introduce a novel model with a feature selection approach that selects optimal features from large multidimensional datasets to provide reliable predictions of chronic illnesses without sacrificing data uniqueness.[1] To ensure the success of the proposed model, we balanced the classes by applying hybrid balanced class sampling methods to the original dataset, and applied data pre-processing and data transformation to provide credible data for model training. We ran and assessed our model on datasets with binary and multivalued classifications, using multiple datasets (Parkinson, arrhythmia, breast cancer, kidney, diabetes). Suitable features are selected using a hybrid feature model consisting of LassoCV, decision tree, random forest, gradient boosting, AdaBoost, and stochastic gradient descent, followed by voting on the attributes that these methods output in common. The accuracy on the original dataset before applying the framework is recorded and compared against the accuracy on the reduced attribute set. The results are shown separately to provide comparisons. Based on the result analysis, we conclude that the proposed model produced higher accuracy on multivalued-class datasets than on binary-class datasets.[1]
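
The voting step in this abstract can be sketched as follows. The abstract does not give the voting rule, so this sketch assumes a simple threshold: a feature is kept when at least `min_votes` selectors picked it. The selector outputs and feature names are hypothetical placeholders, not results from the paper.

```python
from collections import Counter


def vote_features(selections, min_votes):
    """Keep features chosen by at least `min_votes` of the selectors.

    `selections` maps a selector name to the features it picked.
    """
    votes = Counter(
        feature
        for features in selections.values()
        for feature in set(features)  # each selector votes once per feature
    )
    return sorted(f for f, n in votes.items() if n >= min_votes)


# Hypothetical outputs of four of the selectors named in the abstract.
selections = {
    "lasso_cv": ["age", "bmi", "glucose"],
    "decision_tree": ["glucose", "bmi"],
    "random_forest": ["glucose", "age", "insulin"],
    "gradient_boosting": ["glucose", "bmi", "insulin"],
}

kept = vote_features(selections, min_votes=3)
print(kept)
```

Setting `min_votes` to the number of selectors keeps only features that every method agrees on; a lower threshold trades agreement for a larger feature set.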

Correlation between Vocational Training Evaluation Data and Employment Outcomes: A Study on Prediction Approaches through Machine Learning Models (직업훈련생 평가 데이터와 취업 결과의 상관관계: 머신러닝 모델을 통한 예측 방안 연구)

  • Jae-Sung Chun;Il-Young Moon
    • Journal of Practical Engineering Education / v.16 no.3_spc / pp.291-296 / 2024
  • This study analyzed various machine learning models that predict employment outcomes after vocational training using pre-assessment data of disabled vocational trainees. The study selected and utilized the most appropriate machine learning models based on a data set containing various personal characteristics, including trainees' gender, age, and type of disability. Through this analysis, the goal is to improve the employment rate and job satisfaction of disabled trainees using only pre-assessment data. As a result, it presents a universal approach that can be applied not only to people with disabilities, but also to vocational trainees from a variety of backgrounds. This is expected to make an important contribution to the development and implementation of tailored vocational training programs, ultimately helping to achieve better employment outcomes and job satisfaction.

Experimental research on flow regime and transitional criterion of slug to churn-turbulent and churn-turbulent to annular flow in rectangular channels

  • Qingche He;Liang-ming Pan;Luteng Zhang;Wangtao Xu;Meiyue Yan
    • Nuclear Engineering and Technology / v.55 no.11 / pp.3973-3982 / 2023
  • For two-phase flow in rectangular channels, flow regimes such as churn-turbulent and annular flow are significant for physical problems like Countercurrent Flow Limitation (CCFL). In this study, rectangular channels with cross-sections of 4 × 66 mm, 6 × 66 mm, and 8 × 66 mm are adopted to investigate the flow regimes of air-water vertical upward two-phase flow under adiabatic conditions. The gas and liquid superficial velocities are 0 ≤ jg ≤ 20 m/s and 0.25 ≤ jf ≤ 3 m/s, respectively, which covers bubbly to annular flow. The flow regimes are identified by a random forest algorithm, and the flow regime maps are obtained. The results show that the transitional void fraction from slug to churn-turbulent flow fluctuates from 0.47 to 0.58 and is significantly affected by the channel dimensions and flow rate. In addition, the void fraction at the transition points from churn-turbulent (slug) to annular flow is 0.66-0.67, independent of the gap size. Furthermore, a new criterion for the slug to churn-turbulent transition is established in this study, and the criterion for the churn-turbulent (slug) to annular transition is verified by introducing the interfacial force model.

Protecting Accounting Information Systems using Machine Learning Based Intrusion Detection

  • Biswajit Panja
    • International Journal of Computer Science & Network Security / v.24 no.5 / pp.111-118 / 2024
  • In general, a network-based intrusion detection system is designed to detect malicious behavior directed at a network or its resources. The key goal of this paper is to examine network data and identify whether it is normal or anomalous traffic, specifically for accounting information systems. Today, there are a variety of approaches for detecting various forms of network-based intrusion. In this paper, we use supervised machine learning techniques. Classification models are trained on a training dataset, and the trained system is then used to detect intrusions in the testing dataset. The proposed method detects whether the network data is normal or an anomaly, which helps prevent unauthorized activity on the network and the systems under it. Decision Tree and K-Nearest Neighbor classifiers are applied in the proposed model to distinguish abnormal from normal behavior in network traffic data. In addition, Logistic Regression and Support Vector Classification algorithms are used in our model to support the proposed concepts. Furthermore, a feature selection method is used to extract valuable information from the dataset and enhance the efficiency of the proposed approach: a Random Forest algorithm helps the system identify the crucial features and focus on them rather than on all the features. The experimental findings revealed that the suggested method for network intrusion detection has a negligible false alarm rate, with accuracy expected to be between 95% and 100%. Given the high precision rate, this approach can be used to detect network data intrusion and prevent vulnerabilities on the network.
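
The two figures quoted in this abstract, accuracy and false alarm rate, can be computed from a classifier's predictions as in the sketch below. It assumes the common definitions (accuracy as the share of correct predictions, false alarm rate as the share of normal traffic flagged as anomalous); the abstract does not state which definitions the paper uses, and the labels are made up for illustration.

```python
def detection_rates(y_true, y_pred):
    """Return (accuracy, false_alarm_rate) for labels 1 = anomaly, 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal flagged as anomaly
    accuracy = (tp + tn) / len(pairs)
    # False alarm rate: fraction of truly normal samples that raised an alert.
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, false_alarm_rate


# Hypothetical ground truth and predictions for eight traffic records.
y_true = [0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 1, 0, 1, 1, 0, 0]

accuracy, false_alarm_rate = detection_rates(y_true, y_pred)
print(accuracy, false_alarm_rate)
```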

Predictive modeling algorithms for liver metastasis in colorectal cancer: A systematic review of the current literature

  • Isaac Seow-En;Ye Xin Koh;Yun Zhao;Boon Hwee Ang;Ivan En-Howe Tan;Aik Yong Chok;Emile John Kwong Wei Tan;Marianne Kit Har Au
    • Annals of Hepato-Biliary-Pancreatic Surgery / v.28 no.1 / pp.14-24 / 2024
  • This study aims to assess the quality and performance of predictive models for colorectal cancer liver metastasis (CRCLM). A systematic review was performed to identify relevant studies from various databases. Studies that described or validated predictive models for CRCLM were included. The methodological quality of the predictive models was assessed. Model performance was evaluated by the reported area under the receiver operating characteristic curve (AUC). Of the 117 articles screened, seven studies comprising 14 predictive models were included. The distribution of included predictive models was as follows: radiomics (n = 3), logistic regression (n = 3), Cox regression (n = 2), nomogram (n = 3), support vector machine (SVM, n = 2), random forest (n = 2), and convolutional neural network (CNN, n = 2). Age, sex, carcinoembryonic antigen, and tumor staging (T and N stage) were the most frequently used clinicopathological predictors for CRCLM. The mean AUCs ranged from 0.697 to 0.870, with 86% of the models demonstrating clear discriminative ability (AUC > 0.70). A hybrid approach combining clinical and radiomic features with SVM provided the best performance, achieving an AUC of 0.870. The overall risk of bias was identified as high in 71% of the included studies. This review highlights the potential of predictive modeling to accurately predict the occurrence of CRCLM. Integrating clinicopathological and radiomic features with machine learning algorithms demonstrates superior predictive capabilities.
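
The AUC used to compare models in this review can be computed directly from labels and scores. The sketch below uses the pairwise (Mann-Whitney) formulation: the AUC is the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. The labels and scores are illustrative only and are not taken from any reviewed study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = liver metastasis) and model risk scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.7, 0.4]

auc = roc_auc(labels, scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why the review treats values above 0.70 as showing discriminative ability.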

Performance Evaluation of Machine Learning Model for Seismic Response Prediction of Nuclear Power Plant Structures considering Aging deterioration (원전 구조물의 경년열화를 고려한 지진응답예측 기계학습 모델의 성능평가)

  • Kim, Hyun-Su;Kim, Yukyung;Lee, So Yeon;Jang, Jun Su
    • Journal of Korean Association for Spatial Structures / v.24 no.3 / pp.43-51 / 2024
  • The dynamic responses of nuclear power plant structures subjected to earthquake loads should be carefully investigated for safety. Because nuclear power plant structures are usually built of reinforced concrete (R.C.), the aging deterioration of R.C. has a considerable effect on their structural behavior. Therefore, the aging deterioration of R.C. nuclear power plant structures should be considered for accurate prediction of their seismic responses. In this study, a machine learning model for seismic response prediction of a nuclear power plant structure was developed by considering aging deterioration. The OPR-1000 was selected as an example structure for numerical simulation. The OPR-1000 was originally designated as the Korean Standard Nuclear Power Plant (KSNP), and was re-designated as the OPR-1000 in 2005 for foreign sales. A total of 500 artificial ground motions were generated based on the site characteristics of Korea. Elastic modulus, damping ratio, Poisson's ratio, and density were selected to represent material property variation due to aging deterioration. Six machine learning algorithms, namely Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Artificial Neural Networks (ANN), and eXtreme Gradient Boosting (XGBoost), were used to construct the seismic response prediction model. Thirteen intensity measures and four material properties were used as input parameters of the training database. Performance was evaluated using metrics such as root mean square error, mean square error, mean absolute error, and the coefficient of determination. Hyperparameters were optimized through k-fold cross-validation and grid search. The analysis results show that neural networks provide good prediction performance when aging deterioration is considered.
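
The four evaluation metrics named in this abstract have standard definitions, which the sketch below implements in plain Python. The response values are hypothetical and only serve to show how the numbers relate to one another.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, and the coefficient of determination (R^2)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mae,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical simulated and predicted peak responses.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
print(metrics)
```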

Using Machine Learning Techniques to Predict Health-Related Quality of Life Factors in Patients with Hypertension (머신러닝 기법을 활용한 고혈압 환자의 건강 관련 삶의 질 요인 예측)

  • Jae-Hyeok Jeong;Sung-Hyoun Cho
    • Journal of The Korean Society of Integrative Medicine / v.12 no.3 / pp.11-24 / 2024
  • Purpose: This study aims to identify the factors influencing health-related quality of life through machine learning on the general characteristics of patients with hypertension and to provide a basis for related research, such as intervention strategies and management guidelines in the field of physical therapy for health promotion. Methods: Annual data from the second Korean Health Panel (Version 2.0) from 2019 to 2020, conducted jointly by the Korea Health and Social Research Institute and the National Health Insurance Service, were analyzed (Korea Health Panel, 2024). The data used in this study were collected from January to July 2020 using computer-assisted face-to-face interviews. Of the 13,530 household members surveyed, 3,448 had been diagnosed with hypertension by a doctor, and 1,368 of these were selected as the final study participants after removing missing values. Results: In the random forest model, walking (P2) was the most significant factor affecting health-related quality of life, followed by perceived stress (HS1), body mass index (BMIc), total household income (TOTc), subjective health status (SRHc), marital status (Marr), and education level (Edu). Conclusion: To prevent and manage chronic diseases such as hypertension, as well as to provide customized interventions for patients in advanced stages of the disease, research should be conducted in the field of physical therapy to identify influencing factors using machine learning. Based on the findings of this study, we believe that there is a need for additional content that can be utilized in the field of physical therapy to improve the health-related quality of life of patients with hypertension, such as diagnostic assessment and intervention management guidelines for hypertension, and education on perceived stress and subjective health status.

Hybrid machine learning with HHO method for estimating ultimate shear strength of both rectangular and circular RC columns

  • Quang-Viet Vu;Van-Thanh Pham;Dai-Nhan Le;Zhengyi Kong;George Papazafeiropoulos;Viet-Ngoc Pham
    • Steel and Composite Structures / v.52 no.2 / pp.145-163 / 2024
  • This paper presents six novel hybrid machine learning (ML) models that combine support vector machines (SVM), Decision Tree (DT), Random Forest (RF), Gradient Boosting (GB), extreme gradient boosting (XGB), and categorical gradient boosting (CGB) with the Harris Hawks Optimization (HHO) algorithm. These models, namely HHO-SVM, HHO-DT, HHO-RF, HHO-GB, HHO-XGB, and HHO-CGB, are designed to predict the ultimate shear strength of both rectangular and circular reinforced concrete (RC) columns. The prediction models are established using a comprehensive database consisting of 325 experimental data points for rectangular columns and 172 for circular columns. The ML model hyperparameters are optimized through a combination of cross-validation and the HHO. The performance of the hybrid ML models is evaluated and compared using various metrics, ultimately identifying the HHO-CGB model as the top-performing model for predicting the ultimate shear strength of both rectangular and circular RC columns. The mean R-value and mean a20-index are relatively high, reaching 0.991 and 0.959, respectively, while the mean absolute error and root mean square error are low (10.302 kN and 27.954 kN, respectively). Another comparison is conducted with four existing formulas to further validate the efficiency of the proposed HHO-CGB model. The Shapley Additive Explanations method is applied to analyze the contribution of each variable to the output within the HHO-CGB model, providing insights into the local and global influence of variables. The analysis reveals that the depth of the column, length of the column, and axial loading exert the most significant influence on the ultimate shear strength of RC columns. A user-friendly graphical interface tool is then developed based on the HHO-CGB to facilitate practical and cost-effective usage.
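
The a20-index reported in this abstract is less familiar than the other metrics. The sketch below assumes the definition commonly used in structural engineering ML studies: the fraction of samples whose measured-to-predicted ratio lies within ±20% of 1, that is, between 0.80 and 1.20. The abstract itself does not define the index, and the strength values are hypothetical.

```python
def a20_index(measured, predicted):
    """Fraction of samples with measured/predicted ratio in [0.80, 1.20]."""
    within = sum(
        1 for m, p in zip(measured, predicted) if 0.80 <= m / p <= 1.20
    )
    return within / len(measured)


# Hypothetical measured and predicted shear strengths in kN.
measured = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 150.0, 310.0, 390.0]

a20 = a20_index(measured, predicted)
print(a20)
```

Under this definition, the reported value of 0.959 would mean that about 96% of predictions fall within 20% of the measured strength.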