• Title/Abstract/Keywords: Random forest model

Search results: 574 items (processing time: 0.049 s)

Performance Comparison Analysis of Artificial Intelligence Models for Estimating Remaining Capacity of Lithium-Ion Batteries

  • Kyu-Ha Kim;Byeong-Soo Jung;Sang-Hyun Lee
    • International Journal of Advanced Culture Technology / Vol. 11, No. 3 / pp.310-314 / 2023
  • The purpose of this study is to predict the remaining capacity of lithium-ion batteries and to evaluate prediction performance using five artificial intelligence models: linear regression, decision tree, random forest, neural network, and an ensemble model. Measured data (in Excel format) from the CS2 lithium-ion battery were used, and the prediction accuracy of each model was measured using mean square error, mean absolute error, coefficient of determination, and root mean square error. The Root Mean Square Error (RMSE) was 0.045 for the linear regression model, 0.038 for the decision tree model, 0.034 for the random forest model, 0.032 for the neural network model, and 0.030 for the ensemble model. The ensemble model had the best prediction performance, with the neural network model taking second place. The decision tree and random forest models also performed quite well, while the linear regression model showed poor prediction performance compared to the other models. It was therefore concluded that ensemble and neural network models are the most suitable for predicting the remaining capacity of lithium-ion batteries and should be prioritized in order to improve the efficiency of battery management and energy systems.
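
The four evaluation indicators named in this abstract can be computed as in the minimal sketch below. The function name `regression_metrics` and the sample capacity values are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, RMSE and R^2 for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n          # mean square error
    mae = sum(abs(e) for e in errors) / n         # mean absolute error
    rmse = math.sqrt(mse)                         # root mean square error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot                    # coefficient of determination
    return {"mse": mse, "mae": mae, "rmse": rmse, "r2": r2}

# Hypothetical capacity values (Ah), for illustration only
measured = [1.10, 1.05, 1.00, 0.95]
predicted = [1.08, 1.06, 0.97, 0.96]
metrics = regression_metrics(measured, predicted)
```

Because RMSE is expressed in the same unit as the predicted quantity, it is the indicator the abstract uses to rank the five models; a lower value means a smaller typical error.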

The Development of Biomass Model for Pinus densiflora in Chungnam Region Using Random Effect

  • 표정기;손영모
    • 한국산림과학회지 / Vol. 106, No. 2 / pp.213-218 / 2017
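  • The purpose of this study is to develop a stand age-biomass model for the Chungnam region using a random effect and to evaluate the applicability of the random effect. To develop a biomass model by stand age for Pinus densiflora forests in Chungnam, 30 plots (150 trees) were surveyed in central-region Pinus densiflora stands nationwide, taking stand structure into account, and stand age and biomass data were collected. In the model, the age-biomass relationship of central-region Pinus densiflora was treated as the fixed effect, and differences between regions were set as the random effect. To test the goodness of fit of the model with the random effect, the Akaike Information Criterion (AIC) was consulted, and the variance-covariance matrix and error term arising from regional differences were estimated. The estimated covariance was -1.0022 and the error term was 0.6240; the AIC of the random-effect model using the variance-covariance matrix was 377.7, which did not differ markedly from previous studies. This result is judged to reflect the incorporation of the random effect of categorical data into model development. The results of this study can be applied, through random effects, to biomass model research that has so far been developed only for limited regions.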

Developing a Pedestrian Satisfaction Prediction Model Based on Machine Learning Algorithms

  • 이제승;이현희
    • 국토계획 / Vol. 54, No. 3 / pp.106-118 / 2019
  • To develop a pedestrian navigation service that provides optimal pedestrian routes based on pedestrian satisfaction levels, a prediction model is required that can estimate a pedestrian's satisfaction level under given conditions. Thus, the aim of the present study is to develop a pedestrian satisfaction prediction model based on three machine learning algorithms: Logistic Regression, Random Forest, and Artificial Neural Network. The 2009, 2012, 2013, 2014, and 2015 Pedestrian Satisfaction Survey Data in Seoul, Korea are used to train and test the machine learning models. As a result, the Random Forest model shows the best prediction performance among the three (Accuracy: 0.798, Recall: 0.906, Precision: 0.842, F1 Score: 0.873, AUC: 0.795). The Artificial Neural Network ranks second (Accuracy: 0.773, Recall: 0.917, Precision: 0.811, F1 Score: 0.868, AUC: 0.738), and the Logistic Regression model ranks third (Accuracy: 0.764, Recall: 1.000, Precision: 0.764, F1 Score: 0.868, AUC: 0.575). The precision score of the Random Forest model implies that approximately 84.2% of pedestrians may be satisfied if they walk in the areas suggested by the Random Forest model.
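
The F1 score reported in this abstract is the harmonic mean of precision and recall. The short sketch below recomputes it from the Random Forest precision and recall quoted above; the helper name `f1_score` is an illustrative choice, not code from the study.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall of the Random Forest model, as quoted in the abstract
rf_f1 = f1_score(0.842, 0.906)
print(round(rf_f1, 3))  # 0.873
```

The harmonic mean is pulled toward the smaller of the two inputs, so a model cannot reach a high F1 by maximizing recall alone.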

A Mixed-effects Height-Diameter Model for Pinus densiflora Trees in Gangwon Province, Korea

  • Lee, Young Jin;Coble, Dean W.;Pyo, Jung Kee;Kim, Sung Ho;Lee, Woo Kyun;Choi, Jung Kee
    • 한국산림과학회지 / Vol. 98, No. 2 / pp.178-182 / 2009
  • A new mixed-effects model was developed that predicts individual-tree total height for Pinus densiflora trees in Gangwon province as a function of individual-tree diameter (cm). The mixed-effects model contains two random-effects parameters. Maximum likelihood estimation was used to fit the model to 560 height-diameter observations of individual trees measured throughout Gangwon province in 2007 as part of the National Forest Inventory Program in Korea. The new model is an improvement over fixed-effects models because it can be calibrated to a local area, such as an inventory plot or individual stand. The new model also appears to be an improvement over the Forest Resources Evaluation and Prediction Program for the ten calibration trees used in this study. An example is provided that describes how to estimate the random-effects parameters using ten calibration trees.

Applicability Evaluation of a Mixed Model for the Analysis of Repeated Inventory Data: A Case Study on Quercus variabilis Stands in Gangwon Region

  • 표정기;이상태;서경원;이경재
    • 한국산림과학회지 / Vol. 104, No. 1 / pp.111-116 / 2015
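  • The purpose of this study is to evaluate changes in diameter at breast height (DBH) and height using a mixed model that includes a random effect. DBH and height were surveyed in Quercus variabilis stands in Gangwon Province, and the same stands were re-surveyed three years later. In the mixed model, the DBH-height relationship of Quercus variabilis was the fixed effect, and the differences in DBH and height between the initial and repeated measurements were set as the random effect. To test the goodness of fit of the model with the random effect, the Akaike information criterion (AIC) was consulted, and the variance-covariance matrix and error term arising from the repeated measurements were calculated. The estimated covariance was -0.0291 and the error term was 0.1007. The AIC of the model including the random effect with the variance-covariance matrix (-215.5) was lower than the AIC of the model considering the fixed effect (-154.4). This result was found to reflect the incorporation of the random effect of categorical data into model development. Therefore, the mixed model applied in this study is judged to be usable for developing models from repeated measurement data.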
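
The model comparison in this abstract rests on the AIC, defined as AIC = 2k - 2 ln L for a model with k estimated parameters and maximized likelihood L; the lower value indicates the preferred model. The sketch below applies that definition and compares the two AIC values quoted in the abstract. The function and variable names are illustrative, and the parameter count and log-likelihood in the example call are hypothetical, since the abstract does not report them.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical inputs, only to show the formula at work
example_aic = aic(log_likelihood=-10.0, n_params=3)  # 2*3 - 2*(-10) = 26.0

# AIC values quoted in the abstract
aic_mixed = -215.5   # model including the random effect
aic_fixed = -154.4   # model considering the fixed effect
delta = aic_fixed - aic_mixed  # positive: the mixed model has the lower AIC
preferred = "mixed" if aic_mixed < aic_fixed else "fixed"
```

Only differences in AIC between models fitted to the same data are meaningful, which is why a negative AIC such as -215.5 is not a problem in itself.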

Forest Vertical Structure Mapping from Bi-Seasonal Sentinel-2 Images and UAV-Derived DSM Using Random Forest, Support Vector Machine, and XGBoost

  • Young-Woong Yoon;Hyung-Sup Jung
    • 대한원격탐사학회지 / Vol. 40, No. 2 / pp.123-139 / 2024
  • Forest vertical structure is vital for comprehending ecosystems and biodiversity, in addition to fundamental forest information. Currently, the forest vertical structure is predominantly assessed via an in-situ method, which is not only difficult to apply to inaccessible locations or large areas but also costly and requires substantial human resources. Therefore, mapping systems based on remote sensing data have been actively explored. Recently, research on analyzing and classifying images using machine learning techniques has been actively conducted and applied to map the vertical structure of forests accurately. In this study, Sentinel-2 and digital surface model images were obtained on two different dates separated by approximately one month, and the spectral index and tree height maps were generated separately. Furthermore, according to the acquisition time, the input data were separated into cases 1 and 2, which were then combined to generate case 3. Using these data, forest vertical structure mapping models based on random forest, support vector machine, and extreme gradient boost (XGBoost) were generated. Consequently, nine models were generated, with the XGBoost model in case 3 performing the best, with an average precision of 0.99 and an F1 score of 0.91. We confirmed that generating a forest vertical structure mapping model utilizing bi-seasonal data and an appropriate model can result in an accuracy of 90% or higher.

Rice yield prediction in South Korea by using random forest

  • 김준환;이주석;상완규;신평;조현숙;서명철
    • 한국농림기상학회지 / Vol. 21, No. 2 / pp.75-84 / 2019
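  • The purpose of this study is to predict the national average rice yield of Korea with random forest, using only meteorological factors. Random forest can isolate each predictor variable used in prediction, and the time-series trend isolated in this way showed an abnormal increasing pattern. Because this ultimately degrades predictive ability, it needs to be removed; in this study it was removed using a moving average before prediction. Meteorological and yield data from 1991 to 2005 were used for training, and data from 2006 to 2015 were used for validation. Predictions were quite accurate on the training data but not on the validation data. To analyze the reason, variable importance was calculated separately for the training and validation data and compared, which revealed that the importance of the monthly meteorological data had shifted between the two data sets. This difference arose because the standard transplanting date shifted nationwide between the training and validation periods, changing the rice growing period itself. Therefore, accurate prediction requires data on regional sowing or transplanting dates, and prediction using meteorological data alone appears to be difficult.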
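
The abstract states that a trend was removed with a moving average before prediction but does not give the procedure. The sketch below shows one common way to do this, assuming a trailing window and subtraction of the smoothed trend; the window length, the function names, and the sample yields are all hypothetical and not taken from the study.

```python
def moving_average(values, window):
    """Trailing moving average; early positions use as many points as are available."""
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def detrend(values, window):
    """Subtract the moving-average trend, leaving the year-to-year fluctuation."""
    trend = moving_average(values, window)
    return [v - t for v, t in zip(values, trend)]

# Hypothetical yearly yields, for illustration only
yields = [450, 460, 455, 470, 480, 475, 490]
residuals = detrend(yields, window=3)
```

The residuals, rather than the raw series, would then be the target that a model such as random forest tries to explain from the meteorological inputs.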

Using Mechanical Learning Analysis of Determinants of Housing Sales and Establishment of Forecasting Model

  • 김은미;김상봉;조은서
    • 지적과 국토정보 / Vol. 50, No. 1 / pp.181-200 / 2020
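  • This study applied an OLS model to estimate the determinants of the housing holding period and then compared the predictive power of SVM, Decision Tree, Random Forest, Gradient Boosting, XGBoost, and LightGBM. It differs from previous studies in that the models with the highest predictive power were used as base models for Stacking, an ensemble method, to build a model with even higher predictive power that can capture housing transaction volume in the housing market. The OLS analysis showed that sale profit, housing price, number of household members, and type of residence (detached house, apartment) affect the housing holding period, and a comparison with each machine learning model based on RMSE showed that the machine learning models had higher predictive power. The data were then rebuilt using the influential variables and each machine learning model was applied to compare predictive power; Random Forest showed the best predictive power. In addition, the Random Forest, Decision Tree, Gradient Boosting, and XGBoost models, which had the highest predictive power, were applied as individual models, and Linear, Ridge, and Lasso models were used as meta-models to build the Stacking model. The analysis showed that the RMSE was lowest, at 0.5181, with the Ridge meta-model, yielding the model with the highest predictive power.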

Correlation Analysis of Airline Customer Satisfaction using Random Forest with Deep Neural Network and Support Vector Machine Model

  • Hong, Sang Hoon;Kim, Bumsu;Jung, Yong Gyu
    • International Journal of Internet, Broadcasting and Communication / Vol. 12, No. 4 / pp.26-32 / 2020
  • There is a large amount of airline customer evaluation data, but it is insufficient for predicting customer satisfaction in practice. In particular, verification of the value of the data and development of a customer satisfaction prediction model based on customer evaluation data are generally lacking. In this paper, airline customer satisfaction is analyzed through a correlation analysis experiment on customer evaluation data provided by Google's Kaggle. Accuracy varied across three variable sets: all variables, the top 4 variables, and the top 8 variables with the highest correlation. To build an airline customer satisfaction prediction model, these sets are applied to three classification algorithms (Random Forest, SVM, and DNN) in a classification experiment. The data are divided into training and verification data at a ratio of 7:3. As a result, the DNN model showed the lowest accuracy at 86.4%, followed by the SVM model at 89%, while the Random Forest model showed the highest accuracy at 95.7%.

The Effect of Highland Weather and Soil Information on the Prediction of Chinese Cabbage Weight

  • 권태용;김래용;윤상후
    • 한국환경과학회지 / Vol. 28, No. 8 / pp.701-707 / 2019
  • Highland farming is agriculture that takes place 400 m above sea level and typically involves both low temperatures and long sunshine hours. Most highland Chinese cabbages are harvested in the Gangwon province. The Ubiquitous Sensor Network (USN) has been deployed to observe Chinese cabbage growth because of the lack of installed weather stations in the highlands. Five representative Chinese cabbage cultivation spots were selected for USN and meteorological data collection between 2015 and 2017. The purpose of this study is to develop a weight prediction model for Chinese cabbages using the meteorological and growth data that were collected one week prior. Both a regression model and a random forest model were considered for this study, with the regression assumptions being satisfied. The Root Mean Square Error (RMSE) was used to evaluate the predictive performance of the models. In the regression model, the variables influencing the weight of cabbage were the number of cabbage leaves, wind speed, precipitation, and soil electrical conductivity. In the random forest model, cabbage width, the number of cabbage leaves, soil temperature, precipitation, temperature, soil moisture at a depth of 30 cm, cabbage leaf width, soil electrical conductivity, humidity, and cabbage leaf length were screened. The RMSE of the random forest model was 265.478, lower than that of the regression model (404.493); this is because the random forest model could explain nonlinearity.