• Title/Summary/Keyword: RandomForest

Head Pose Estimation with Accumulated Histogram and Random Forest (누적 히스토그램과 랜덤 포레스트를 이용한 머리방향 추정)

  • Mun, Sung Hee;Lee, Chil woo
    • Smart Media Journal / v.5 no.1 / pp.38-43 / 2016
  • As smart environments spread into everyday living spaces, the need for approaches to Human-Computer Interaction (HCI) is increasing. One such approach is head pose estimation. It is related to gaze direction estimation, since the head is closely tied to the eyes by body structure. It is a key factor in identifying a person's intention or target of interest, and is therefore an essential research topic in HCI. In this paper, we propose an approach that estimates head pose as one of several pre-defined directions using a random forest classifier. To extract the rotation information of the input image, we apply the Canny edge detector to the difference image between the input image and an averaged frontal face image. From the resulting binary edge image, we build two accumulated histograms by counting the number of non-zero pixels along each axis. These two accumulated histograms are used as the features of the face image. We trained and tested the random forest classifier on the CAS-PEAL-R1 dataset and obtained 80.6% accuracy.
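
The histogram step in this abstract can be illustrated with a minimal sketch. It assumes that "accumulated histogram" means plain per-row and per-column counts of non-zero pixels; the abstract does not say whether any normalization or binning is applied, and the function names and the tiny `edge_image` are hypothetical.

```python
from typing import List, Tuple


def axis_histograms(edge: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Count non-zero pixels in each row and each column of a binary edge image."""
    if not edge:
        return [], []
    width = len(edge[0])
    row_counts = [sum(1 for value in row if value != 0) for row in edge]
    col_counts = [sum(1 for row in edge if row[c] != 0) for c in range(width)]
    return row_counts, col_counts


def feature_vector(edge: List[List[int]]) -> List[int]:
    """Concatenate the two histograms into one feature vector (an assumption)."""
    row_counts, col_counts = axis_histograms(edge)
    return row_counts + col_counts


# Hypothetical 3x4 binary edge image (0 = background, 255 = edge pixel).
edge_image = [
    [0, 255, 0, 0],
    [0, 255, 255, 0],
    [0, 0, 0, 0],
]

print(feature_vector(edge_image))
```

A vector like this could then be passed to any classifier; the paper itself uses a random forest.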

Analysis of the Feature Importance of Occupational Accidents Occurring at Construction Sites on the Severity of Lost Workdays (건설 현장에서 발생한 업무상 재해가 근로손실일수 심각도에 미치는 특징 중요도 분석)

  • Kang, Kyung-Su;Choi, Jae-Hyun;Ryu, Han-Guk
    • Journal of the Korea Institute of Building Construction / v.21 no.2 / pp.165-174 / 2021
  • The construction industry has the most accidents and fatalities of all industries. Although many efforts have been made to reduce safety accidents in construction, research on the workdays lost before an injured worker returns to the workplace is insufficient. Therefore, this study proposes a model that classifies lost workdays as moderate or severe, and uses the trained random forest model to derive variable importance and analyze the important factors. We analyzed the learning process of the random forest, which is a black-box model, and used the extracted feature importance to identify the variables that affect the severity of lost workdays. The underlying factors were then analyzed through the extracted variables. The purpose of this study is to analyze accident case data from construction sites with a random forest model and to review the variables that have a high impact on lost workdays. In the future, this study can be applied to improve construction safety management and reduce industrial accidents.

Development of Random Forest Model for Sewer-induced Sinkhole Susceptibility (손상 하수관으로 인한 지반함몰의 위험도 평가를 위한 랜덤 포레스트 모델 개발)

  • Kim, Joonyoung;Kang, Jae Mo;Baek, Sung-Ha
    • Journal of the Korean Geotechnical Society / v.37 no.12 / pp.117-125 / 2021
  • Ground subsidence and sinkholes in downtown areas, which threaten the safety of citizens, have been frequently reported. Among the various mechanisms of sinkhole formation, soil erosion through damaged sections of sewer pipes was found to be the main cause in Seoul. In this study, a random forest model for predicting the occurrence of sinkholes caused by damaged sewer pipes was trained using sewer pipe information and the locations of sinkhole cases in Seoul. After optimization of its hyperparameters, the random forest model showed excellent performance in predicting sinkhole occurrence. In addition, among the sewer pipe attributes used as input variables, pipe length, elevation above sea level, slope, and depth of landfill were confirmed to affect the risk of ground subsidence, in that order. The results of this study are expected to be used as basic data for preparing a sinkhole susceptibility map and for establishing an underground cavity exploration plan and a sewer pipe maintenance plan.

Selection of Performance of Bias Correction using TOPSIS method (TOPSIS 방법을 이용한 편의 보정 방법 선정)

  • Song, Young Hoon;Chung, Eun Sung
    • Proceedings of the Korea Water Resources Association Conference / 2019.05a / pp.306-306 / 2019
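  • Rising global temperatures have made research on future climate increasingly important, and a wide range of climate change studies is under way. Future climate studies rely on the simulation outputs of General Circulation Models (GCMs). Because GCM output is gridded, it must be spatially downscaled to the study sites and bias-corrected against local observations. The choice of bias correction method is therefore very important, and results can differ depending on the method used. Studies in Korea and abroad have analyzed and evaluated various downscaling and bias correction techniques; representative bias correction techniques include quantile mapping and random forest. Quantile mapping has performed well in bias-correcting historical GCM simulations, but for the GCM future projection period (2010-2018), random forest, which can quantitatively analyze extreme precipitation, is expected to perform better in bias correction. In this study, for 21 observation stations in Korea, we compared the performance of the bias correction methods in correcting the historical-period (1970-2005) data of four GCMs (GISS, CSIRO, CCSM4, MIROC5) against the precipitation observed at the stations, and compared it with their performance on the GCM future projection period (2010-2018) data. The results were analyzed quantitatively using six evaluation indices, and the bias correction methods were ranked by performance using TOPSIS (Technique for Order of Preference by Similarity to Ideal Solution), a multi-criteria decision-making technique. Quantile mapping was applied in two forms: non-parametric transformation and distribution-derived transformation. Random forest, a machine learning method, was applied alongside them and the results were compared. In addition, because GCM data are provided on a grid, station precipitation must also be converted spatially; this study used inverse distance weighting (IDW).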
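
The TOPSIS ranking step can be sketched as follows. This is a generic textbook form of TOPSIS (vector normalization, weighted ideal and anti-ideal points, relative closeness), not necessarily the exact variant used in the study. The three methods, the two criteria, the weights, and all numbers are hypothetical placeholders; the abstract does not list its six evaluation indices.

```python
import math
from typing import List, Sequence


def topsis(matrix: Sequence[Sequence[float]],
           weights: Sequence[float],
           benefit: Sequence[bool]) -> List[float]:
    """Return the TOPSIS closeness score (0..1, higher is better) per alternative.

    matrix[i][j] is the value of criterion j for alternative i.
    benefit[j] is True if larger is better, False if smaller is better.
    """
    n_crit = len(weights)
    # Vector-normalize each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [
        [weights[j] * row[j] / norms[j] if norms[j] else 0.0 for j in range(n_crit)]
        for row in matrix
    ]
    columns = list(zip(*weighted))
    ideal = [max(col) if benefit[j] else min(col) for j, col in enumerate(columns)]
    anti = [min(col) if benefit[j] else max(col) for j, col in enumerate(columns)]
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)  # distance to the ideal solution
        d_neg = math.dist(row, anti)   # distance to the anti-ideal solution
        total = d_pos + d_neg
        scores.append(d_neg / total if total else 0.0)
    return scores


# Hypothetical example: three methods scored on an error index (lower is better)
# and a correlation index (higher is better). All numbers are made up.
methods = ["method A", "method B", "method C"]
scores = topsis(
    matrix=[[10.0, 0.9], [20.0, 0.8], [30.0, 0.5]],
    weights=[0.5, 0.5],
    benefit=[False, True],
)
ranking = sorted(range(len(methods)), key=lambda i: scores[i], reverse=True)
print([methods[i] for i in ranking])
```

An alternative that is best on every criterion coincides with the ideal point and scores 1, while one that is worst on every criterion scores 0.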

Feature selection and prediction modeling of drug responsiveness in Pharmacogenomics (약물유전체학에서 약물반응 예측모형과 변수선택 방법)

  • Kim, Kyuhwan;Kim, Wonkuk
    • The Korean Journal of Applied Statistics / v.34 no.2 / pp.153-166 / 2021
  • A main goal of pharmacogenomics studies is to predict an individual's drug responsiveness based on high-dimensional genetic variables. Because the number of variables is large, feature selection is required to reduce it. The selected features are then used to construct a predictive model with machine learning algorithms. In the present study, we applied several hybrid feature selection methods, such as combinations of logistic regression, ReliefF, TuRF, random forest, and LASSO, to a next-generation sequencing data set of 400 epilepsy patients. We then applied the selected features to machine learning methods including random forest, gradient boosting, and support vector machine, as well as a stacking ensemble method. Our results showed that the stacking model with a hybrid feature selection of random forest and ReliefF performs better than the other combinations of approaches. Based on a 5-fold cross-validation partition, the mean test accuracy of the best model was 0.727 and its mean test AUC was 0.761. The stacking models also outperformed single machine learning predictive models when using the same selected features.
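
The 5-fold cross-validation partition mentioned in this abstract can be sketched as below. The sample count of 400 comes from the abstract; the shuffling seed, the function names, and the lack of stratification are assumptions, since the abstract does not describe how the folds were drawn.

```python
import random
from typing import List, Tuple


def k_fold_partition(n_samples: int, k: int = 5,
                     seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Shuffle sample indices and return k (train_indices, test_indices) pairs."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train_idx, test_idx))
    return splits


def mean_score(fold_scores: List[float]) -> float:
    """Average a per-fold metric such as test accuracy or AUC."""
    return sum(fold_scores) / len(fold_scores)


splits = k_fold_partition(400, k=5)
print([(len(train), len(test)) for train, test in splits])
```

Each sample appears in exactly one test fold, so a "mean test accuracy" is the average of the five per-fold test scores.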

Activity Type Detection Of Random Forest Model Using UWB Radar And Indoor Environmental Measurement Sensor (UWB 레이더와 실내 환경 측정 센서를 이용한 랜덤 포레스트 모델의 재실활동 유형 감지)

  • Park, Jin Su;Jeong, Ji Seong;Yang, Chul Seung;Lee, Jeong Gi
    • The Journal of the Convergence on Culture Technology / v.8 no.6 / pp.899-904 / 2022
  • As the world becomes an aging society due to a falling birth rate and rising life expectancy, a system for managing the health of the elderly population is needed. To support smart home care services for indoor health management, various studies on occupancy and activity types are being conducted. In this paper, we propose a random forest model for smart home care services that classifies activity type as well as occupancy status from indoor temperature and humidity, CO2, fine dust values, and UWB radar positioning. The experiment measures indoor environment and occupant positioning data at 2-second intervals using three sensors that measure indoor temperature and humidity, CO2, and fine dust, together with two UWB radars. After outliers and missing values are corrected, the measured data are divided into an 80% training set and a 20% test set, and the random forest model is applied to evaluate the list of important variables, accuracy, sensitivity, and specificity.
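
The evaluation described here (an 80/20 split followed by accuracy, sensitivity, and specificity) can be sketched for the binary occupied/unoccupied case. The random shuffle, the function names, and the label vectors are hypothetical; the paper's activity-type task is multi-class, where sensitivity and specificity would be computed per class.

```python
import random
from typing import Dict, List, Sequence, Tuple


def train_test_split(rows: Sequence, test_ratio: float = 0.2,
                     seed: int = 0) -> Tuple[List, List]:
    """Shuffle rows and split them into (train, test) with the given test ratio."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


train_rows, test_rows = train_test_split(range(100))

# Hypothetical labels: 1 = occupied, 0 = unoccupied.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```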

A Study on Predictive Modeling of I-131 Radioactivity Based on Machine Learning (머신러닝 기반 고용량 I-131의 용량 예측 모델에 관한 연구)

  • Yeon-Wook You;Chung-Wun Lee;Jung-Soo Kim
    • Journal of radiological science and technology / v.46 no.2 / pp.131-139 / 2023
  • High-dose I-131 used for the treatment of thyroid cancer causes localized exposure among radiology technologists handling it. There is a delay between the calibration date and when the dose of I-131 is administered to a patient. Therefore, it is necessary to directly measure the radioactivity of the administered dose using a dose calibrator. In this study, we attempted to apply machine learning modeling to measured external dose rates from shielded I-131 in order to predict their radioactivity. External dose rates were measured at 1 m, 0.3 m, and 0.1 m distances from a shielded container with the I-131, with a total of 868 sets of measurements taken. For the modeling process, we utilized the hold-out method to partition the data with a 7:3 ratio (609 for the training set:259 for the test set). For the machine learning algorithms, we chose linear regression, decision tree, random forest and XGBoost. To evaluate the models, we calculated root mean square error (RMSE), mean square error (MSE), and mean absolute error (MAE) to evaluate accuracy and R2 to evaluate explanatory power. Evaluation results are as follows. Linear regression (RMSE 268.15, MSE 71901.87, MAE 231.68, R2 0.92), decision tree (RMSE 108.89, MSE 11856.92, MAE 19.24, R2 0.99), random forest (RMSE 8.89, MSE 79.10, MAE 6.55, R2 0.99), XGBoost (RMSE 10.21, MSE 104.22, MAE 7.68, R2 0.99). The random forest model achieved the highest predictive ability. Improving the model's performance in the future is expected to contribute to lowering exposure among radiology technologists.
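
The four evaluation metrics named in this abstract have standard definitions, sketched below with made-up values (the numbers are illustrative placeholders, not data from the study). Since RMSE is the square root of MSE, the reported pairs can be cross-checked, and they agree up to rounding: for the random forest, 8.89 squared is about 79.0.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float],
                       y_pred: Sequence[float]) -> Dict[str, float]:
    """Compute MSE, RMSE, MAE, and the coefficient of determination R^2."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mae,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical measured vs. predicted values (arbitrary units).
measured = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
result = regression_metrics(measured, predicted)
print(result)
```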

Ship Type Prediction using Random Forest with Limited Ship Information (제한적 선박 정보와 무작위의 숲 분류기를 이용한 선종 예측)

  • Ho-Kun Jeon;Jae Rim Han
    • Proceedings of the Korean Institute of Navigation and Port Research Conference / 2022.06a / pp.106-107 / 2022
  • Identifying the type of surrounding ships is important for navigators and VTS officers, since it lets them estimate the maneuverability and near-future route of those ships. However, this information is frequently not provided because of transmission trouble and seafarers' unfamiliarity with AIS. This study therefore suggests predicting ship types with a Random Forest classifier after preparing training and test datasets that contain ship features and types. AIS data for the Ulsan coast in 2018 was used for this study. The method may help navigators and VTS officers discuss and share their experience of predicting ship types.

A Study on Pre-evaluation of Tree Species Classification Possibility of CAS500-4 Using RapidEye Satellite Imageries (농림위성 활용 수종분류 가능성 평가를 위한 래피드아이 영상 기반 시험 분석)

  • Kwon, Soo-Kyung;Kim, Kyoung-Min;Lim, Joongbin
    • Korean Journal of Remote Sensing / v.37 no.2 / pp.291-304 / 2021
  • Updating a forest type map is essential for sustainable forest resource management and monitoring to cope with climate change and various environmental problems. In response to the need for efficient, wide-area forestry remote sensing, the CAS500-4 (Compact Advanced Satellite 500-4; the agriculture and forestry satellite) project has been confirmed, with launch scheduled for 2023. Before CAS500-4 is launched and put to use, this study aimed to pre-evaluate the possibility of satellite-based tree species classification using RapidEye, which has similar specifications to CAS500-4. The study area was the Chuncheon forest management complex, Gangwon-do. Spectral information was extracted from the growing-season image, and GLCM texture information was derived from the NIR bands of the growing and non-growing seasons. Both types of information were used for classification with the random forest machine learning method. Tree species were classified into nine classes: coniferous trees (Korean red pine, Korean pine, Japanese larch), broad-leaved trees (Mongolian oak, Oriental cork oak, East Asian white birch, Korean Castanea, and other broad-leaved trees), and mixed forest. Finally, the classification accuracy was calculated by comparing the forest type map with the classification results. The accuracy was 39.41% when only spectral information was used and 69.29% when both spectral and texture information were used. In future studies, the applicability of CAS500-4 will be improved by incorporating additional variables that more effectively reflect the ecological characteristics of vegetation.
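
The GLCM (gray-level co-occurrence matrix) texture step can be illustrated with a minimal sketch. It assumes a single horizontal pixel offset, a non-symmetric matrix, and contrast as the texture statistic; the abstract does not state which offsets, quantization levels, or statistics were used, and the tiny `band` below is hypothetical.

```python
from typing import List, Sequence


def glcm(image: Sequence[Sequence[int]], levels: int,
         dx: int = 1, dy: int = 0) -> List[List[float]]:
    """Normalized co-occurrence matrix of gray levels for one pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                counts[image[y][x]][image[ny][nx]] += 1
    total = sum(sum(row) for row in counts)
    return [[c / total for c in row] for row in counts]


def contrast(p: Sequence[Sequence[float]]) -> float:
    """GLCM contrast: sum of p[i][j] * (i - j)^2 over all level pairs."""
    return sum(p[i][j] * (i - j) ** 2
               for i in range(len(p)) for j in range(len(p[i])))


# Hypothetical 3x3 band quantized to two gray levels.
band = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
]
p = glcm(band, levels=2)
print(p, contrast(p))
```

In practice such statistics are computed in a moving window so that every pixel receives a texture value, which can then be stacked with the spectral bands as classifier input.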

Classification of Warm Temperate Vegetations and GIS-based Forest Management System

  • Cho, Sung-Min
    • International journal of advanced smart convergence / v.10 no.1 / pp.216-224 / 2021
  • The aim of this research was to classify forest types at Wando in Jeonnam Province and to develop a warm temperate forest management system using Remote Sensing and GIS. Another emphasis was the analysis of satellite images to compare forest type changes over the 10-year period from 2009 to 2019. The study was carried out using ArcGIS Pro and ENVI. For this research, Landsat satellite images were obtained by means of terrestrial, airborne and satellite imagery. Based on the field survey data, all land uses and forest types were divided into 5 forest classes: evergreen broad-leaved forest, evergreen coniferous forest, deciduous broad-leaved forest, mixed forest, and others. Supervised classification was carried out with a random forest classifier based on manually collected training polygons in the ROI. The accuracy of the forest type and land-cover classifications was assessed against the reference polygons. The comparison of forest changes over the 10-year period showed different vegetation biomass volumes, with a loss of deciduous forests in 2019 that is probably due to the expansion of residential areas and rapid deforestation.