• Title/Summary/Keyword: regression trees


An assessment of machine learning models for slump flow and examining redundant features

  • Unlu, Ramazan
    • Computers and Concrete
    • /
    • v.25 no.6
    • /
    • pp.565-574
    • /
    • 2020
  • Over the years, several machine learning approaches have been proposed and used to build prediction models for the slump flow of high-performance concrete (HPC). Because HPC is a highly complex material, predicting its behavior is an ambitious task, and choosing and applying the correct method remains crucial. Like some other problems, prediction of HPC slump flow suffers from abnormal attributes, which may both affect prediction accuracy and increase variance. In recent years, various studies have sought to optimize the prediction accuracy for HPC slump flow. However, more state-of-the-art regression algorithms can be implemented to create a better model. This study focuses on several methods with different mathematical backgrounds to get the best possible results. Four well-known algorithms, Support Vector Regression, M5P Trees, Random Forest, and MLPReg, are implemented with optimum parameters as base learners. Redundant features are also examined to better understand both how the ingredients influence the prediction models and whether acceptable results can be achieved with only a few components. Based on the findings, the MLPReg algorithm with optimum parameters gives better results than the others in terms of commonly used statistical error metrics. In addition, the chosen algorithms can give rather accurate results using just a few attributes of a slump flow dataset.

Interesting Node Finding Criteria for Regression Trees (회귀의사결정나무에서의 관심노드 찾는 분류 기준법)

  • 이영섭
    • The Korean Journal of Applied Statistics
    • /
    • v.16 no.1
    • /
    • pp.45-53
    • /
    • 2003
  • Regression trees are a decision tree method used to predict a continuous response. The usual splitting criteria in tree growing are based on a compromise in impurity between the left and right child nodes. The new splitting criteria proposed in this paper instead pick the more interesting subset and ignore the other, so they no longer split based on a compromise between child nodes. The resulting tree structure may be unbalanced but plausible. It can find an interesting subset as early as possible and express it by a simple clause. As a result, the tree is highly interpretable at the cost of a small loss in accuracy.
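The contrast between a compromise criterion and a one-sided criterion can be sketched as follows. This is only an illustration: the abstract does not state the paper's actual criteria, so `one_sided_split` (which maximizes the higher child mean) and the `min_size` parameter are hypothetical stand-ins, and the sum of squared errors is assumed as the impurity measure.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean, used here as node impurity."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def candidate_splits(x, y):
    """Yield (threshold, left_y, right_y) at midpoints between distinct sorted x values."""
    pairs = sorted(zip(x, y))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        yield threshold, [v for _, v in pairs[:i]], [v for _, v in pairs[i:]]


def compromise_split(x, y):
    """Conventional criterion: minimize the combined impurity of both children."""
    return min(candidate_splits(x, y), key=lambda s: sse(s[1]) + sse(s[2]))[0]


def one_sided_split(x, y, min_size=2):
    """Hypothetical one-sided criterion: maximize the higher child mean and
    ignore the quality of the other child (not the paper's actual rule)."""
    eligible = (
        s for s in candidate_splits(x, y)
        if len(s[1]) >= min_size and len(s[2]) >= min_size
    )
    return max(eligible, key=lambda s: max(mean(s[1]), mean(s[2])))[0]


x = list(range(1, 10))
y = [1, 1, 1, 1, 5, 5, 5, 9, 9]
print(compromise_split(x, y))  # 4.5: balances impurity across both children
print(one_sided_split(x, y))   # 7.5: isolates the high-response subset first
```

On this toy data the one-sided rule isolates the small high-response group in its first split, giving an unbalanced tree whose interesting node is described by a single clause (x > 7.5).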

Carbon Storage and Uptake by Evergreen Trees for Urban Landscape - For Pinus densiflora and Pinus koraiensis - (도시 상록 조경수의 탄소저장 및 흡수 - 소나무와 잣나무를 대상으로 -)

  • Jo, Hyun-Kil;Kim, Jin-Young;Park, Hye-Mi
    • Korean Journal of Environment and Ecology
    • /
    • v.27 no.5
    • /
    • pp.571-578
    • /
    • 2013
  • This study generated regression models through a direct harvesting method to estimate carbon storage and uptake by Pinus densiflora and Pinus koraiensis, the major evergreen tree species in urban landscapes, and established essential information for quantifying carbon reduction by urban trees. Open-grown landscape trees of each species were sampled to reflect various diameter sizes at a given interval. The study measured the biomass of each part of the sample trees, including the roots, to compute the total carbon storage per tree. Annual carbon uptake per tree was quantified by analyzing radial growth rates of stem samples at breast height. The study then derived easily applicable regression models for estimating carbon storage and uptake per tree for the two species, using diameter at breast height (DBH) as the independent variable. All the regression models showed a good fit, with $r^2$ values higher than 0.98. While carbon storage and uptake by young trees tended to be greater for P. densiflora than for P. koraiensis at the same diameter sizes, the opposite held for mature trees with DBH larger than 20 cm, owing to a difference in growth rates. A tree with a DBH of 25 cm stored 115.6 kg of carbon for P. densiflora and 130.0 kg for P. koraiensis, and annually sequestered 9.4 kg and 14.6 kg, respectively. The study breaks new ground by overcoming a limitation of past studies, which, because of the difficulty of directly cutting and root-digging landscape trees, quantified carbon reduction for the study species by substituting coefficients from forest trees such as biomass expansion factors, ratios of below-ground to above-ground biomass, and diameter growth rates.

Regression Trees with Unbiased Variable Selection (변수선택 편향이 없는 회귀나무를 만들기 위한 알고리즘)

  • 김진흠;김민호
    • The Korean Journal of Applied Statistics
    • /
    • v.17 no.3
    • /
    • pp.459-473
    • /
    • 2004
  • It is well known that the exhaustive search algorithm suggested by Breiman et al. (1984) tends to select variables with relatively many possible splits as the splitting rule. We propose an algorithm that overcomes this variable selection bias and then construct unbiased regression trees based on it. The proposed algorithm runs in two steps: selecting a split variable, and then determining a binary split rule based on that variable. Simulation studies were performed to compare the proposed algorithm with the CART (Classification and Regression Tree) of Breiman et al. (1984) in terms of degree of variable selection bias, variable selection power, and MSE (Mean Squared Error). We also illustrate the proposed algorithm with real data sets.

Model Selection for Tree-Structured Regression

  • Kim, Sung-Ho
    • Journal of the Korean Statistical Society
    • /
    • v.25 no.1
    • /
    • pp.1-24
    • /
    • 1996
  • In selecting a final tree, Breiman, Friedman, Olshen, and Stone (1984) compare the prediction risks of a pair of trees, where one contains the other, using the standard error of the prediction risk of the larger one. This paper proposes an approach to selecting a final tree that uses the standard error of the difference between the prediction risks of a pair of trees rather than the standard error of the larger one. This approach is compared with CART's on simulated data from a simple regression model. Asymptotic results for the two approaches are also derived and compared. Both the asymptotic and the simulation results indicate that final trees selected by CART tend to be smaller than desired.
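The two selection rules can be contrasted with a minimal sketch. It assumes per-case test losses are available for both trees on the same cases, and it assumes a one-standard-error threshold in both rules; the function names and the exact decision rule are illustrative and not taken from the paper.

```python
from math import sqrt
from statistics import mean, stdev


def risk_and_se(losses):
    """Estimated risk (mean loss) and its standard error."""
    return mean(losses), stdev(losses) / sqrt(len(losses))


def select_by_larger_tree_se(loss_small, loss_large):
    """CART-style rule: keep the smaller tree if its risk is within one
    standard error of the larger tree's risk (SE taken from the larger tree)."""
    risk_small, _ = risk_and_se(loss_small)
    risk_large, se_large = risk_and_se(loss_large)
    return "small" if risk_small <= risk_large + se_large else "large"


def select_by_difference_se(loss_small, loss_large):
    """Alternative rule: keep the smaller tree only if the mean paired
    difference in loss is within one standard error of that difference."""
    diffs = [s - l for s, l in zip(loss_small, loss_large)]
    mean_diff, se_diff = risk_and_se(diffs)
    return "small" if mean_diff <= se_diff else "large"


# Made-up per-case losses: the smaller tree is consistently a bit worse.
loss_large = [1.0, 4.0, 0.0, 9.0, 1.0, 4.0]
loss_small = [1.5, 4.4, 0.6, 9.5, 1.4, 4.6]
print(select_by_larger_tree_se(loss_small, loss_large))  # small
print(select_by_difference_se(loss_small, loss_large))   # large
```

Because the losses of nested trees on the same cases are strongly correlated, the standard error of the difference is much smaller than that of either risk alone, so a consistent small gain from the larger tree is detected. This matches the abstract's conclusion that the CART rule tends to pick trees smaller than desired.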


Integrity Assessment Models for Bridge Structures Using Fuzzy Decision-Making (퍼지의사결정을 이용한 교량 구조물의 건전성평가 모델)

  • 안영기;김성칠
    • Journal of the Korea Concrete Institute
    • /
    • v.14 no.6
    • /
    • pp.1022-1031
    • /
    • 2002
  • This paper presents efficient models for bridge structures using CART-ANFIS (classification and regression tree-adaptive neuro fuzzy inference system). A fuzzy decision tree partitions the input space of a data set into mutually exclusive regions; each region is assigned a label, a value, or an action to characterize its data points. Fuzzy decision trees used for classification problems are often called fuzzy classification trees, and each terminal node contains a label that indicates the predicted class of a given feature vector. In the same vein, decision trees used for regression problems are often called fuzzy regression trees, and the terminal node labels may be constants or equations that specify the predicted output value of a given input vector. Note that CART can select relevant inputs and partition the input space as a tree, while ANFIS refines the regression and makes it continuous and smooth everywhere. Thus CART and ANFIS are complementary, and their combination constitutes a solid approach to fuzzy modeling.

The Development of Models and the Characteristics for Subway Noise Using the Classification and Regression Trees (CART 분석을 이용한 지하철 소음모형 개발 및 특성 연구)

  • Kim, Tae-Ho;Lee, Jae-Myung;Won, Jai-Mu;Song, In-Suk
    • Journal of the Korean Society for Railway
    • /
    • v.10 no.5
    • /
    • pp.480-486
    • /
    • 2007
  • The subway is an essential mode of public transportation in big cities and is used by many citizens. However, citizens' demands regarding the subway's interior environment have been growing recently, and noise is among the most pressing issues to be solved. In this study, we classified the characteristics of subway noise using classification and regression trees (CART), based on noise level data from Line No. 5 in Seoul. We then developed models for the effect of subway noise and analyzed its characteristics through them. The results show that the type of geometric design and operational factors need to be considered when addressing subway noise, because the factors that influence subway noise differ by geometry type and operational conditions.

The Effect of Urban Trees on Residential Solar Energy Potential (도심 수목이 분산형 주거 태양광에너지 잠재량에 미치는 영향)

  • Ko, Yekang
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • v.42 no.1
    • /
    • pp.41-49
    • /
    • 2014
  • This study spatially assesses the impact of trees on residential rooftop solar energy potential using urban three-dimensional models derived from Light Detection and Ranging (LiDAR) data in San Francisco, California. In recent years, on-site solar energy generation in cities has become an essential item in municipal climate action plans. However, it can be limited by the surrounding environment, such as shade from topography, buildings, and trees. Of these effects, the impact of trees on rooftop photovoltaics (PVs) requires careful attention, because improper siting of solar panels without considering trees can result in inefficient solar energy generation, tree removal, and/or increased building energy demand and urban heat island effect. Using ArcMap 9.3.1, we calculated the incoming annual solar radiation on individual rooftops in San Francisco and the reduction in insolation caused by trees. Furthermore, we performed a multiple regression analysis to see which attributes of trees in a neighborhood (tree density, tree heights, and the variance of tree heights) affect rooftop insolation. The results show that the annual total residential rooftop insolation in San Francisco is 18,326,671 MWh and the annual total insolation loss caused by trees is 326,406 MWh, which is about 1.78%. The annual insolation shows a wide range of values, from $34.4kWh/m^2/year$ to $1,348.4kWh/m^2/year$. The results spatially map the locations that show various levels of impact from trees. The multiple regression shows that tree density, average tree height, and the variation of tree heights in a neighborhood have statistically significant effects on rooftop solar potential. The results can be linked to municipal energy planning in order to manage potential conflicts as cities with low to medium population density begin implementing on-site solar energy generation. Rooftop solar energy generation makes the best contribution towards achieving sustainability when PVs are optimally located while pursuing the preservation of urban trees.
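The reported share of insolation lost to trees can be checked directly from the two totals quoted in the abstract. The variable names below are illustrative; only the two MWh figures come from the source.

```python
# Annual totals quoted in the abstract, in MWh.
total_rooftop_insolation_mwh = 18_326_671
loss_from_trees_mwh = 326_406

# Share of total rooftop insolation lost to tree shading, as a percentage.
share_percent = 100 * loss_from_trees_mwh / total_rooftop_insolation_mwh
print(f"{share_percent:.2f}%")  # 1.78%
```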

Data-driven approach to machine condition prognosis using least square regression trees

  • Tran, Van Tung;Yang, Bo-Suk;Oh, Myung-Suck
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference
    • /
    • 2007.11a
    • /
    • pp.886-890
    • /
    • 2007
  • Machine fault prognosis techniques have received considerable attention recently because they help reduce unexpected faults and unscheduled maintenance. With these techniques, the working conditions of components, the trend of fault propagation, and the time-to-failure are forecast precisely before they reach the failure thresholds. In this work, we propose an approach that combines the Least Square Regression Tree (LSRT), an extension of the Classification and Regression Tree (CART), with one-step-ahead time-series prediction to forecast the future conditions of machines. In this technique, the number of available observations is first determined using Cao's method, and LSRT is then employed as the prognosis system. The proposed approach is evaluated on real data from a low methane compressor. Furthermore, the predicted results of CART and LSRT are compared to assess accuracy. The predicted results show that LSRT offers potential for machine condition prognosis.
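The one-step-ahead setup can be sketched as a sliding window that turns a series into input/target pairs for a regression model. This is a minimal illustration, not the authors' implementation: the function name is hypothetical, and `embedding_dim` is passed in by hand here, whereas the paper determines the number of past observations with Cao's method.

```python
def one_step_ahead_pairs(series, embedding_dim):
    """Build (inputs, targets) where each input holds the previous
    `embedding_dim` observations and the target is the next observation."""
    if embedding_dim < 1 or embedding_dim >= len(series):
        raise ValueError("embedding_dim must be in [1, len(series) - 1]")
    inputs, targets = [], []
    for t in range(embedding_dim, len(series)):
        inputs.append(list(series[t - embedding_dim:t]))
        targets.append(series[t])
    return inputs, targets


# Toy series standing in for a machine condition indicator.
inputs, targets = one_step_ahead_pairs([1, 2, 3, 4, 5], embedding_dim=2)
print(inputs)   # [[1, 2], [2, 3], [3, 4]]
print(targets)  # [3, 4, 5]
```

A regression tree (LSRT or CART in the paper) would then be fitted on these pairs and applied to the most recent window to predict the next value.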
