• Title/Summary/Keyword: classification trees

A comparative assessment of bagging ensemble models for modeling concrete slump flow

  • Aydogmus, Hacer Yumurtaci;Erdal, Halil Ibrahim;Karakurt, Onur;Namli, Ersin;Turkan, Yusuf S.;Erdal, Hamit
    • Computers and Concrete
    • /
    • v.16 no.5
    • /
    • pp.741-757
    • /
    • 2015
  • In the last decade, several modeling approaches have been proposed and applied to estimate the slump flow of high-performance concrete (HPC). Because HPC is a highly complex material, modeling its behavior is very difficult, so the selection and application of proper modeling methods remains a crucial task. Like many other applications, HPC slump flow prediction suffers from noise, which degrades prediction accuracy and increases variance. In recent years, ensemble learning methods have been introduced to improve prediction accuracy and reduce prediction error. This study investigates the potential of bagging (Bag), one of the most popular ensemble learning methods, for building ensemble models. Four well-known artificial intelligence models (classification and regression trees, CART; support vector machines, SVM; multilayer perceptron, MLP; and radial basis function neural networks, RBF) are deployed as base learners. The bagging ensemble models (Bag-SVM, Bag-RT, Bag-MLP and Bag-RBF) are found to be superior to their base learners (SVM, CART, MLP and RBF), and bagging noticeably improves the prediction accuracy and reduces the prediction error of the proposed predictive models.
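
The bagging idea described in this abstract can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a one-split regression stump as a stand-in base learner (the paper's base learners are CART, SVM, MLP and RBF), and the names `fit_stump` and `bagging_fit` are hypothetical, not taken from the paper.

```python
import random
from statistics import mean

def fit_stump(xs, ys):
    """Fit a one-split regression stump on a single feature (stand-in base learner)."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue  # a split must leave data on both sides
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all x values identical: fall back to the mean
        m = mean(ys)
        return lambda x: m
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def bagging_fit(xs, ys, n_models=25, seed=0):
    """Train n_models stumps on bootstrap samples and average their predictions."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample n rows with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean(m(x) for m in models)
```

Calling `bagging_fit(xs, ys)` returns a predictor function. Averaging over bootstrap replicates is what reduces the variance of an unstable base learner, which is the effect the abstract reports.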

TREE FORM CLASSIFICATION OF OWNER PAYMENT BEHAVIOUR

  • Hanh Tran;David G. Carmichael;Maria C. A. Balatbat
    • International conference on construction engineering and project management
    • /
    • 2011.02a
    • /
    • pp.526-533
    • /
    • 2011
  • Contracting is said to be a high-risk business, and a common cause of business failure is related to cash management. A contractor's financial viability depends heavily on how actual payments from an owner deviate from those defined in the contract. The paper presents a method for contractors to evaluate the punctuality and fullness of owner payments based on historical behaviour. It does this by classifying owners according to their late and incomplete payment practices. A payment profile of an owner, in the form of aging claims submitted by the contractor, is used as a basis for the method's development. Regression trees are constructed based on three predictor variables, namely, the average time to payment following a claim, the total amount ending up being paid within a certain period and the level of variability in claim response times. The Tree package in the publicly available R program is used for building the trees. The analysis is particularly useful for contractors at the pre-tendering stage, when contractors predict the likely payment scenario in an upcoming project. Based on the method, the contractor can decide whether to tender or not tender, or adjust its financial preparations accordingly. The paper is a contribution in risk management applied to claim and dispute resolution practice. It is argued that by contractors having a better understanding of owner payment behaviour, fewer disputes and contractor business failures will occur.

MAPPING OF EUCALYPTUS PLANTATIONS THROUGH TEMPORAL SATELLITE DATA IN CHINA

  • Heo, Joon;Jayakumar, S.;Lee, Jung-Bin
    • Proceedings of the KSRS Conference
    • /
    • 2007.10a
    • /
    • pp.471-474
    • /
    • 2007
  • Eucalyptus plantations play a major role in China's ecology, society and economy, and China is presently the second largest producer of Eucalyptus in the world after Brazil. Eucalyptus was introduced as an ornamental tree in 1890 but later became a commercial crop. During the 1960s, a large amount of Eucalyptus timber was used for railway sleepers, and the trees were also used as shelter belts for rubber trees. It became one of the important national sources of commercial timber once production reached 5 million $m^{3}/yr$. Eucalyptus oil brought in about 20% of foreign exchange. The present study aimed to estimate the Eucalyptus growing area in southern Guangdong, China, in terms of areal extent and changes between 1991 and 2001 using Landsat TM and ETM+ data. An object-based classification technique and subsequent temporal change detection analysis were used to identify the changes between the two dates. The total area was divided into three classes: plantation area with trees, plantation area without trees, and others. Object-oriented classification was found to be more accurate in the present study. An overall increase of about 23.62 $km^{2}$ in the plantation area was noted between 1991 and 2001. Within the study area, the Eucalyptus growing area grew by 7.4% over the 10-year period. This study makes clear that the area under Eucalyptus cultivation in China is growing considerably year by year. However, more elaborate studies covering larger areas are needed to accurately predict the growth of Eucalyptus growing areas.

Generation of Efficient Fuzzy Classification Rules for Intrusion Detection (침입 탐지를 위한 효율적인 퍼지 분류 규칙 생성)

  • Kim, Sung-Eun;Khil, A-Ra;Kim, Myung-Won
    • Journal of KIISE:Software and Applications
    • /
    • v.34 no.6
    • /
    • pp.519-529
    • /
    • 2007
  • In this paper, we investigate the use of fuzzy rules for efficient intrusion detection. We use an evolutionary algorithm to optimize the set of fuzzy rules for intrusion detection by constructing fuzzy decision trees. For efficient execution of the evolutionary algorithm, we use supervised clustering to generate an initial set of membership functions for the fuzzy rules. In our method, both the performance and the complexity of the fuzzy rules (or fuzzy decision trees) are taken into account in fitness evaluation. We also use evaluation with data partitioning, membership degree caching and zero-pruning to reduce the time needed to construct and evaluate fuzzy decision trees. For performance evaluation, we tested our method on the KDD'99 Cup intrusion detection data and confirmed that it outperformed existing methods. Compared with the KDD'99 Cup winner, accuracy increased by 1.54% while cost was reduced by 20.8%.

Analysis of Morphological Characteristics and Variation in Five Populations of Zabelia tyaihyonii in South Korea

  • Nam, Jae Ik;Kim, Mun Seop;Song, Jeong Ho;Seo, Jeong Min;Choi, Go Eun;Kim, Young Ki
    • Journal of People, Plants, and Environment
    • /
    • v.24 no.6
    • /
    • pp.619-628
    • /
    • 2021
  • Background and objective: Native to the limestone zones of the Korean Peninsula, Zabelia tyaihyonii is a popular plant for landscaping. As it is now classified as a rare species, the conservation of its genetic resources is necessary. Methods: In this study, which aimed to understand the morphological variation of Z. tyaihyonii, 18 characteristics of Z. tyaihyonii from five habitats were examined. Results: Of these 18 characteristics, 16 showed significant differences among sites, and the coefficient of variation ranged from 5.4% (for corolla lobe number) to 31.3% (for flower number). Notable variations were observed in the size of the flower and calyx lobe. When the corolla length and calyx lobe length were used as the classification key of Z. tyaihyonii, the sites were divided into those with small, intermediate, and large values. Hair was observed on the filament of all samples, a finding which conflicts with an earlier report. Rather than classifying Z. tyaihyonii into different species on the basis of corolla length (COL) and calyx lobe length (CALL) values, we recommend modifying the species description to incorporate the variation in these characteristics of interest. Principal component analysis results showed that the first principal component was highly correlated with the traits related to the size of the calyx lobe (length: 0.819, width: 0.758), and the second principal component was highly correlated with the traits related to the size of the inflorescence (length: 0.790, width: 0.626). Conclusion: Several notable variations were identified among the characteristics related to inflorescence and calyx lobe. Although the sites are located near one another, either there is little genetic exchange among groups or each group is influenced by microenvironmental factors. In addition, the difference between COL and CALL, which is used as the classification key for Z. tyaihyonii, divided the sites into small, large, and intermediate groups regardless of the sites' geographical distance.

Ecological Studies on Several Forest Communities in Kwangnung. A Study of the Site Index and the ground vegetation of Larch (광릉삼림의 생태학적 연구 낙엽송의 Site Index와 임상식생에 관하여)

  • 차종환
    • Journal of Plant Biology
    • /
    • v.9 no.1_2
    • /
    • pp.7-16
    • /
    • 1966
  • In order to determine the factors related to site quality, 13 areas of larch growing in the Kwangnung forest and its vicinity were examined as sample plots. The sample plots included various site classes as well as age classes. Trees were divided into two groups (major and minor trees). The average height of dominant trees was determined by measuring 5 to 6 dominant trees in each sample plot. The average height of dominant 30-year-old trees was the basis for the site index. A standard yield table for the larch grown in the Kwangnung forest was made from various data, which included age class 5, ranging from 10 to 45 years. This paper investigates the relationships among tree height, site conditions, and ground vegetation. The site indexes of 40 forest class age in the 28-B and 28-G forest classes of the larch associations for ground vegetation had comparatively large differences due to the sampled areas. The relation of the direction of the forest communities to the height and diameter of the trees showed that communities in the northeast and northwest parts had higher values of height and diameter. The diameter and the height of trees were closely related to each other. The smaller the occupied area per tree and the smaller the average distance among trees, the higher the density. The higher the density, the lower the height of the trees. In the ground vegetation of the larch communities, there seems to be a definite correlation between the height of trees and the occupied area per tree or the average distance among the trees. The height of trees and site index of the two larch communities were as follows: 28-B forest class, site index 20.8, height 24.0 m; 28-G forest class, site index 18.4, height 20.9 m. The ground layer was analyzed by the quadrat method (20/20 sq. cm) with an interval of 1 m. Forty quadrats were set up in the larch communities. The community structure of the ground vegetation of the two larch communities was analyzed, and importance values were calculated and evaluated. The ground vegetation under the larch had developed a Burmannii Beauv stratal society under the 28-B and 28-G forest classes. Accordingly, Burmannii Beauv had the highest importance value in the ground vegetation of both larch communities. Therefore, this species could be quantitatively considered the forest indicator species. Of the 34 species in the ground vegetation under the two larch communities, 18 were common to both. The ground vegetation of the 28-B forest class showed more than that of the 28-G forest class. The similarity of the ground vegetation was measured by the Frequency Index Community Coefficient. The differences between the associations were clearly manifested by the ground vegetation tested by Gleason's Frequency Index of Community Coefficient for the analysis of each stratal society of all associations. According to the F.I.C.C., the ground vegetation under the two larch (28-B and 28-G) forest classes showed a higher value. An investigation into the relationship between the physical and chemical properties of soil and site is considered the next step in the study of larch site classification.

Change Prediction for Potential Habitats of Warm-temperate Evergreen Broad-leaved Trees in Korea by Climate Change (기후변화에 따른 한반도 난온대 상록활엽수의 잠재 생육지 변화 예측)

  • Yun, Jong-Hak;Nakao, Katsuhiro;Park, Chan-Ho;Lee, Byoung-Yoon;Oh, Kyoung-Hee
    • Korean Journal of Environment and Ecology
    • /
    • v.25 no.4
    • /
    • pp.590-600
    • /
    • 2011
  • This research predicted the potential habitats of warm-temperate evergreen broad-leaved trees under the current climate (1961~1990) and three climate change scenarios (2081~2100) (CCCMA-A2, CSIRO-A2 and HADCM3-A2) using a classification tree (CT) model. Presence/absence records of warm-temperate evergreen broad-leaved trees were extracted from actual distribution data as the response variable, and four climatic variables (warmth index, WI; minimum temperature of the coldest month, TMC; summer precipitation, PRS; and winter precipitation, PRW) were used as predictor variables. The potential habitat (PH) was predicted to be 28,230 $km^2$ under the current climate and 77,140~89,285 $km^2$ under the three climate change scenarios. The PH masked by land use (PHLU) was predicted to be 8,274 $km^2$ under the current climate, and the proportion of PHLU within PH was 29.3%. Under the three climate change scenarios, the PHLU was predicted to be 35,177~45,170 $km^2$ and increased by 26.9~36.9%. The expansion of warm-temperate evergreen broad-leaved trees under climate change was accompanied by habitat fragmentation caused by land-use restrictions. The increase in habitat of warm-temperate evergreen broad-leaved trees is expected to bring competition with warm-temperate deciduous broadleaf forest and suggests an expansion and northward shift of the warm-temperate evergreen broad-leaved forest zone.

ACCOUNTING FOR IMPORTANCE OF VARIABLES IN MULTI-SENSOR DATA FUSION USING RANDOM FORESTS

  • Park No-Wook;Chi Kwang-Hoon
    • Proceedings of the KSRS Conference
    • /
    • 2005.10a
    • /
    • pp.283-285
    • /
    • 2005
  • To account for the importance of variables in multi-sensor data fusion, random forests are applied to supervised land-cover classification. The random forests approach is a non-parametric ensemble classifier based on CART-like trees. Its distinguishing feature is that the importance of a variable can be estimated by randomly permuting the variable of interest in all the out-of-bag samples for each classifier. Supervised classification with a multi-sensor remote sensing data set including optical and polarimetric SAR data was carried out to illustrate the applicability of random forests. In the experiment, the random forests approach extracted the important variables or bands for land-cover discrimination and showed good performance compared with other non-parametric data fusion algorithms.
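
The permutation-based importance measure mentioned in this abstract can be illustrated with a small sketch. This is a simplification under stated assumptions: a full random forest repeats the procedure per tree on that tree's out-of-bag samples, whereas the code below uses a single, already trained classifier and one held-out set. The function names `accuracy` and `permutation_importance` are hypothetical.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(predict, rows, labels, col, n_repeats=20, seed=0):
    """Mean drop in accuracy when the values of one column are shuffled.

    rows is a list of tuples; col is the index of the variable of interest.
    """
    rng = random.Random(seed)
    base = accuracy(predict, rows, labels)
    drops = []
    for _ in range(n_repeats):
        shuffled = [r[col] for r in rows]
        rng.shuffle(shuffled)  # break the link between this variable and the labels
        permuted = [r[:col] + (v,) + r[col + 1:] for r, v in zip(rows, shuffled)]
        drops.append(base - accuracy(predict, permuted, labels))
    return sum(drops) / n_repeats
```

A variable the classifier never uses gets an importance of exactly zero, because shuffling it cannot change any prediction. A variable the classifier relies on shows a positive drop.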

Performance Comparison of Mahalanobis-Taguchi System and Logistic Regression : A Case Study (마할라노비스-다구치 시스템과 로지스틱 회귀의 성능비교 : 사례연구)

  • Lee, Seung-Hoon;Lim, Geun
    • Journal of Korean Institute of Industrial Engineers
    • /
    • v.39 no.5
    • /
    • pp.393-402
    • /
    • 2013
  • The Mahalanobis-Taguchi System (MTS) is a diagnostic and predictive method for multivariate data. In the MTS, the Mahalanobis space (MS) of the reference group is obtained using the standardized variables of normal data. The Mahalanobis space can be used for multi-class classification. Once this MS is established, the useful set of variables is identified using orthogonal arrays and signal-to-noise ratios to assist in model analysis or diagnosis. Several other techniques have already been used for classification, such as linear discriminant analysis, logistic regression, decision trees, and neural networks. The goal of this case study is to compare the performance of the Mahalanobis-Taguchi System and logistic regression on a data set.
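
As a rough illustration of the Mahalanobis space described above, the sketch below standardizes two variables using the reference ("normal") group and computes a Mahalanobis distance from the correlation of the standardized values. Several things here are assumptions rather than details from the abstract: the restriction to two variables (so the 2x2 correlation matrix can be inverted by hand), the division by the number of variables k, which is a scaling convention commonly used in the MTS literature, and the hypothetical names `build_space` and `scaled_md`.

```python
from statistics import mean, pstdev

def build_space(normal):
    """Build a two-variable Mahalanobis space from reference-group rows (x1, x2)."""
    cols = list(zip(*normal))
    mu = [mean(c) for c in cols]
    sd = [pstdev(c) for c in cols]  # population standard deviation
    z = [[(v - m) / s for v, m, s in zip(row, mu, sd)] for row in normal]
    r = mean(a * b for a, b in z)  # correlation of the standardized variables
    return mu, sd, r

def scaled_md(point, space):
    """Mahalanobis distance of a point, divided by k = 2 variables (assumed scaling)."""
    mu, sd, r = space
    z1, z2 = [(v - m) / s for v, m, s in zip(point, mu, sd)]
    k = 2
    # z^T C^{-1} z for C = [[1, r], [r, 1]], written out explicitly
    return (z1 * z1 - 2 * r * z1 * z2 + z2 * z2) / ((1 - r * r) * k)
```

With this scaling, the average distance over the reference group is 1, so points with a distance well above 1 can be flagged as departing from the normal group. The sketch assumes the two variables are not perfectly correlated and that each has nonzero spread.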

Performance Comparison of Decision Trees of J48 and Reduced-Error Pruning

  • Jin, Hoon;Jung, Yong Gyu
    • International journal of advanced smart convergence
    • /
    • v.5 no.1
    • /
    • pp.30-33
    • /
    • 2016
  • With the advent of big data, data mining is increasingly used in various decision-making fields to extract hidden and meaningful information from large amounts of data. As demand for uncovering the hidden meaning behind data grows exponentially, it becomes more and more important to decide which data mining algorithm to select and how to use it. Several data mining algorithms are mainly used in biology and clinical practice: logistic regression, neural networks, support vector machines, and a variety of statistical techniques. This paper compares the classification performance of two ML algorithms, J48 and REPTree. The performance comparison results confirm which algorithm provides more accurate classification. More accurate prediction is possible with the algorithm suited to the goal of the experiment. Based on this, visually detailed classification and distinction are expected to be relatively difficult.