• Title/Summary/Keyword: decision trees

Search Result 312, Processing Time 0.02 seconds

Induction of Decision Tress Using the Threshold Concept (Threshold를 이용한 의사결정나무의 생성)

  • 이후석;김재련
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.21 no.45
    • /
    • pp.57-65
    • /
    • 1998
  • This paper addresses the data classification using the induction of decision trees. A weakness of other techniques of induction of decision trees is that decision trees are too large because they construct decision trees until leaf nodes have a single class. Our study include both overcoming this weakness and constructing decision trees which is small and accurate. First, we construct the decision trees using classification threshold and exception threshold in construction stage. Next, we present two stage pruning method using classification threshold and reduced error pruning in pruning stage. Empirical results show that our method obtain the decision trees which is accurate and small.

  • PDF

Fuaay Decision Tree Induction to Obliquely Partitioning a Feature Space (특징공간을 사선 분할하는 퍼지 결정트리 유도)

  • Lee, Woo-Hang;Lee, Keon-Myung
    • Journal of KIISE:Software and Applications
    • /
    • v.29 no.3
    • /
    • pp.156-166
    • /
    • 2002
  • Decision tree induction is a kind of useful machine learning approach for extracting classification rules from a set of feature-based examples. According to the partitioning style of the feature space, decision trees are categorized into univariate decision trees and multivariate decision trees. Due to observation error, uncertainty, subjective judgment, and so on, real-world data are prone to contain some errors in their feature values. For the purpose of making decision trees robust against such errors, there have been various trials to incorporate fuzzy techniques into decision tree construction. Several researches hove been done on incorporating fuzzy techniques into univariate decision trees. However, for multivariate decision trees, few research has been done in the line of such study. This paper proposes a fuzzy decision tree induction method that builds fuzzy multivariate decision trees named fuzzy oblique decision trees, To show the effectiveness of the proposed method, it also presents some experimental results.

Knowledge Representation Using Decision Trees Constructed Based on Binary Splits

  • Azad, Mohammad
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.10
    • /
    • pp.4007-4024
    • /
    • 2020
  • It is tremendously important to construct decision trees to use as a tool for knowledge representation from a given decision table. However, the usual algorithms may split the decision table based on each value, which is not efficient for numerical attributes. The methodology of this paper is to split the given decision table into binary groups as like the CART algorithm, that uses binary split to work for both categorical and numerical attributes. The difference is that it uses split for each attribute established by the directed acyclic graph in a dynamic programming fashion whereas, the CART uses binary split among all considered attributes in a greedy fashion. The aim of this paper is to study the effect of binary splits in comparison with each value splits when building the decision trees. Such effect can be studied by comparing the number of nodes, local and global misclassification rate among the constructed decision trees based on three proposed algorithms.

A Decision Tree Induction using Genetic Programming with Sequentially Selected Features (순차적으로 선택된 특성과 유전 프로그래밍을 이용한 결정나무)

  • Kim Hyo-Jung;Park Chong-Sun
    • Korean Management Science Review
    • /
    • v.23 no.1
    • /
    • pp.63-74
    • /
    • 2006
  • Decision tree induction algorithm is one of the most widely used methods in classification problems. However, they could be trapped into a local minimum and have no reasonable means to escape from it if tree algorithm uses top-down search algorithm. Further, if irrelevant or redundant features are included in the data set, tree algorithms produces trees that are less accurate than those from the data set with only relevant features. We propose a hybrid algorithm to generate decision tree that uses genetic programming with sequentially selected features. Correlation-based Feature Selection (CFS) method is adopted to find relevant features which are fed to genetic programming sequentially to find optimal trees at each iteration. The new proposed algorithm produce simpler and more understandable decision trees as compared with other decision trees and it is also effective in producing similar or better trees with relatively smaller set of features in the view of cross-validation accuracy.

Decision Tree-Based Feature-Selective Neural Network Model: Case of House Price Estimation (의사결정나무를 활용한 신경망 모형의 입력특성 선택: 주택가격 추정 사례)

  • Yoon Han-Seong
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.19 no.1
    • /
    • pp.109-118
    • /
    • 2023
  • Data-based analysis methods have become used more for estimating or predicting housing prices, and neural network models and decision trees in the field of big data are also widely used more and more. Neural network models are often evaluated to be superior to existing statistical models in terms of estimation or prediction accuracy. However, there is ambiguity in determining the input feature of the input layer of the neural network model, that is, the type and number of input features, and decision trees are sometimes used to overcome these disadvantages. In this paper, we evaluate the existing methods of using decision trees and propose the method of using decision trees to prioritize input feature selection in neural network models. This can be a complementary or combined analysis method of the neural network model and decision tree, and the validity was confirmed by applying the proposed method to house price estimation. Through several comparisons, it has been summarized that the selection of appropriate input characteristics according to priority can increase the estimation power of the model.

A Recursive Partitioning Rule for Binary Decision Trees

  • Kim, Sang-Guin
    • Communications for Statistical Applications and Methods
    • /
    • v.10 no.2
    • /
    • pp.471-478
    • /
    • 2003
  • In this paper, we reconsider the Kolmogorov-Smirnoff distance as a split criterion for binary decision trees and suggest an algorithm to obtain the Kolmogorov-Smirnoff distance more efficiently when the input variable have more than three categories. The Kolmogorov-Smirnoff distance is shown to have the property of exclusive preference. Empirical results, comparing the Kolmogorov-Smirnoff distance to the Gini index, show that the Kolmogorov-Smirnoff distance grows more accurate trees in terms of misclassification rate.

Modeling of Environmental Survey by Decision Trees

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • 한국데이터정보과학회:학술대회논문집
    • /
    • 2004.10a
    • /
    • pp.63-75
    • /
    • 2004
  • The decision tree approach is most useful in classification problems and to divide the search space into rectangular regions. Decision tree algorithms are used extensively for data mining in many domains such as retail target marketing, fraud dection, data reduction and variable screening, category merging, etc. We analyze Gyeongnam social indicator survey data using decision tree techniques for environmental information. We can use these decision tree outputs for environmental preservation and improvement.

  • PDF

Modeling of Environmental Survey by Decision Trees

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Journal of the Korean Data and Information Science Society
    • /
    • v.15 no.4
    • /
    • pp.759-771
    • /
    • 2004
  • The decision tree approach is most useful in classification problems and to divide the search space into rectangular regions. Decision tree algorithms are used extensively for data mining in many domains such as retail target marketing, fraud dection, data reduction and variable screening, category merging, etc. We analyze Gyeongnam social indicator survey data using decision tree techniques for environmental information. We can use these decision tree outputs for environmental preservation and improvement.

  • PDF

A review of tree-based Bayesian methods

  • Linero, Antonio R.
    • Communications for Statistical Applications and Methods
    • /
    • v.24 no.6
    • /
    • pp.543-559
    • /
    • 2017
  • Tree-based regression and classification ensembles form a standard part of the data-science toolkit. Many commonly used methods take an algorithmic view, proposing greedy methods for constructing decision trees; examples include the classification and regression trees algorithm, boosted decision trees, and random forests. Recent history has seen a surge of interest in Bayesian techniques for constructing decision tree ensembles, with these methods frequently outperforming their algorithmic counterparts. The goal of this article is to survey the landscape surrounding Bayesian decision tree methods, and to discuss recent modeling and computational developments. We provide connections between Bayesian tree-based methods and existing machine learning techniques, and outline several recent theoretical developments establishing frequentist consistency and rates of convergence for the posterior distribution. The methodology we present is applicable for a wide variety of statistical tasks including regression, classification, modeling of count data, and many others. We illustrate the methodology on both simulated and real datasets.

Integrity Assessment for Reinforced Concrete Structures Using Fuzzy Decision Making (퍼지의사결정을 이용한 RC구조물의 건전성평가)

  • 박철수;손용우;이증빈
    • Proceedings of the Computational Structural Engineering Institute Conference
    • /
    • 2002.04a
    • /
    • pp.274-283
    • /
    • 2002
  • This paper presents an efficient models for reinforeced concrete structures using CART-ANFIS(classification and regression tree-adaptive neuro fuzzy inference system). a fuzzy decision tree parttitions the input space of a data set into mutually exclusive regions, each of which is assigned a label, a value, or an action to characterize its data points. Fuzzy decision trees used for classification problems are often called fuzzy classification trees, and each terminal node contains a label that indicates the predicted class of a given feature vector. In the same vein, decision trees used for regression problems are often called fuzzy regression trees, and the terminal node labels may be constants or equations that specify the Predicted output value of a given input vector. Note that CART can select relevant inputs and do tree partitioning of the input space, while ANFIS refines the regression and makes it everywhere continuous and smooth. Thus it can be seen that CART and ANFIS are complementary and their combination constitutes a solid approach to fuzzy modeling.

  • PDF