• Title/Summary/Keyword: classification trees

CHAID Algorithm by Cube-based Proportional Sampling

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Journal of the Korean Data and Information Science Society
    • /
    • v.15 no.4
    • /
    • pp.803-816
    • /
    • 2004
  • The decision tree approach is most useful in classification problems, where it divides the search space into rectangular regions. Decision tree algorithms are used extensively for data mining in many domains, such as retail target marketing, fraud detection, data reduction and variable screening, and category merging. CHAID uses the chi-squared statistic to determine splits and is an exploratory method for studying the relationship between a dependent variable and a series of predictor variables. In this paper we propose a CHAID algorithm based on cube-based proportional sampling and examine its accuracy and speed as the number of variables changes.

  • PDF
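The chi-squared splitting criterion mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each candidate predictor has already been cross-tabulated against the dependent variable; the function names, predictor names, and counts are hypothetical, and it leaves out the category merging, p-value adjustment, and cube-based sampling of the actual algorithm.

```python
def chi_squared(table):
    """Pearson chi-squared statistic for a contingency table (list of rows).

    Assumes every row and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def best_predictor(tables):
    """Return the predictor whose table has the largest statistic.

    Simplification: raw statistics are only comparable when the tables
    share the same shape; CHAID itself compares adjusted p-values.
    """
    return max(tables, key=lambda name: chi_squared(tables[name]))


# Hypothetical counts: rows are predictor categories, columns are classes.
tables = {
    "region": [[30, 10], [10, 30]],
    "gender": [[20, 20], [20, 20]],
}
print(best_predictor(tables))
```

Here "region" is chosen because its categories separate the two classes, while "gender" is independent of the class and scores zero.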

Random Forest Model for Silicon-to-SPICE Gap and FinFET Design Attribute Identification

  • Won, Hyosig;Shimazu, Katsuhiro
    • IEIE Transactions on Smart Processing and Computing
    • /
    • v.5 no.5
    • /
    • pp.358-365
    • /
    • 2016
  • We propose a novel application of random forest, a machine learning-based general classification algorithm, to analyze the influence of design attributes on the silicon-to-SPICE (S2S) gap. To improve modeling accuracy, we introduce magnification of the learning data as well as randomization of the number of design attributes used for each tree in the forest. From the automatically generated decision trees, we can extract the so-called importance and impact indices, which identify the most significant design attributes determining the S2S gap. We apply the proposed method to actual silicon data and observe that the identified design attributes show a clear trend in the S2S gap. Finally, we identify key 10nm fin-shaped field effect transistor (FinFET) structures that result in a large S2S gap, using measurement data from 10nm test vehicles specialized for model-hardware correlation.

Decision Trees For Multiple Abstraction Level of Data (데이터의 다중 추상화 수준을 위한 결정 트리)

  • 정민아;이도현
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2001.04b
    • /
    • pp.82-84
    • /
    • 2001
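  • Data classification is the task of determining the class membership of objects that have not yet been classified, based on an analysis of groups of already classified objects, that is, the training data. Among the various classification models proposed so far, the decision tree is especially useful for exploratory data mining because its form is easy for humans to understand. In this paper, we introduce the multiple abstraction level problem in decision tree classification and propose a practical method for handling it. After showing that forcing the abstraction levels to be equal cannot solve the multiple abstraction level problem, we present a method that makes use of the existing data while preserving the generalization and specialization relationships among data values.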

  • PDF

Automatic Classification of Web Documents Using the Document Fingerprint Method (문서지문기법을 이용한 웹 문서의 자동 분류)

  • Kim Jin-Hwa
    • Proceedings of the Korean Operations and Management Science Society Conference
    • /
    • 2004.10a
    • /
    • pp.407-429
    • /
    • 2004
  • As the number of documents on the web increases explosively with the rapid development of electronic documents, an efficient system for classifying documents automatically is required. In this study, a new document classification method, called the Document Finger Print Method, is proposed to classify web documents automatically and efficiently. The performance of the proposed method is evaluated along with existing methods such as the keyword-based method, the weighted keyword-based method, neural networks, and decision trees. An experiment is designed with 10 document categories and 59 randomly selected words. The results show that the proposed algorithm has superior classification performance compared to the other methods. The most important advantage of this method is that it works well regardless of the number of words in a document.

  • PDF

State Evaluation of RC Bridge Girders by Inductive Case Learning (귀납적 사례학습에 의한 RC교량 주형의 상태평가)

  • 안승수;김기현;박광림;황진하
    • Proceedings of the Computational Structural Engineering Institute Conference
    • /
    • 2000.10a
    • /
    • pp.159-165
    • /
    • 2000
  • A new state evaluation approach for structural safety is presented in this study. To reduce the subjectivity in the views and judgement of individual experts, which rest on a limited body of knowledge in the cognitive and inferential process of safety assessment, we introduce an inductive learning method from AI. Inductive learning derives generalizations from experience. The decision tree induction algorithm analyzes the domain knowledge, produces rules via decision trees, and then allows us to determine the classification of an object from case examples. The training set for state evaluation is constructed from attributes selected from working reports on RC bridge girders.

  • PDF
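As an illustration of how decision tree induction picks attributes from case examples, the following minimal sketch computes entropy-based information gain. The abstract does not state which splitting criterion the authors used, so the ID3-style criterion here is an assumption, and the attribute names and cases are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    counts = Counter(labels)
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in counts.values())


def information_gain(cases, attribute, target):
    """Reduction in target entropy obtained by splitting on one attribute."""
    base = entropy([case[target] for case in cases])
    groups = {}
    for case in cases:
        groups.setdefault(case[attribute], []).append(case[target])
    remainder = sum(len(g) / len(cases) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical inspection cases; not taken from the paper.
cases = [
    {"crack": "wide", "spalling": "yes", "state": "poor"},
    {"crack": "wide", "spalling": "no", "state": "poor"},
    {"crack": "narrow", "spalling": "yes", "state": "good"},
    {"crack": "narrow", "spalling": "no", "state": "good"},
]
for attribute in ("crack", "spalling"):
    print(attribute, information_gain(cases, attribute, "state"))
```

In this toy set, "crack" separates the two states completely and would be chosen as the root test, while "spalling" carries no information about the state.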

Pruning the Boosting Ensemble of Decision Trees

  • Yoon, Young-Joo;Song, Moon-Sup
    • Communications for Statistical Applications and Methods
    • /
    • v.13 no.2
    • /
    • pp.449-466
    • /
    • 2006
  • We propose to use variable selection methods based on penalized regression for pruning decision tree ensembles. Pruning methods based on LASSO and SCAD are compared with the cluster pruning method. Comparative studies are performed on some artificial datasets and real datasets. According to the results of comparative studies, the proposed methods based on penalized regression reduce the size of boosting ensembles without decreasing accuracy significantly and have better performance than the cluster pruning method. In terms of classification noise, the proposed pruning methods can mitigate the weakness of AdaBoost to some degree.
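The idea of pruning an ensemble with penalized regression can be sketched as follows: regress the response on the predictions of the ensemble members with an L1 penalty and drop the members whose coefficients shrink to zero. This is a minimal coordinate-descent LASSO under that assumption, not the authors' procedure; SCAD is not shown, and all names and data are hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam, the LASSO proximal step."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_prune(member_preds, y, lam, sweeps=100):
    """Fit LASSO weights over ensemble members by coordinate descent.

    member_preds[j][i] is the prediction of member j on sample i.
    Returns the coefficients and the indices of the members kept.
    """
    n, m = len(y), len(member_preds)
    coef = [0.0] * m
    for _ in range(sweeps):
        for j in range(m):
            # Residual with member j's own contribution left out.
            partial = [
                y[i] - sum(coef[k] * member_preds[k][i] for k in range(m) if k != j)
                for i in range(n)
            ]
            rho = sum(member_preds[j][i] * partial[i] for i in range(n)) / n
            z = sum(p * p for p in member_preds[j]) / n
            coef[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    kept = [j for j, c in enumerate(coef) if c != 0.0]
    return coef, kept


# Hypothetical member outputs in {-1, +1}; member 0 matches the labels.
y = [1, -1, 1, -1]
member_preds = [
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, 1, 1, 1],
]
coef, kept = lasso_prune(member_preds, y, lam=0.1)
print(kept)
```

Only the member whose predictions carry information about the labels survives; the other two receive zero weight and would be pruned.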

Study on the ensemble methods with kernel ridge regression

  • Kim, Sun-Hwa;Cho, Dae-Hyeon;Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society
    • /
    • v.23 no.2
    • /
    • pp.375-383
    • /
    • 2012
  • The purpose of ensemble methods is to increase prediction accuracy by combining many classifiers. Recent studies have shown that random forests and forward stagewise regression achieve good accuracy in classification problems. However, they have large prediction errors near the separation boundary because they use decision trees as base learners. In this study, we use kernel ridge regression instead of decision trees in random forests and boosting. The usefulness of the proposed ensemble methods is shown by results on the prostate cancer and Boston housing data.
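One way to picture swapping the base learner is the sketch below, which averages kernel ridge regressors fitted on bootstrap samples. It is a sketch under stated assumptions, not the authors' method: it covers only bootstrap averaging (no boosting variant), uses an RBF kernel on one-dimensional inputs, and the function names, parameters, and toy data are hypothetical.

```python
import random
from math import exp, sin


def rbf(a, b, gamma):
    """Gaussian (RBF) kernel for scalar inputs."""
    return exp(-gamma * (a - b) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_krr(xs, ys, lam, gamma):
    """Kernel ridge regression: solve (K + lam * I) alpha = y."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j], gamma) + (lam if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    alpha = solve(K, ys)
    return lambda x: sum(a * rbf(x, xi, gamma) for a, xi in zip(alpha, xs))


def bagged_krr(xs, ys, n_models, lam=0.1, gamma=1.0, seed=0):
    """Average kernel ridge regressors fitted on bootstrap resamples."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(xs)) for _ in xs]
        models.append(fit_krr([xs[i] for i in idx], [ys[i] for i in idx], lam, gamma))
    return lambda x: sum(m(x) for m in models) / len(models)


# Toy data (hypothetical): samples of a smooth curve.
xs = [i * 0.5 for i in range(13)]
ys = [sin(x) for x in xs]
ensemble = bagged_krr(xs, ys, n_models=25)
print(ensemble(3.0))
```

Because each base learner is a smooth kernel fit rather than a piecewise-constant tree, the averaged prediction varies continuously with the input.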

A Study on the Groves for making enclosed Village in Rural Human Settlement Circle (농촌정주생활권내의 마을 비보숲의 실태에 관한 연구 - 전북 진안군 지역을 중심으로 -)

  • 박재철
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • v.26 no.3
    • /
    • pp.152-161
    • /
    • 1998
  • The purpose of this study was to identify the present state of the remaining enclosing village groves within the rural human settlement area. The study was carried out in the Chinan-Gun region, where traditional elements have been well conserved. A total of 48 village groves were found through site surveys, literature references, and interviews in the Chinan-Gun region. Of these, 27 were classified as complementing village groves according to grove character. The survey showed that many had been partially destroyed by development and human use. The results describe the general characteristics, socio-behavioral characteristics, and forest state and vegetation structure of complementing village groves. Length, area, form, type, motive, location, and the relationships among them were analyzed to identify general characteristics. Facilities, human behavior, and ownership were analyzed to identify socio-behavioral characteristics. Dominant species, appearance rate, height, width, density, and biodiversity of upper trees were analyzed to identify forest state and vegetation structure. Interrelations among the factors were analyzed, and the results were compared with previous studies.

  • PDF

Bi-directional Reflectance Effects on Mangrove Classification of IKONOS Multi-angular Images

  • Rubio, M.C.D.;Nadaoka, K.;Paringit, E.C.
    • Proceedings of the KSRS Conference
    • /
    • 2003.11a
    • /
    • pp.4-6
    • /
    • 2003
  • Optical signals from an object may vary under different conditions caused by differences in light source and sensor position. Knowledge of these variations is necessary to calibrate satellite images and to confirm the influence of sun and sensor angles on the spectral signals from objects. Using high-resolution IKONOS™ multi-angular images, the bi-directional reflectance effects of mangrove trees were observed when three datasets were compared. Bi-directional reflectance may affect the accuracy of interpreting satellite imagery and of obtaining biophysical parameters of mangrove and other vegetation by indirect means.

  • PDF

Boosting neural networks with an application to bankruptcy prediction (부스팅 인공신경망을 활용한 부실예측모형의 성과개선)

  • Kim, Myoung-Jong;Kang, Dae-Ki
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2009.05a
    • /
    • pp.872-875
    • /
    • 2009
  • In a bankruptcy prediction model, accuracy is one of the crucial performance measures because of its significant economic impact. Ensembling is one of the widely used methods for improving the performance of classification and prediction models. Two popular ensemble methods, Bagging and Boosting, have been applied with great success to various machine learning problems, mostly with decision trees as base classifiers. In this paper, we analyze the performance of boosted neural networks for improving on traditional neural networks in bankruptcy prediction tasks. Experimental results on Korean firms indicate that the boosted neural networks show improved performance over traditional neural networks.

  • PDF
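The boosting idea in the abstract above can be sketched with a minimal AdaBoost loop. This sketch assumes binary labels in {-1, +1} and one-dimensional inputs; for brevity the base learner is a threshold stump rather than the neural networks studied in the paper, and the function names and toy data are hypothetical.

```python
from math import exp, log


def train_stump(xs, ys, weights):
    """Pick the threshold and polarity with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x >= threshold else -polarity for x in xs]
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Return a list of (alpha, threshold, polarity) weighted stumps."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * log((1 - err) / err)
        ensemble.append((alpha, threshold, polarity))
        preds = [polarity if x >= threshold else -polarity for x in xs]
        # Raise the weight of misclassified samples, lower the rest.
        weights = [w * exp(-alpha * y * p) for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (p if x >= t else -p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy data (hypothetical) that no single threshold can separate.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
model = adaboost(xs, ys, rounds=3)
print([predict(model, x) for x in xs])
```

A single stump misclassifies at least two of these points, while three reweighted rounds fit the training labels. The same reweighting loop applies when the base learner is a neural network, provided it can be trained on weighted or resampled data.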