http://dx.doi.org/10.5391/JKIIS.2003.13.4.386

A Study on the Node Split in Decision Tree with Multivariate Target Variables  

Kim, Seong-Jun (Department of Industrial and Systems Engineering, Kangnung National University)
Publication Information
Journal of the Korean Institute of Intelligent Systems / v.13, no.4, 2003, pp. 386-390
Abstract
Data mining is the process of discovering patterns useful for decision making from large amounts of data. It has recently received much attention in a wide range of business and engineering fields. Classifying a group into subgroups is one of the most important subjects in data mining. Tree-based methods, known as decision trees, provide an efficient way of finding a classification model. The primary concern in tree learning is to minimize node impurity, which is evaluated using a target variable in the data set. However, there are situations where multiple target variables should be taken into account, such as manufacturing process monitoring, marketing science, and clinical and health analysis. The purpose of this article is to present some methods for measuring node impurity that are applicable to data sets with multivariate target variables. For illustration, a numerical example is given with discussion.
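As a rough illustration of what node impurity over several target variables can look like, the sketch below computes the Gini impurity of each target separately and sums the results, then scores a candidate split by the weighted impurity decrease. The summed form, the function names (`gini`, `multi_gini`, `impurity_decrease`), and the toy data are assumptions made for illustration only; the abstract does not specify the measures proposed in the article.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a single categorical target within one node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def multi_gini(rows):
    """Sum of per-target Gini impurities.

    Assumption: targets are combined by simple summation. This is an
    illustrative choice, not the measure defined in the article.
    Each row is a tuple holding one value per target variable.
    """
    if not rows:
        return 0.0
    # zip(*rows) regroups the rows into one sequence per target variable.
    return sum(gini(column) for column in zip(*rows))


def impurity_decrease(parent, left, right):
    """Weighted impurity decrease achieved by splitting parent into left/right."""
    n = len(parent)
    return (
        multi_gini(parent)
        - len(left) / n * multi_gini(left)
        - len(right) / n * multi_gini(right)
    )


# Hypothetical node with two target variables per observation.
parent = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")]
left, right = parent[:2], parent[2:]
decrease = impurity_decrease(parent, left, right)
```

In this toy node both children are pure on both targets, so the split removes all of the parent's impurity. A weighted sum, or a measure that accounts for correlation between targets, would be a natural variation on the same structure.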
Keywords
Data mining; Decision tree; Classification; Multivariate target variables; Node impurity