Search | Korea Science

SUPPORT Applications for Classification Trees

Lee, Sang-Bock;Park, Sun-Young
- Journal of the Korean Data and Information Science Society
- v.15 no.3
- pp.565-574
- 2004
Classification tree algorithms, such as CART by Breiman et al. (1984), recursively partition the data space in several steps, with the aim of making the distribution of the class variable as pure as possible within each partition. In the SUPPORT (smoothed and unsmoothed piecewise-polynomial regression trees) method of Chaudhuri et al. (1994), a weighted averaging technique is used to combine piecewise polynomial fits into a smooth one. We focus on applying SUPPORT to a binary class variable. A logistic model is considered in the calculation techniques, and the results show good classification rates compared with other methods such as CART, QUEST, and CHAID.
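
As an illustration of what "pure" means in the abstract above, the following minimal sketch scores a candidate split by the weighted Gini impurity of the two resulting partitions, the criterion commonly associated with CART. It is not the SUPPORT procedure; the function names and the single numeric predictor are assumptions made for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    return 1.0 - sum((k / n) ** 2 for k in Counter(labels).values())


def split_impurity(xs, ys, threshold):
    """Size-weighted Gini impurity after splitting on x <= threshold."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    n = len(ys)
    return sum(len(part) / n * gini(part) for part in (left, right) if part)


def best_threshold(xs, ys):
    """Pick the threshold with the lowest weighted impurity.

    Assumes xs holds at least two distinct values; the maximum is skipped
    because splitting there would leave the right partition empty.
    """
    candidates = sorted(set(xs))[:-1]
    return min(candidates, key=lambda t: split_impurity(xs, ys, t))
```

A recursive partitioning algorithm would apply `best_threshold` to each resulting partition in turn until a stopping rule is met.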

A Study on Risk Evaluation and Classification of Fire Equipments for Certification (소방용품의 강제인증을 위한 위험도평가 및 품목분류에 관한 연구)

Choi, Gi-Heung
- Journal of the Korean Society of Safety
- v.24 no.6
- pp.7-12
- 2009
This study focuses on the classification of fire equipment for certification based on risk evaluation. In general, known statistics on fire equipment-related accidents need to be used for risk evaluation. When statistics are not available, however, the expected frequency and severity of accidents for each item of equipment can be taken into account in evaluating the related risks. Based on the level of inherent risk, each item is then classified into one of three categories for certification. For equipment for which risk evaluation is not possible, product characteristics such as reliability are considered for classification. Once classified, each item is assigned an appropriate certification module.

Selection of markers in the framework of multivariate receiver operating characteristic curve analysis in binary classification

Sameera, G;Vishnu, Vardhan R
- Communications for Statistical Applications and Methods
- v.26 no.2
- pp.79-89
- 2019
Classification models pertaining to receiver operating characteristic (ROC) curve analysis have been extended from the univariate to the multivariate setup by linearly combining the multiple markers available. One such classification model is the multivariate ROC curve analysis. However, in a real scenario not all markers contribute, and those that do not may mask the contribution of other markers in classifying the individuals/objects. This paper addresses this issue by developing an algorithm that helps identify the important markers that are significant and true contributors. The proposed variable selection framework is supported by real datasets and a simulation study, and it is shown to provide insight into each individual marker's significance in providing a classifier rule/linear combination with a good extent of classification.
https://doi.org/10.29220/CSAM.2019.26.2.079
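
To make the idea of a linear combination of markers concrete, the sketch below scores each subject with a weighted sum of its markers and computes the empirical area under the ROC curve (AUC) as the fraction of case/control pairs ranked correctly. It is only an illustration, not the paper's marker selection algorithm; the weights, data, and function names are hypothetical.

```python
def linear_score(markers, weights):
    """Combine one subject's marker values into a single score."""
    return sum(w * m for w, m in zip(weights, markers))


def auc(pos_scores, neg_scores):
    """Empirical AUC: share of (positive, negative) pairs ranked correctly, ties count half."""
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos_scores
        for q in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))


def auc_of_combination(pos, neg, weights):
    """AUC of the linear combination defined by `weights`."""
    return auc(
        [linear_score(row, weights) for row in pos],
        [linear_score(row, weights) for row in neg],
    )


# Hypothetical data: two markers per subject.
pos = [(2.0, 0.3), (1.5, 0.9)]  # cases
neg = [(0.5, 0.8), (1.0, 0.1)]  # controls
print(auc_of_combination(pos, neg, (1.0, 0.0)))  # first marker only
print(auc_of_combination(pos, neg, (0.0, 1.0)))  # second marker only
```

Comparing the AUC obtained with and without a given marker is one simple way to gauge how much that marker contributes to the combined classifier.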

A Decision Tree Algorithm using Genetic Programming

Park, Chongsun;Ko, Young Kyong
- Communications for Statistical Applications and Methods
- v.10 no.3
- pp.845-857
- 2003
We explore the use of genetic programming to evolve decision trees directly for classification problems with both discrete and continuous predictors. We demonstrate that the hypotheses derived by standard algorithms can deviate substantially from the optimum. This deviation is partly due to their top-down style procedures. The performance of the system is measured on a set of real and simulated data sets and compared with the performance of well-known algorithms such as CHAID, CART, C5.0, and QUEST. The proposed algorithm seems to be effective in handling problems caused by the top-down style procedures of existing algorithms.
https://doi.org/10.5351/CKSS.2003.10.3.845

Robust Variable Selection in Classification Tree

Jang Jeong Yee;Jeong Kwang Mo
- Proceedings of the Korean Statistical Society Conference
- 2001.11a
- pp.89-94
- 2001
In this study we focus on variable selection when growing a decision tree. Some of the splitting rules and variable selection algorithms are discussed. We propose a competitive variable selection method based on the Kruskal-Wallis test, which is a nonparametric version of the ANOVA F-test. Through a Monte Carlo study we note that CART has a serious bias in variable selection toward categorical variables with many values, and that QUEST, which uses the F-test, is not very powerful at selecting informative variables under heavy-tailed distributions.
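
The sketch below shows how a rank-based statistic could drive variable selection: it computes the Kruskal-Wallis H statistic of each numeric predictor across the classes and orders the predictors by it. This is a minimal illustration, not the authors' procedure; the rule "pick the largest H", the omission of the tie correction, and all function names are assumptions.

```python
from collections import defaultdict


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of sorted positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(x, labels):
    """Kruskal-Wallis H of predictor x across class labels (no tie correction)."""
    n = len(x)
    ranks = average_ranks(x)
    rank_sums = defaultdict(float)
    counts = defaultdict(int)
    for r, c in zip(ranks, labels):
        rank_sums[c] += r
        counts[c] += 1
    s = sum(rank_sums[c] ** 2 / counts[c] for c in counts)
    return 12.0 / (n * (n + 1)) * s - 3 * (n + 1)


def rank_predictors(columns, labels):
    """Order predictor names by decreasing H (hypothetical selection rule)."""
    scores = {name: kruskal_wallis_h(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

Because H depends only on ranks, a few extreme values in a heavy-tailed predictor cannot dominate the statistic the way they can in an F-test.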

Improving Bagging Predictors

Kim, Hyun-Joong;Chung, Dong-Jun
- Proceedings of the Korean Statistical Society Conference
- 2005.11a
- pp.141-146
- 2005
Ensemble methods are known as some of the most powerful classification tools for improving prediction accuracy. They can also be understood as a ‘perturb and combine’ strategy. Many studies have tried to develop ensemble methods by improving perturbation. In this paper, we propose two new ensemble methods that improve combining, based on the idea of pattern matching. In experiments with simulated data and a real dataset, the proposed ensemble methods performed better than bagging. The proposed ensemble methods gave the most accurate predictions when a pruned tree was used as the base learner.
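
For reference, the bagging baseline mentioned in the abstract can be sketched as follows: the "perturb" step trains each base learner on a bootstrap resample, and the "combine" step takes a plain majority vote. The paper's pattern-matching combiners are not reproduced here; the one-feature stump base learner, the parameter defaults, and the function names are assumptions chosen to keep the example short.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Fit a one-feature threshold classifier by minimizing training errors."""
    best = None
    for t in sorted(set(xs)):
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= t else right) != y for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right


def bagging_fit(xs, ys, n_models=25, seed=0):
    """Perturb: train each stump on a bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Combine: plain majority vote over the base learners."""
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]
```

An alternative combiner would replace only `bagging_predict`, leaving the resampling step untouched, which is the part of the pipeline the abstract says the proposed methods target.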

Multivariate Procedure for Variable Selection and Classification of High Dimensional Heterogeneous Data

Mehmood, Tahir;Rasheed, Zahid
- Communications for Statistical Applications and Methods
- v.22 no.6
- pp.575-587
- 2015
Developments in data collection techniques result in high dimensional data sets, where discrimination is an important and commonly encountered problem that is crucial to resolve when the high dimensional data are heterogeneous (non-common variance-covariance structure for classes). An example of this is classifying microbial habitat preferences based on codon/bi-codon usage. Habitat preference is important to study for evolutionary genetic relationships and may help industry produce specific enzymes. Most classification procedures assume homogeneity (a common variance-covariance structure for all classes), which is not guaranteed in most high dimensional data sets. We introduce regularized elimination in partial least squares coupled with QDA (rePLS-QDA) for parsimonious variable selection and classification of high dimensional heterogeneous data sets; it is based on the recently introduced regularized elimination for variable selection in partial least squares (rePLS) and on quadratic discriminant analysis (QDA), a classification procedure for heterogeneous data. The proposed and existing methods are compared on a simulated data set; in addition, the proposed procedure is applied to classify microbial habitat preferences by their codon/bi-codon usage. Five bacterial habitats (Aquatic, Host Associated, Multiple, Specialized and Terrestrial) are modeled. The classification accuracy for each habitat is satisfactory and ranges from 89.1% to 100% on test data. Interesting codon/bi-codon usages, and their mutual interactions that influence the respective habitat preferences, are identified. The proposed method also produced results that concur with known biological characteristics, which will help researchers better understand the divergence of species.
https://doi.org/10.5351/CSAM.2015.22.6.575

Rule-Based Classification Analysis Using Entropy Distribution (엔트로피 분포를 이용한 규칙기반 분류분석 연구)

Lee, Jung-Jin;Park, Hae-Ki
- Communications for Statistical Applications and Methods
- v.17 no.4
- pp.527-540
- 2010
Rule-based classification analysis is widely used for mining massive data because it is easy to understand and its algorithm is uncomplicated. In this type of analysis, a majority vote of rules or a weighted combination of rules using their supports is frequently used to combine rules. In this paper, we propose a method that combines rules by using the multinomial distribution. An iterative proportional fitting algorithm is used to estimate the multinomial distribution that maximizes entropy subject to constraints on the rules' supports. Simulation experiments show that this method can compete with other well-known classification models in the case of two similar populations.
https://doi.org/10.5351/CKSS.2010.17.4.527
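
The iterative proportional fitting step described in the abstract above can be sketched as follows, under the assumption that each rule's support is the probability of the set of cells the rule covers. Starting from the uniform distribution, each pass rescales the covered and uncovered cells so that one constraint holds exactly; cycling through consistent constraints moves toward the maximum-entropy distribution that satisfies them all. The attributes, rules, support values, and function names are hypothetical and are not taken from the paper.

```python
from itertools import product


def ipf_max_entropy(cells, constraints, n_iter=200):
    """Fit cell probabilities that satisfy support constraints, starting from uniform.

    constraints: list of (covers, target) pairs, where covers(cell) -> bool marks
    the cells a rule covers and target is that rule's support (0 < target < 1).
    """
    p = {c: 1.0 / len(cells) for c in cells}
    for _ in range(n_iter):
        for covers, target in constraints:
            mass = sum(p[c] for c in cells if covers(c))
            for c in cells:
                if covers(c):
                    p[c] *= target / mass  # covered cells sum to target
                else:
                    p[c] *= (1.0 - target) / (1.0 - mass)  # the rest sum to 1 - target
    return p


# Hypothetical example: cells are (a, b, y) with binary attributes a, b and class y.
cells = list(product((0, 1), repeat=3))
constraints = [
    (lambda c: c[0] == 1 and c[2] == 1, 0.30),  # support of rule "a=1 -> y=1"
    (lambda c: c[1] == 1 and c[2] == 1, 0.25),  # support of rule "b=1 -> y=1"
]
p = ipf_max_entropy(cells, constraints)
```

Once the cell probabilities are fitted, a new case with attributes (a, b) could be classified by comparing `p[(a, b, 1)]` with `p[(a, b, 0)]`.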

A New Method for Classification of Structural Textures

Lee, Bongkyu
- International Journal of Control, Automation, and Systems
- v.2 no.1
- pp.125-133
- 2004
In this paper, we present a new method that combines the characteristics of edge information and second-order neural networks for the classification of structural textures. The edges of a texture are extracted using an edge detection approach. From this edge information, classification features called second-order features are obtained. These features are fed into a second-order neural network for training and subsequent classification. It will be shown that the main disadvantage of using structural methods in texture classification, namely the difficulty of extracting texels, is overcome by the proposed method.

An Application of the Balanced Quadratic Classification Rule on the Discriminant Analysis in Growth Curve Model (성장곡선모형의 판별분석에서 균형이차분류법의 적용)

Shim, Kyu-Bark
- Journal of Korean Society for Quality Management
- v.23 no.2
- pp.53-67
- 1995
The problem considered here is to find the optimal discriminant analysis method in the growth curve model. How to find the correct prior probability for effective classification in discriminant analysis has been studied. We use the balanced condition to calculate the prior probability. Based on an informative simulation study, a new classification rule for the growth curve model is suggested. The suggested classification rule gives better classification results than previously suggested methods in terms of the error rate criterion.

