• Title/Summary/Keyword: tree classification method

SUPPORT Applications for Classification Trees

  • Lee, Sang-Bock;Park, Sun-Young
    • Journal of the Korean Data and Information Science Society / v.15 no.3 / pp.565-574 / 2004
  • Classification tree algorithms, such as CART by Breiman et al. (1984), recursively partition the data space so that the distribution of the class variable within each partition is as pure as possible; they consist of several steps. In the SUPPORT (smoothed and unsmoothed piecewise-polynomial regression trees) method of Chaudhuri et al. (1994), a weighted averaging technique is used to combine piecewise-polynomial fits into a smooth one. We focus on applying SUPPORT to a binary class variable. A logistic model is used in the calculations, and the results show good classification rates compared with other methods such as CART, QUEST, and CHAID.
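
The recursive partitioning idea in this abstract can be illustrated with a minimal sketch of a single split step. It assumes Gini impurity as the purity measure and a threshold split on one numeric feature; the function names `gini` and `best_split` are hypothetical, and this is not the SUPPORT procedure itself.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted_impurity) for the best split x <= threshold.

    If no split improves on the parent node, the threshold is None.
    """
    n = len(values)
    best = (None, gini(labels))  # baseline: impurity of the unsplit node
    for t in sorted(set(values))[:-1]:  # the largest value cannot separate anything
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best

if __name__ == "__main__":
    # Illustrative data only: the two classes separate cleanly at x <= 2.
    print(best_split([1, 2, 3, 4], [0, 0, 1, 1]))
```

A full tree would apply `best_split` recursively to each resulting partition until a stopping rule is met.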

CANCER CLASSIFICATION AND PREDICTION USING MULTIVARIATE ANALYSIS

  • Shon, Ho-Sun;Lee, Heon-Gyu;Ryu, Keun-Ho
    • Proceedings of the KSRS Conference / v.2 / pp.706-709 / 2006
  • Cancer is one of the major causes of death; however, the survival rate can be increased if it is discovered at an early stage and treated in time. According to 2002 World Health Organization statistics, breast cancer was the most prevalent of all cancers occurring in women worldwide, and it accounts for 16.8% of all cancers affecting Korean women today. To classify breast cancer as benign or malignant, this study applied discriminant analysis and the decision tree method of data mining to breast cancer data published on the web. Discriminant analysis is a statistical method that derives discriminant criteria and a discriminant function separating population groups, based on observations obtained from two or more groups, and uses them to assign a given observation to one of those groups. The decision tree analyzes previously collected data records to reveal the patterns among them, namely the combinations of attributes that characterize each class, and builds a tree-shaped classification model. Through this type of analysis, systematic information on the factors that cause breast cancer may be obtained in advance, and the risk of recurrence after surgery may be prevented.

A GA-based Binary Classification Method for Bankruptcy Prediction (도산예측을 위한 유전 알고리듬 기반 이진분류기법의 개발)

  • Min, Jae-H.;Jeong, Chul-Woo
    • Journal of the Korean Operations Research and Management Science Society / v.33 no.2 / pp.1-16 / 2008
  • The purpose of this paper is to propose a new binary classification method, based on a genetic algorithm, for predicting corporate failure, and to validate its predictive power through empirical analysis. The proposed method establishes virtual companies that represent bankrupt and non-bankrupt companies respectively, measures the similarity between these virtual companies and the subject of prediction, and classifies the subject as either bankrupt or non-bankrupt. The values of the classification variables of the virtual companies and the weights of the variables are determined by a genetic algorithm so as to maximize the hit ratio on the training data set. To test the validity of the proposed method, we compare its prediction accuracy with that of existing methods such as multiple discriminant analysis, logistic regression, decision trees, and artificial neural networks. The results show that the proposed binary classification method can serve as a promising alternative to the existing methods for bankruptcy prediction.
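
The classification step described in this abstract can be sketched as follows. This is a minimal sketch under stated assumptions: the abstract does not specify the similarity measure, so a weighted Euclidean distance is assumed here, and the genetic algorithm search over prototype values and weights is omitted. The names `weighted_distance`, `classify`, and `hit_ratio` are hypothetical.

```python
import math

def weighted_distance(x, prototype, weights):
    """Weighted Euclidean distance (an assumed similarity measure)."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, prototype, weights)))

def classify(x, bankrupt_proto, healthy_proto, weights):
    """Assign x to the class of the nearer virtual company."""
    d_bankrupt = weighted_distance(x, bankrupt_proto, weights)
    d_healthy = weighted_distance(x, healthy_proto, weights)
    return "bankrupt" if d_bankrupt < d_healthy else "non-bankrupt"

def hit_ratio(samples, labels, bankrupt_proto, healthy_proto, weights):
    """Fraction of samples classified correctly; the quantity a GA would maximize."""
    hits = sum(
        classify(x, bankrupt_proto, healthy_proto, weights) == y
        for x, y in zip(samples, labels)
    )
    return hits / len(samples)

if __name__ == "__main__":
    # Illustrative values only, not taken from the paper.
    samples = [[0.1, 0.2], [0.9, 0.8]]
    labels = ["bankrupt", "non-bankrupt"]
    print(hit_ratio(samples, labels, [0.0, 0.0], [1.0, 1.0], [1.0, 1.0]))
```

In a genetic algorithm setting, each candidate solution would encode the two prototypes and the weight vector, with `hit_ratio` on the training set as the fitness function.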

The Comparison of OC1 and CART for Prosodic Boundary Index Prediction (운율 경계강도 예측을 위한 OC1의 적용 및 CART와의 비교)

  • 임동식;김진영;김선미
    • The Journal of the Acoustical Society of Korea / v.18 no.4 / pp.60-64 / 1999
  • In this paper, we apply CART (Classification And Regression Tree) and OC1 (Oblique Classifier 1), methods that are widely used for continuous speech recognition and synthesis. We predict the prosodic boundary index by applying CART and OC1, which combine the right depth of the tree-structured method and To_Right of the link grammar method with a tri_gram model. We assigned four prosodic boundary index levels, from 0 to 3. Experimental results show that OC1 is superior to CART: although OC1 has fewer nodes than CART, it produces better predictions.

PD classification by using ANFIS method (ANFIS 분류기법을 이용한 부분방전원의 분류)

  • Park, Seong-Hee;Yoon, Jae-Hun;Kim, Byong-Chul;Lim, Kee-Jo;Kang, Seong-Hwa
    • Proceedings of the Korean Institute of Electrical and Electronic Material Engineers Conference / 2007.11a / pp.467-467 / 2007
  • Solid insulation exposed to voltage is degraded by the electrical tree process, and the degradation is accelerated by voltage application. For this experiment, a specimen of the electrical tree model was made of XLPE (cross-linked polyethylene). The size of the specimen is $7 \times 5 \times 7\ \mathrm{mm}^3$, and the distance between the needle and the plane is 2 mm. The voltages applied in the acceleration test ranged from 12 to 15 kV. The distribution characteristics of the degraded stage were also studied. Through PD detection and data processing, discharge data were acquired from a PD detecting system (Biddle instrument). The system presents phase-resolved statistical distributions. Moreover, the processing time of the electrical tree was recorded to determine the speed of degradation according to voltage. Finally, PD classification was performed using the ANFIS method.

Robust Feature Selection and Shot Change Detection Method Using the Neural Networks (강인한 특징 변수 선별과 신경망을 이용한 장면 전환점 검출 기법)

  • Hong, Seung-Bum;Hong, Gyo-Young
    • Journal of Korea Multimedia Society / v.7 no.7 / pp.877-885 / 2004
  • In this paper, we propose an enhanced shot change detection method that uses a neural network and robust feature selection from multiple features. Previous shot change detection methods usually used a single feature and a fixed threshold between consecutive frames. However, contents such as color, shape, background, and texture change simultaneously at shot change points in a video sequence. Therefore, we detect shot changes effectively using robust features that complement each other, rather than a single feature. We use CART (classification and regression tree), a typical data mining method, to select the robust features, and a backpropagation neural network to determine the threshold of each selected feature. To evaluate the performance of the robust feature selection, we compare the proposed method with PCA (principal component analysis), a typical feature selection method. The experimental results showed that our method performed better than the PCA method.

A Development of Suicidal Ideation Prediction Model and Decision Rules for the Elderly: Decision Tree Approach (의사결정나무 기법을 이용한 노인들의 자살생각 예측모형 및 의사결정 규칙 개발)

  • Kim, Deok Hyun;Yoo, Dong Hee;Jeong, Dae Yul
    • The Journal of Information Systems / v.28 no.3 / pp.249-276 / 2019
  • Purpose: The purpose of this study is to develop a prediction model and decision rules for suicidal ideation among the elderly, based on the Korean Welfare Panel survey data. Using this data, we obtained many decision rules for predicting suicidal ideation among the elderly. Design/methodology/approach: This study used classification analysis based on the decision tree technique to derive the decision rules. Weka 3.8 was used as the data mining tool, with J48, also known as C4.5, as the decision tree algorithm. A 66.6% split was used to divide the total data into learning data and verification data. We considered all possible variables from previous studies on predicting suicidal ideation among the elderly; in the end, 99 variables including the target variable were used. Classification analysis was performed using backward elimination and a sampling technique for data balancing. Findings: There were significant differences between the data sets, and the selected data sets yielded different decision trees and several rules. Based on the decision tree method, we derived rules for suicide prevention. The decision tree derives rules for suicidal ideation not only in the depressed group but also in the non-depressed group. In addition, while developing the predictive model, the over-fitting problem caused by data imbalance was identified directly through the application of data balancing. We conclude that the data must be balanced on the target variable in order to perform a correct classification analysis without over-fitting. Furthermore, even when data balancing is applied, the prediction rate is shown to be not inferior to that of a biased prediction model.
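
The data balancing step mentioned in this abstract can be sketched as below. The abstract does not say which sampling technique was used, so random undersampling of the majority class is assumed purely for illustration; the function name `undersample` is hypothetical, and the study itself used Weka rather than Python.

```python
import random
from collections import defaultdict

def undersample(rows, labels, seed=0):
    """Randomly undersample every class to the size of the smallest class.

    Returns a shuffled list of (row, label) pairs with equal class counts.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_class = defaultdict(list)
    for row, y in zip(rows, labels):
        by_class[y].append(row)
    n_min = min(len(group) for group in by_class.values())
    balanced = []
    for y, group in by_class.items():
        for row in rng.sample(group, n_min):  # sample without replacement
            balanced.append((row, y))
    rng.shuffle(balanced)
    return balanced

if __name__ == "__main__":
    # Illustrative data only: 2 positive cases against 8 negative cases.
    rows = list(range(10))
    labels = [1, 1] + [0] * 8
    print(undersample(rows, labels))
```

Training on the balanced pairs keeps the tree from simply predicting the majority class, which is the over-fitting risk the abstract attributes to data imbalance.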

Weather Classification and Fog Detection using Hierarchical Image Tree Model and k-mean Segmentation in Single Outdoor Image (싱글 야외 영상에서 계층적 이미지 트리 모델과 k-평균 세분화를 이용한 날씨 분류와 안개 검출)

  • Park, Ki-Hong
    • Journal of Digital Contents Society / v.18 no.8 / pp.1635-1640 / 2017
  • In this paper, a hierarchical image tree model for weather classification in a single outdoor image is defined, and a weather classification algorithm using the image intensity and the k-means segmentation image is proposed. At the first level of the hierarchical image tree model, indoor and outdoor images are distinguished. At the second level, the intensity and the k-means segmentation image are used to judge whether the outdoor image is a daytime, night, or sunrise/sunset image. At the last level, an image classified as daytime at the second level is finally estimated to be sunny or foggy based on the edge map and the fog rate. Experiments were conducted to verify the weather classification, and the results show that the proposed method effectively detects weather features in a given image.

Design and Performance Measurement of a Genetic Algorithm-based Group Classification Method : The Case of Bond Rating (유전 알고리듬 기반 집단분류기법의 개발과 성과평가 : 채권등급 평가를 중심으로)

  • Min, Jae-H.;Jeong, Chul-Woo
    • Journal of the Korean Operations Research and Management Science Society / v.32 no.1 / pp.61-75 / 2007
  • The purpose of this paper is to develop a new group classification method based on a genetic algorithm and to compare its prediction performance with that of existing methods in the area of bond rating. To this end, we conduct various experiments with pilot and general models. Specifically, we first conduct experiments with two pilot models: one searches for the cluster center of each group, and the other searches for both the cluster centers and the attribute weights in order to maximize classification accuracy. The pilot experiments show that the latter achieves a higher classification accuracy ratio than the former, which provides the rationale for searching for both the cluster center of each group and the attribute weights to improve classification accuracy. With this lesson in mind, we design two generalized models employing a genetic algorithm: one maximizes the classification accuracy, and the other minimizes the total misclassification cost. We compare the performance of these two models with that of existing statistical and artificial intelligence models such as MDA, ANN, and decision trees, and conclude that the proposed genetic algorithm-based group classification method significantly outperforms the other methods in terms of both classification accuracy ratio and misclassification cost.
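
The two objectives named in this abstract, classification accuracy and total misclassification cost, can be computed from a confusion matrix as in the minimal sketch below. The cost matrix values and the function names `accuracy` and `total_cost` are hypothetical; the abstract does not give the costs used in the paper.

```python
def accuracy(confusion):
    """Share of correctly classified cases; confusion[i][j] counts true class i predicted as j."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total

def total_cost(confusion, cost):
    """Sum of counts weighted by cost[i][j], the cost of predicting j when the truth is i."""
    k = len(confusion)
    return sum(confusion[i][j] * cost[i][j] for i in range(k) for j in range(k))

if __name__ == "__main__":
    # Illustrative two-class example with asymmetric error costs.
    confusion = [[8, 2],
                 [1, 9]]
    cost = [[0, 1],
            [5, 0]]
    print(accuracy(confusion), total_cost(confusion, cost))
```

The two objectives can disagree: a model with slightly lower accuracy may still be preferable when it avoids the more expensive kind of error.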

A Comparison of Pixel- and Segment-based Classification for Tree Species Classification using QuickBird Imagery (QuickBird 위성영상을 이용한 수종분류에서 픽셀과 분할기반 분류방법의 정확도 비교)

  • Chung, Sang Young;Yim, Jong Su;Shin, Man Yong
    • Journal of Korean Society of Forest Science / v.100 no.4 / pp.540-547 / 2011
  • This study was conducted to compare the accuracy of tree species classification from QuickBird imagery between pixel-based and segment-based classification, two approaches that have mostly been applied to land cover classification. A total of 398 points were used as training and reference data. These points were classified into fourteen land cover classes: four coniferous and seven deciduous tree species in the forest classes, and three non-forested classes. For the pixel-based classification, three images were produced using raw spectral values, three tasseled indices, and three components from principal component analysis. The maximum likelihood method was applied in both classification processes. In the pixel-based classification, the accuracy obtained with raw spectral values was better than that obtained with the other band combinations. The segment-based classification with a scale factor of 50% provided the most accurate classification (overall accuracy: 76%, $\hat{k}$ value: 0.74) compared with the other scale factors and with the pixel-based classification.
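
The two accuracy figures reported in this abstract can be computed from a confusion matrix as sketched below. This assumes that the $\hat{k}$ value is the estimated Cohen's kappa coefficient, which is the usual accuracy statistic in remote sensing but is not stated explicitly in the abstract. The function names and the example matrix are hypothetical.

```python
def overall_accuracy(cm):
    """Proportion of reference points on the diagonal of the confusion matrix."""
    n = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / n

def kappa(cm):
    """Cohen's kappa: agreement beyond what the row and column totals give by chance."""
    k = len(cm)
    n = sum(sum(row) for row in cm)
    p_observed = sum(cm[i][i] for i in range(k)) / n
    row_totals = [sum(cm[i]) for i in range(k)]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    p_chance = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (p_observed - p_chance) / (1.0 - p_chance)

if __name__ == "__main__":
    # Illustrative two-class matrix, not data from the study.
    cm = [[20, 5],
          [10, 15]]
    print(overall_accuracy(cm), kappa(cm))
```

Kappa is lower than overall accuracy whenever part of the observed agreement could be explained by the class proportions alone.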