• Title/Summary/Keyword: classification trees


Fuzzy Decision Tree Induction to Obliquely Partitioning a Feature Space (특징공간을 사선 분할하는 퍼지 결정트리 유도)

  • Lee, Woo-Hang;Lee, Keon-Myung
    • Journal of KIISE: Software and Applications / v.29 no.3 / pp.156-166 / 2002
  • Decision tree induction is a useful machine learning approach for extracting classification rules from a set of feature-based examples. According to how they partition the feature space, decision trees are categorized into univariate and multivariate decision trees. Because of observation error, uncertainty, subjective judgment, and so on, real-world data are prone to errors in their feature values. To make decision trees robust against such errors, there have been various attempts to incorporate fuzzy techniques into decision tree construction. Several studies have incorporated fuzzy techniques into univariate decision trees. For multivariate decision trees, however, little research has been done along these lines. This paper proposes a fuzzy decision tree induction method that builds fuzzy multivariate decision trees, named fuzzy oblique decision trees. To show the effectiveness of the proposed method, it also presents some experimental results.
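
The distinction between univariate, multivariate (oblique), and fuzzy node tests can be illustrated with a minimal Python sketch. The sigmoid membership function, the `softness` parameter, and all function names here are illustrative assumptions; the abstract does not say which membership functions or induction procedure the paper uses.

```python
import math

def univariate_split(x, feature, threshold):
    """Axis-parallel test: inspects a single feature."""
    return x[feature] <= threshold

def oblique_split(x, weights, threshold):
    """Oblique test: compares a linear combination of features with a threshold."""
    return sum(w * v for w, v in zip(weights, x)) <= threshold

def fuzzy_left_membership(x, weights, threshold, softness=1.0):
    """Degree in (0, 1) to which x belongs to the left branch of an oblique test.

    Assumption: a sigmoid of the signed offset from the hyperplane stands in
    for the (unspecified) membership function of the paper.
    """
    offset = sum(w * v for w, v in zip(weights, x)) - threshold
    return 1.0 / (1.0 + math.exp(offset / softness))
```

Under this assumption, a point lying exactly on the splitting hyperplane belongs to both branches with degree 0.5, so a small measurement error near the boundary shifts the memberships slightly instead of flipping the branch.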

Split Effect in Ensemble

  • Chung, Dong-Jun;Kim, Hyun-Joong
    • Proceedings of the Korean Statistical Society Conference / 2005.11a / pp.193-197 / 2005
  • The classification tree is one of the most suitable base learners for ensembles. Over the past decade, it has been found that bagging gives the most accurate predictions when used with unpruned trees, and boosting when used with stumps. Researchers have tried to understand the relationship between the size of the trees and the accuracy of the ensemble. Experiments show that large trees make boosting overfit the dataset and that stumps help avoid this. This means that the accuracy of each classifier needs to be sacrificed for better weighting at each iteration. Hence, the split effect in boosting can be explained by the trade-off between the accuracy of each classifier and better weighting of the misclassified points. In bagging, combining larger trees gives more accurate predictions because bagging has no such trade-off, so it is advisable to make each classifier as accurate as possible.
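
The weighting step that this trade-off refers to can be sketched as follows. The abstract does not name a specific boosting algorithm, so this assumes the classic discrete AdaBoost update with labels in {-1, +1}; the function name and the toy labels are illustrative.

```python
import math

def adaboost_reweight(weights, y_true, y_pred):
    """One round of discrete AdaBoost reweighting for labels in {-1, +1}.

    `weights` must sum to 1. Returns (alpha, new_weights).
    """
    # Weighted error of the current base classifier.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    # Clamp so that an error of exactly 0 or 1 does not break the logarithm.
    err = min(max(err, 1e-12), 1.0 - 1e-12)
    alpha = 0.5 * math.log((1.0 - err) / err)
    # t * p is +1 for a correct prediction and -1 for a mistake.
    raw = [w * math.exp(-alpha * t * p)
           for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(raw)
    return alpha, [w / total for w in raw]

weights = [0.25, 0.25, 0.25, 0.25]
y_true = [1, 1, -1, -1]
y_pred = [1, 1, -1, 1]  # only the last point is misclassified
alpha, new_weights = adaboost_reweight(weights, y_true, y_pred)
```

In this toy round the single misclassified point ends up carrying half of the total weight. A very accurate base classifier (a large tree) leaves few misclassified points to reweight, which is one way to read the trade-off described in the abstract.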


Evaluations of predicted models fitted for data mining - comparisons of classification accuracy and training time for 4 algorithms (데이터마이닝기법상에서 적합된 예측모형의 평가 -4개분류예측모형의 오분류율 및 훈련시간 비교평가 중심으로)

  • Lee, Sang-Bock
    • Journal of the Korean Data and Information Science Society / v.12 no.2 / pp.113-124 / 2001
  • CHAID, logistic regression, bagging trees, and bagging trees are compared on the SAS artificial data set HMEQ in terms of classification accuracy and training time. Bagging trees rank best on error rate, although they run slower than the other models. Logistic regression has the shortest run time among the given models, but no single model is best on both criteria.


Detection of the Damaged Trees by Pine Wilt Disease Using IKONOS Image

  • Lee, S.H.;Cho, H.K.;Kim, J.B.;Jo, M.H.
    • Proceedings of the KSRS Conference / 2003.11a / pp.709-711 / 2003
  • The purpose of this study is to detect red pine trees damaged by pine wilt disease using high-resolution IKONOS Geo satellite imagery. The IKONOS images are segmented with the eCognition image processing software. A segment-based maximum likelihood classification was performed to delineate the pine stands, which are regarded as potential damage areas. To develop a methodology for locating damaged trees in high-resolution satellite imagery, black-and-white aerial photographs were used as simulated images. The developed method is based on a filtering technique: a local maximum filter was used to detect the location of individual trees. This report presents part of the first-year results of an ongoing project.
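
A local maximum filter of the kind mentioned above can be sketched in plain Python. The window radius, the brightness threshold, the strict-maximum rule, and the toy image are all assumptions made for illustration; the abstract does not give the filter settings used in the study.

```python
def local_maxima(image, radius=1, min_value=0):
    """Return (row, col) positions that are the strict maximum of their window.

    `image` is a list of equal-length rows of brightness values. A pixel is
    reported only if no other pixel in its (2*radius+1)-wide window is as bright.
    """
    rows, cols = len(image), len(image[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = image[r][c]
            if value < min_value:
                continue
            # Collect the window, clipped at the image border.
            window = [image[i][j]
                      for i in range(max(0, r - radius), min(rows, r + radius + 1))
                      for j in range(max(0, c - radius), min(cols, c + radius + 1))]
            if value == max(window) and window.count(value) == 1:
                peaks.append((r, c))
    return peaks

# Toy brightness grid with two bright crowns (values are made up).
image = [
    [1, 2, 1, 0, 0],
    [2, 9, 2, 0, 1],
    [1, 2, 1, 3, 7],
    [0, 0, 1, 2, 3],
]
peaks = local_maxima(image, radius=1, min_value=5)
```

Here `peaks` contains the two bright cells at (1, 1) and (2, 4). In practice the window size would have to be matched to the expected crown diameter in pixels.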


The Development of Models and the Characteristics for Subway Noise Using the Classification and Regression Trees (CART 분석을 이용한 지하철 소음모형 개발 및 특성 연구)

  • Kim, Tae-Ho;Lee, Jae-Myung;Won, Jai-Mu;Song, In-Suk
    • Journal of the Korean Society for Railway / v.10 no.5 / pp.480-486 / 2007
  • The subway is an essential form of public transportation in big cities and is used by many citizens. However, citizens' demands regarding the interior environment of the subway have been growing recently, and noise is among the most pressing issues to be solved. In this study, we classified the characteristics of subway noise using classification and regression trees (CART), based on noise level data from Line No. 5 in Seoul. We then developed models for subway noise and used them to analyze its characteristics. The results show that the type of geometric design and the operational factors need to be considered when addressing subway noise, because the factors that influence subway noise differ by geometry type and by operational conditions.
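
For a numeric response such as a noise level, the core step of a regression tree is choosing the threshold that minimizes the summed squared error of the two child nodes. The sketch below shows that step only; the predictor (a curve radius in meters), the noise values, and the function names are made-up illustrations, not data or variables from the study.

```python
def sum_squared_error(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_regression_split(x, y):
    """Return (threshold, cost) of the best single split of y on predictor x."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between equal predictor values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [target for _, target in pairs[:i]]
        right = [target for _, target in pairs[i:]]
        cost = sum_squared_error(left) + sum_squared_error(right)
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best

# Hypothetical data: curve radius (m) and measured noise level (dB).
radius_m = [200, 300, 400, 800, 1000, 1200]
noise_db = [82, 81, 83, 74, 75, 73]
threshold, cost = best_regression_split(radius_m, noise_db)
```

On this toy data the best threshold falls between the tight and the gentle curves. A full CART model repeats the search over every predictor and recursively within each child node.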

Natural Spread Pattern of Damaged Area by Pine Wilt Disease Using Geostatistical Analysis (공간통계학적 방법에 의한 소나무 재선충 피해의 자연적 확산유형분석)

  • Son, Min-Ho;Lee, Woo-Kyun;Lee, Seung-Ho;Cho, Hyun-Kook;Lee, Jun-Hak
    • Journal of Korean Society of Forest Science / v.95 no.3 / pp.240-249 / 2006
  • Recently, the spread of forest damage caused by pine wilt disease has been regarded as a serious social issue. Within damaged areas, the damage has spread through the natural range expansion of the vectors, while the nationwide spread has been induced by people carrying infected trees out of damaged areas. In this study, damaged trees were detected and located on a digital map using aerial photographs and terrestrial surveys. The spatial distribution pattern of damaged trees, and the relationship between that distribution and some geomorphological factors, were analyzed geostatistically. Finally, we produced a map of the natural spread pattern of pine wilt disease using a geostatistical CART (Classification and Regression Trees) model. This study verified that geostatistical analysis and the CART model are useful tools for understanding the spatial distribution and natural spread pattern of pine wilt disease.

Study on Forest Vegetation Classification with Remote Sensing

  • Yuan, Jinguo;Long, Limin
    • Proceedings of the KSRS Conference / 2002.10a / pp.250-255 / 2002
  • This paper describes methods for identifying forest vegetation types and, based on them, proposes a forest vegetation classification method that uses vegetation indices. Using reflectance data of the vegetation canopy and the soil line equation NIR=1.506R+0.0076 for Jingyuetan, Changchun, China, many vegetation indices are calculated and analyzed. The relationships between vegetation indices and vegetation types show that PVI distinguishes broadleaf forest from conifer forest most easily, followed by TSAVI and MSAVI, although their calculation is complex. RVI values vary markedly among different conifer trees, so RVI can be used to classify conifer trees. In short, a combination of PVI and RVI is judged suitable for classifying different vegetation types.
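
The two indices the abstract settles on can be computed directly from red (R) and near-infrared (NIR) reflectance. The sketch below uses the soil line coefficients quoted in the abstract and assumes the standard textbook definitions of PVI (perpendicular distance from the soil line) and RVI (NIR divided by R); the paper may use variants, and the function names are illustrative.

```python
import math

# Soil line NIR = a * R + b, with the coefficients quoted in the abstract.
SOIL_SLOPE = 1.506
SOIL_INTERCEPT = 0.0076

def pvi(nir, red, a=SOIL_SLOPE, b=SOIL_INTERCEPT):
    """Perpendicular vegetation index: signed distance of (red, nir) from the soil line."""
    return (nir - a * red - b) / math.sqrt(1.0 + a * a)

def rvi(nir, red):
    """Ratio vegetation index: NIR reflectance divided by red reflectance."""
    return nir / red
```

A bare-soil pixel lying on the soil line has a PVI of zero, while vegetated pixels, which reflect more NIR and less red, have a positive PVI.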


Feature Selection for Multi-Class Support Vector Machines Using an Impurity Measure of Classification Trees: An Application to the Credit Rating of S&P 500 Companies

  • Hong, Tae-Ho;Park, Ji-Young
    • Asia Pacific Journal of Information Systems / v.21 no.2 / pp.43-58 / 2011
  • Support vector machines (SVMs), a machine learning technique, have been applied not only to binary classification problems such as bankruptcy prediction but also to multi-class problems such as corporate credit ratings. In general, however, depending on the selection of predictors, SVMs can easily perform worse than the best alternative model, even though they are distinguished by successful classification and prediction in many dichotomous and multi-class problems. To overcome this weakness, this study proposes an approach to selecting features for multi-class SVMs that uses the impurity measures of classification trees. To select the input features, we employed the C4.5 and CART algorithms, as well as the stepwise method of discriminant analysis, a well-known feature selection method. We built a multi-class SVM model for credit rating using the above method and present experimental results with data on S&P 500 companies.
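
One way to read "feature selection by impurity measure" is to rank each candidate feature by the largest impurity decrease that a single tree split on it can achieve. The sketch below assumes the Gini index (the impurity measure associated with CART) and binary threshold splits; C4.5 uses an entropy-based gain ratio instead, and the exact ranking procedure of the paper is not given in the abstract. All names and the toy data are illustrative.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_gini_decrease(values, labels):
    """Largest impurity decrease from one threshold split on a numeric feature."""
    n = len(labels)
    parent = gini(labels)
    best = 0.0
    for threshold in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        children = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - children)
    return best

def rank_features(rows, labels):
    """Feature indices sorted from most to least useful by impurity decrease."""
    scores = [best_gini_decrease([row[j] for row in rows], labels)
              for j in range(len(rows[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)

# Toy data: feature 0 tracks the rating class, feature 1 is mostly noise.
rows = [(1, 5), (2, 1), (3, 4), (4, 2), (5, 6), (6, 3)]
labels = ["A", "A", "B", "B", "C", "C"]
ranking = rank_features(rows, labels)
```

The top-ranked features would then be passed to the multi-class SVM as inputs.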

Predicting the Performance of Forecasting Strategies for Naval Spare Parts Demand: A Machine Learning Approach

  • Moon, Seongmin
    • Management Science and Financial Engineering / v.19 no.1 / pp.1-10 / 2013
  • A hierarchical forecasting strategy does not always outperform a direct forecasting strategy. Relative performance generally depends on demand features. This research offers guidance on choosing between the alternative forecasting strategies according to demand features. The paper develops and evaluates various classification models, including logistic regression (LR), artificial neural networks (ANN), decision trees (DT), boosted trees (BT), and random forests (RF), for predicting the relative performance of the alternative forecasting strategies for the South Korean navy's spare parts demand, which has non-normal characteristics. ANN minimized classification errors and inventory costs, whereas LR minimized the Brier scores and the sum of forecasting errors.
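
That one model can win on classification error while another wins on the Brier score follows from what the two measures reward. The sketch below assumes binary outcomes, the standard mean-squared-error definition of the Brier score, and a 0.5 cutoff for the error rate; the probabilities are made up for illustration and are not results from the paper.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)

def error_rate(probabilities, outcomes, cutoff=0.5):
    """Share of cases where the thresholded prediction disagrees with the outcome."""
    wrong = sum((p >= cutoff) != bool(o) for p, o in zip(probabilities, outcomes))
    return wrong / len(outcomes)

outcomes = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.4, 0.1]  # sharp probabilities, one case misclassified
hedged = [0.6, 0.4, 0.6, 0.4]     # every case on the right side, but barely
```

With these toy numbers the hedged model has the lower error rate, while the confident model has the lower Brier score, because the Brier score rewards well-calibrated, sharp probabilities and not just being on the right side of the cutoff.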

A Study on the UAV-based Vegetable Index Comparison for Detection of Pine Wilt Disease Trees (소나무재선충병 피해목 탐지를 위한 UAV기반의 식생지수 비교 연구)

  • Jung, Yoon-Young;Kim, Sang-Wook
    • Journal of Cadastre & Land InformatiX / v.50 no.1 / pp.201-214 / 2020
  • This study aimed at the early detection of trees damaged by pine wilt disease using vegetation indices from UAV images. Location data for 193 trees with pine wilt disease were collected through field surveys, and vegetation index analyses of NDVI, GNDVI, NDRE, and SAVI were performed on multi-spectral UAV images acquired at the same time. The K-Means algorithm was used to classify damaged trees, and a confusion matrix was used to compare and analyze classification accuracy. The results are summarized as follows. First, overall classification accuracy ranked in the order NDVI (88.04%, Kappa coefficient 0.76) > GNDVI (86.01%, Kappa coefficient 0.72) > NDRE (77.35%, Kappa coefficient 0.55) > SAVI (76.84%, Kappa coefficient 0.54), with NDVI the most accurate. Second, K-Means unsupervised classification using NDVI or GNDVI can identify damaged trees to some extent. In particular, this technique helps early detection of damaged trees thanks to its intensive operation, low user intervention, and relatively simple analysis process. In the future, the use of time-series images or the application of deep learning techniques is expected to increase classification accuracy.
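
The indices and the two accuracy figures reported above can be computed as sketched below. The index formulas are the standard definitions (NDVI, GNDVI, and NDRE are normalized differences of NIR against the red, green, and red-edge bands respectively); the SAVI soil factor of 0.5 is the common default and an assumption here, since the abstract does not state it. The confusion matrix is made up and does not reproduce the study's results.

```python
def normalized_difference(nir, band):
    """NDVI with the red band, GNDVI with green, NDRE with red edge."""
    return (nir - band) / (nir + band)

def savi(nir, red, soil_factor=0.5):
    """Soil-adjusted vegetation index; soil_factor=0.5 is an assumed default."""
    return (nir - red) * (1.0 + soil_factor) / (nir + red + soil_factor)

def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] counts samples of true class i assigned to class j.
    """
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance from the row and column totals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Hypothetical 2x2 matrix: rows are true classes (damaged, healthy).
confusion = [[45, 5], [10, 40]]
accuracy, kappa = accuracy_and_kappa(confusion)
```

Kappa is lower than overall accuracy because it discounts the agreement that would be expected by chance, which is why the abstract reports both figures for each index.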