• Title/Summary/Keyword: classification trees

CLASSIFICATION OF TREES EACH OF WHOSE ASSOCIATED ACYCLIC MATRICES WITH DISTINCT DIAGONAL ENTRIES HAS DISTINCT EIGENVALUES

  • Kim, In-Jae;Shader, Bryan L.
    • Bulletin of the Korean Mathematical Society
    • /
    • v.45 no.1
    • /
    • pp.95-99
    • /
    • 2008
  • It is known that each eigenvalue of a real symmetric, irreducible, tridiagonal matrix has multiplicity 1. The graph of such a matrix is a path. In this paper, we extend the result by classifying those trees for which each of the associated acyclic matrices has distinct eigenvalues whenever the diagonal entries are distinct.
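    For intuition, the smallest case of the known result for paths can be checked by hand. This is a minimal worked example, assuming a 2×2 real symmetric matrix with a nonzero off-diagonal entry; the symbols a, b, and c are introduced here for illustration and are not the paper's notation.

```latex
A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}, \quad b \neq 0,
\qquad
\lambda_{\pm} = \frac{a + c}{2} \pm \frac{1}{2}\sqrt{(a - c)^{2} + 4b^{2}},
\qquad
(a - c)^{2} + 4b^{2} \geq 4b^{2} > 0 \;\Longrightarrow\; \lambda_{+} \neq \lambda_{-}.
```

    The square root is strictly positive whenever b is nonzero, so the two eigenvalues cannot coincide, whatever the diagonal entries are.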

Modeling of Environmental Survey by Decision Trees

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Proceedings of the Korean Data and Information Science Society Conference
    • /
    • 2004.10a
    • /
    • pp.63-75
    • /
    • 2004
  • The decision tree approach is most useful in classification problems, where it divides the search space into rectangular regions. Decision tree algorithms are used extensively for data mining in many domains, such as retail target marketing, fraud detection, data reduction and variable screening, and category merging. We analyze Gyeongnam social indicator survey data using decision tree techniques for environmental information. These decision tree outputs can be used for environmental preservation and improvement.
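    The phrase "rectangular regions" can be made concrete with a small sketch. The hand-written tree below is illustrative only: the two features `x` and `y`, the thresholds, and the region names are hypothetical and are not taken from the survey data. Each leaf corresponds to one axis-aligned rectangle of the feature space.

```python
def leaf_region(x, y):
    """Hand-written depth-2 tree; thresholds 5.0, 2.0 and 7.0 are hypothetical."""
    if x <= 5.0:
        # Left half-plane, split again on y.
        return "R1" if y <= 2.0 else "R2"
    # Right half-plane, split on y at a different threshold.
    return "R3" if y <= 7.0 else "R4"


# Every point falls into exactly one of the four rectangles.
regions = [leaf_region(1, 1), leaf_region(1, 3), leaf_region(6, 7), leaf_region(6, 8)]
```

    Because every test compares a single feature with a constant, the decision boundaries are always parallel to the axes.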

Modeling of Environmental Survey by Decision Trees

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Journal of the Korean Data and Information Science Society
    • /
    • v.15 no.4
    • /
    • pp.759-771
    • /
    • 2004
  • The decision tree approach is most useful in classification problems, where it divides the search space into rectangular regions. Decision tree algorithms are used extensively for data mining in many domains, such as retail target marketing, fraud detection, data reduction and variable screening, and category merging. We analyze Gyeongnam social indicator survey data using decision tree techniques for environmental information. These decision tree outputs can be used for environmental preservation and improvement.

Classification Accuracy Improvement for Decision Tree

  • Rezene, Mehari Marta;Park, Sanghyun
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2017.04a
    • /
    • pp.787-790
    • /
    • 2017
  • Data quality is a central issue in classification problems: noisy instances in the training dataset generally prevent robust classification performance. Such instances may cause the generated decision tree to over-fit, which can decrease its accuracy. Decision trees are useful, efficient, and commonly used for solving various real-world classification problems in data mining. In this paper, we introduce a preprocessing technique to improve the classification accuracy of the C4.5 decision tree algorithm. In the proposed preprocessing method, we apply the naive Bayes classifier to remove noisy instances from the training dataset. We applied the proposed method to a real e-commerce sales dataset to compare its performance with that of the existing C4.5 decision tree classifier. In the experiments, the proposed method improved classification accuracy by 8.5% and 14.32% when evaluated on the training dataset and with 10-fold cross-validation, respectively.
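    The filtering idea described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes categorical features, add-one smoothing, and the rule "drop a training instance when naive Bayes disagrees with its label". The toy data are hypothetical, and the subsequent C4.5 training step is not shown.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Collect the counts needed for a categorical naive Bayes model."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (feature index, label) -> value counts
    feature_values = defaultdict(set)     # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values


def predict(model, row):
    """Return the label with the highest smoothed log-posterior."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        for i, value in enumerate(row):
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(feature_values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def filter_noisy(rows, labels):
    """Assumed rule: keep only instances whose label naive Bayes reproduces."""
    model = train_naive_bayes(rows, labels)
    kept = [(r, y) for r, y in zip(rows, labels) if predict(model, r) == y]
    return [r for r, _ in kept], [y for _, y in kept]


# Hypothetical toy data; the last instance is deliberately mislabelled.
rows = [("sunny", "high")] * 3 + [("rainy", "low")] * 3 + [("sunny", "high")]
labels = ["no", "no", "no", "yes", "yes", "yes", "yes"]
clean_rows, clean_labels = filter_noisy(rows, labels)
```

    In this toy run the mislabelled instance is removed and the six consistent instances are kept; a decision tree learner would then be trained on `clean_rows` and `clean_labels`.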

Study on Development of Classification Model and Implementation for Diagnosis System of Sasang Constitution

  • Beum, Soo-Gyun;Jeon, Mi-Ran;Oh, Am-Suk
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2008.08a
    • /
    • pp.155-159
    • /
    • 2008
  • In this thesis, to develop a new classification model of Sasang constitutional medical types that helps improve diagnostic accuracy, various data-mining classification methods, including discriminant analysis, decision tree analysis, neural network analysis, logistic regression analysis, and clustering analysis, were applied to questionnaires for medical type classification. In this manner, a model that scientifically classifies constitutional medical types in Sasang Constitutional Medicine, a branch of traditional Korean medicine, was developed. The analysis models were also systematically compared. In this study, a classification of Sasang constitutional medical types was developed based on the discriminant analysis and decision tree analysis models, which are relatively accurate, easy to understand and explain, and easy to implement. A diagnosis system for Sasang constitution was also implemented by applying the two analysis models.

Estimation Carbon Storage of Urban Street trees Using UAV Imagery and SfM Technique

  • Kim, Da-Seul;Lee, Dong-Kun;Heo, Han-Kyul
    • Journal of the Korean Society of Environmental Restoration Technology
    • /
    • v.22 no.6
    • /
    • pp.1-14
    • /
    • 2019
  • Carbon storage is one of the regulating ecosystem services provided by urban street trees, and it is important to evaluate the economic value of ecosystem services accurately. The carbon storage of street trees has been calculated by measuring morphological parameters in the field. Because this method is labor-intensive and time-consuming for macro-scale research, remote sensing has become more widely used. Airborne Light Detection And Ranging (LiDAR) is used to obtain point cloud data of densely planted areas and to extract individual trees for carbon storage estimation. However, LiDAR has limitations such as high cost and complicated operation. In addition, because trees change over time, they need to be measured frequently. Therefore, Structure from Motion (SfM) photogrammetry with an unmanned aerial vehicle (UAV) is a more suitable method for obtaining point cloud data. In this paper, a UAV carrying a digital camera was used to take oblique aerial images for generating point clouds of street trees. We extracted the diameter at breast height (DBH) from the generated point cloud data to calculate the carbon storage. We compared the DBH calculated from UAV data with field measurements in the selected area. The calculated DBH was used to estimate the carbon storage of street trees in the study area using a regression model. The results demonstrate the feasibility and effectiveness of applying UAV imagery and the SfM technique to the carbon storage estimation of street trees. The technique can contribute to efficiently building inventories of the carbon storage of street trees in urban areas.

Ensemble Learning for Underwater Target Classification

  • Seok, Jongwon
    • Journal of Korea Multimedia Society
    • /
    • v.18 no.11
    • /
    • pp.1261-1267
    • /
    • 2015
  • The problem of underwater target detection and classification has attracted a substantial amount of attention and has been studied by many researchers for both military and non-military purposes. The problem is complicated by various environmental conditions. In this paper, we study classifier ensemble methods for active sonar target classification to improve classification performance. In general, classifier ensemble methods are useful for classifiers whose variances are relatively large, such as decision trees and neural networks. Bagging, random selection of samples, random subspace, and rotation forest are selected as the classifier ensemble methods. Using the four ensemble methods, each based on 31 neural network classifiers, classification tests were carried out and the performances were compared.
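    Of the ensemble methods named in this abstract, bagging is the simplest to sketch. The example below is a hedged illustration, not the paper's setup: a 1-nearest-neighbour rule stands in for the neural network classifiers, the one-dimensional toy data and the seed are hypothetical, and only the ensemble size of 31 follows the abstract.

```python
import random
from collections import Counter


def nearest_neighbour_model(sample):
    """Stand-in base classifier: 1-nearest-neighbour over (x, label) pairs."""
    def model(x):
        return min(sample, key=lambda pair: abs(pair[0] - x))[1]
    return model


def bagging(train, n_models=31, seed=0):
    """Train each ensemble member on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Sampling with replacement, same size as the original training set.
        bootstrap = [rng.choice(train) for _ in train]
        models.append(nearest_neighbour_model(bootstrap))
    return models


def vote(models, x):
    """Combine the members' predictions by majority vote."""
    return Counter(m(x) for m in models).most_common(1)[0][0]


# Hypothetical, well-separated toy data: class 0 near 0..4, class 1 near 10..14.
train = [(float(i), 0) for i in range(5)] + [(float(i), 1) for i in range(10, 15)]
ensemble = bagging(train)
low_prediction = vote(ensemble, 1.5)
high_prediction = vote(ensemble, 12.5)
```

    An odd ensemble size such as 31 avoids ties in a two-class majority vote, and averaging over resamples is what reduces the variance of an unstable base classifier.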

Tree size determination for classification ensemble

  • Choi, Sung Hoon;Kim, Hyunjoong
    • Journal of the Korean Data and Information Science Society
    • /
    • v.27 no.1
    • /
    • pp.255-264
    • /
    • 2016
  • Classification is predictive modeling for a categorical target variable. Various classification ensemble methods, which predict with better accuracy by combining multiple classifiers, have become a powerful machine learning and data mining paradigm. Well-known classification ensemble methodologies are boosting, bagging, and random forest. In this article, we assume that decision trees are used as the classifiers in the ensemble. Further, we hypothesize that tree size affects classification accuracy. To study how tree size influences accuracy, we performed experiments using twenty-eight data sets. We then compare the performance of the ensemble algorithms (bagging, double-bagging, boosting, and random forest) with different tree sizes.

Effectiveness of Repeated Examination to Diagnose Enterobiasis in Nursery School Groups

  • Remm, Mare;Remm, Kalle
    • Parasites, Hosts and Diseases
    • /
    • v.47 no.3
    • /
    • pp.235-241
    • /
    • 2009
  • The aim of this study was to estimate the benefit from repeated examinations in the diagnosis of enterobiasis in nursery school groups, and to test the effectiveness of individual-based risk predictions using different methods. A total of 604 children were examined using double, and 96 using triple, anal swab examinations. The questionnaires for parents, structured observations, and interviews with supervisors were used to identify factors of possible infection risk. In order to model the risk of enterobiasis at individual level, a similarity-based machine learning and prediction software Constud was compared with data mining methods in the Statistica 8 Data Miner software package. Prevalence according to a single examination was 22.5%; the increase as a result of double examinations was 8.2%. Single swabs resulted in an estimated prevalence of 20.1% among children examined 3 times; double swabs increased this by 10.1%, and triple swabs by 7.3%. Random forest classification, boosting classification trees, and Constud correctly predicted about 2/3 of the results of the second examination. Constud estimated a mean prevalence of 31.5% in groups. Constud was able to yield the highest overall fit of individual-based predictions while boosting classification tree and random forest models were more effective in recognizing Enterobius positive persons. As a rule, the actual prevalence of enterobiasis is higher than indicated by a single examination. We suggest using either the values of the mean increase in prevalence after double examinations compared to single examinations or group estimations deduced from individual-level modelled risk predictions.

Land Cover Classification over East Asian Region Using Recent MODIS NDVI Data (2006-2008)

  • Kang, Jeon-Ho;Suh, Myoung-Seok;Kwak, Chong-Heum
    • Atmosphere
    • /
    • v.20 no.4
    • /
    • pp.415-426
    • /
    • 2010
  • A land cover map over the East Asian region (Kongju National University Land Cover map: KLC) is produced using a support vector machine (SVM) and evaluated with ground truth data. The basic input data are the most recent three years (2006-2008) of MODIS (MODerate resolution Imaging Spectroradiometer) NDVI (normalized difference vegetation index) data. The spatial resolution and temporal frequency of MODIS NDVI are 1 km and 16 days, respectively. To minimize the number of cloud-contaminated pixels in the MODIS NDVI data, the maximum value composite is applied to the 16-day data. A correction of cloud-contaminated pixels based on the spatiotemporal continuity assumption is then applied to the monthly NDVI data. To reduce the dataset and improve classification quality, nine phenological variables, such as NDVI maximum, amplitude, and average, are derived from the corrected monthly NDVI data. Three land cover maps (International Geosphere Biosphere Programme: IGBP, University of Maryland: UMd, and MODIS) were used to build a "quasi" ground truth data set, composed of pixels that all three maps assign to the same land cover type. The classification results show that the fractions of broadleaf trees and grasslands are greater, but those of croplands and needleleaf trees are smaller, than those of the IGBP or UMd maps. The validation results using an in-situ observation database show that the percentages of pixels in agreement with the observations are 80%, 77%, 63%, and 57% for the MODIS, KLC, IGBP, and UMd land cover data, respectively. The significant differences in land cover types among MODIS, IGBP, UMd, and KLC occur mainly in southern China and Manchuria, where most pixels are contaminated by cloud and snow during summer and winter, respectively. This shows that the quality of raw data is one of the most important factors in land cover classification.
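    The maximum value composite mentioned in this abstract can be illustrated with a short sketch. It assumes the usual definition (keep, for each pixel, the highest NDVI observed during the compositing period, since cloud contamination tends to lower NDVI). The grid layout, function name, and sample values are hypothetical.

```python
def maximum_value_composite(images):
    """Per-pixel maximum over a list of equally sized 2-D NDVI grids."""
    n_rows, n_cols = len(images[0]), len(images[0][0])
    return [
        [max(image[r][c] for image in images) for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Two hypothetical 16-day NDVI grids covering the same 2x2 area.
period_1 = [[0.6, 0.1], [0.3, 0.7]]   # pixel (0, 1) looks cloud-contaminated
period_2 = [[0.2, 0.5], [0.4, 0.1]]   # pixels (0, 0) and (1, 1) look contaminated
monthly = maximum_value_composite([period_1, period_2])
```

    Here `monthly` holds the larger of the two values at each pixel, so a low reading in one period is replaced by the clearer reading from the other.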