• Title/Summary/Keyword: Decision Tree Classification

A Comparative Study of Medical Data Classification Methods Based on Decision Tree and System Reconstruction Analysis

  • Tang, Tzung-I;Zheng, Gang;Huang, Yalou;Shu, Guangfu;Wang, Pengtao
    • Industrial Engineering and Management Systems / v.4 no.1 / pp.102-108 / 2005
  • This paper compares decision tree and system reconstruction analysis methods for classifying medical data, as applied to heart disease data mining. The data were collected from patients with coronary heart disease and comprise 1,723 records of 71 attributes each. We use the system reconstruction method to weight the data, and then apply decision tree algorithms including ID3, C4.5, classification and regression tree (CART), chi-square automatic interaction detector (CHAID), and exhaustive CHAID. We compare the algorithms by correct classification rate, number of leaves, and tree depth. The experiments show that weighting the data improves the correct classification rate on the coronary heart disease data but has little effect on tree depth or number of leaves.
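
The algorithms named in this abstract differ mainly in how they score a candidate split: ID3 uses information gain (entropy reduction), CART typically uses Gini impurity for classification, and CHAID uses chi-square tests. The sketch below computes the first two criteria on a toy split. The labels and the helper names (`entropy`, `gini`, `split_gain`) are illustrative assumptions, not data or code from the paper.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_gain(parent, children, impurity):
    """Impurity reduction obtained by splitting `parent` into `children`."""
    n = len(parent)
    weighted = sum(len(ch) / n * impurity(ch) for ch in children)
    return impurity(parent) - weighted

# Hypothetical toy labels (not taken from the paper's dataset).
parent = ["sick"] * 4 + ["healthy"] * 4
children = [["sick"] * 3 + ["healthy"], ["sick"] + ["healthy"] * 3]

info_gain = split_gain(parent, children, entropy)  # ID3-style criterion
gini_gain = split_gain(parent, children, gini)     # CART-style criterion
print(f"information gain: {info_gain:.4f}, Gini gain: {gini_gain:.4f}")
```

Both criteria reward the same split here; they can rank candidate splits differently on less balanced data, which is one reason the resulting trees may differ in depth and leaf count.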

Hybridized Decision Tree methods for Detecting Generic Attack on Ciphertext

  • Alsariera, Yazan Ahmad
    • International Journal of Computer Science & Network Security / v.21 no.7 / pp.56-62 / 2021
  • The surge in generic attacks against ciphertext on computer networks has driven continuous advancement of the mechanisms that protect information integrity and confidentiality. An explicit decision tree machine learning algorithm is reported to classify generic attacks more accurately than some multi-classification algorithms, because multi-classification methods suffer from detection oversight. However, accuracy still needs to improve and the false alarm rate needs to fall. This study therefore aims to improve generic attack classification by implementing two hybridized decision tree algorithms, namely the Naïve Bayes decision tree (NBTree) and the logistic model tree (LMT). The proposed hybridized methods were developed using 10-fold cross-validation to avoid overfitting. The generic attack detector achieved 99.8% accuracy, a false positive rate (FPR) of 0.002, and a Matthews correlation coefficient (MCC) of 0.995. The proposed methods performed better than the existing decision tree method and also outperformed multi-classification methods for detecting generic attacks. Hence, hybridized decision tree methods are recommended for detecting generic attacks on a computer network.
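
The three scores reported in this abstract (accuracy, FPR, MCC) are all functions of a binary confusion matrix. The sketch below shows the standard formulas. The counts are hypothetical and the function name `binary_metrics` is an assumption for illustration; neither comes from the paper.

```python
from math import sqrt

def binary_metrics(tp, fp, tn, fn):
    """Accuracy, false positive rate, and Matthews correlation coefficient."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # FPR: share of actual negatives that were flagged as attacks.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # MCC: correlation between predicted and actual labels, in [-1, 1].
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "fpr": fpr, "mcc": mcc}

# Hypothetical confusion-matrix counts, chosen only to exercise the formulas.
m = binary_metrics(tp=90, fp=5, tn=95, fn=10)
print({name: round(value, 4) for name, value in m.items()})
```

MCC is often reported alongside accuracy because it stays informative when the attack and non-attack classes are imbalanced.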

Classification Accuracy Improvement for Decision Tree (의사결정트리의 분류 정확도 향상)

  • Rezene, Mehari Marta;Park, Sanghyun
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.787-790 / 2017
  • Data quality is a central issue in classification problems: noisy instances in the training dataset generally prevent robust classification performance. Such instances may cause the generated decision tree to overfit and its accuracy to decrease. Decision trees are useful, efficient, and commonly used for solving real-world classification problems in data mining. In this paper, we introduce a preprocessing technique to improve the classification accuracy of the C4.5 decision tree algorithm. In the proposed preprocessing method, we apply the naive Bayes classifier to remove noisy instances from the training dataset. We applied the proposed method to a real e-commerce sales dataset and compared its performance against the existing C4.5 decision tree classifier. In the experiments, the proposed method improved classification accuracy by 8.5% when evaluated on the training dataset and by 14.32% under 10-fold cross-validation.

Two-Stage Decision Tree Analysis for Diagnosis of Personal Sasang Constitution Medicine Type (사상체질 판별을 위한 2단계 의사결정 나무 분석)

  • Jin, Hee-Jeong;Lee, Hae-Jung;Kim, Myoung-Geun;Kim, Hong-Gie;Kim, Jong-Yeol
    • Journal of Sasang Constitutional Medicine / v.22 no.3 / pp.87-97 / 2010
  • 1. Objectives: In Sasang constitutional medicine (SCM), a person's Sasang constitution must be determined accurately before any Sasang treatment. The purpose of this study is to develop an objective method for classifying Sasang constitution. 2. Methods: We collected samples from 5 centers where SCM is practiced and applied two-stage decision tree analysis to them. The collected data were from subjects whose response to herbal medicine was confirmed according to Sasang constitution. 3. Results: The two-stage decision tree model shows higher classification power than a simple decision tree model. This study also suggests that gender must be considered in the first stage to improve classification accuracy. 4. Conclusions: We identified important factors for classifying Sasang constitutions through two-stage decision tree analysis. The two-stage decision tree model shows higher classification power than a simple decision tree model.

Ensemble of Fuzzy Decision Tree for Efficient Indoor Space Recognition

  • Kim, Kisang;Choi, Hyung-Il
    • Journal of the Korea Society of Computer and Information / v.22 no.4 / pp.33-39 / 2017
  • In this paper, we extend the classification process to an ensemble of fuzzy decision trees. For indoor space recognition, many studies use the Boosted Tree, which combines AdaBoost with decision trees. The Boosted Tree extracts an optimal decision tree in stages. At each stage, it selects a good decision tree by minimizing the weighted classification error. Such a decision tree makes hard decisions. In most cases, hard decisions introduce errors when classifying samples near a dividing point. Therefore, we propose an ensemble of fuzzy decision trees, which adds flexibility to the Boosted Tree algorithm while keeping high performance. In experiments, the proposed method improved accuracy by about 13% over the traditional one.
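
To illustrate the contrast between hard and fuzzy decisions near a dividing point, the sketch below compares a crisp threshold test with a soft one. The sigmoid membership function, the `softness` parameter, and the function names are assumptions made for illustration; the abstract does not specify the membership function the authors use.

```python
from math import exp

def hard_split(x, threshold):
    """Crisp decision: the sample goes entirely to one branch (left, right)."""
    return (1.0, 0.0) if x <= threshold else (0.0, 1.0)

def fuzzy_split(x, threshold, softness=1.0):
    """Soft decision: membership degrees for (left, right) that sum to 1.

    A sigmoid centred on the threshold is assumed here; larger `softness`
    values widen the transition region around the dividing point.
    """
    right = 1.0 / (1.0 + exp(-(x - threshold) / softness))
    return (1.0 - right, right)

# Two hypothetical feature values just on either side of the threshold 5.0.
for x in (4.9, 5.1):
    print(x, hard_split(x, 5.0), fuzzy_split(x, 5.0))
```

The hard split sends the two nearly identical values to opposite branches, while the fuzzy split gives both roughly equal membership in each branch, so a small measurement error near the boundary no longer flips the outcome.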

Feature Selection and Hyper-Parameter Tuning for Optimizing Decision Tree Algorithm on Heart Disease Classification

  • Tsehay Admassu Assegie;Sushma S.J;Bhavya B.G;Padmashree S
    • International Journal of Computer Science & Network Security / v.24 no.2 / pp.150-154 / 2024
  • In recent years, there has been extensive research on applying machine learning to automation and decision support for medical experts during disease detection. However, the performance of machine learning still needs to improve so that models produce more accurate and reliable results for disease detection. Selecting the hyper-parameters that yield the highest possible classification accuracy on a medical dataset is the most challenging task in developing machine-learning-based decision support systems for medical dataset classification. Selecting the features that best characterize a disease is a further challenge in developing a machine learning model with better classification accuracy. In this study, we propose an optimized decision tree model for heart disease classification, using a heart disease dataset collected from the Kaggle data repository. We evaluate the proposed model, and the experiments show that decision tree performance improves when an optimal number of features is used for training. Overall, the proposed decision tree model achieves 98.2% accuracy for heart disease classification.

Diagnostic Classification Scheme in Iranian Breast Cancer Patients using a Decision Tree

  • Malehi, Amal Saki
    • Asian Pacific Journal of Cancer Prevention / v.15 no.14 / pp.5593-5596 / 2014
  • Background: The objective of this study was to determine a diagnostic classification scheme using a decision-tree-based model. Materials and Methods: The study was conducted as a retrospective case-control study at Imam Khomeini hospital in Tehran from 2001 to 2009. Data, including demographic and clinical-pathological characteristics, were uniformly collected from 624 women, of whom 312 had been referred with a positive diagnosis of breast cancer (cases) and 312 were healthy (controls). The decision tree was implemented to develop a diagnostic classification scheme using CART 6.0 software. The area under the curve (AUC) was used to measure the overall diagnostic classification performance of the decision tree. Results: Five variables were identified as main risk factors for breast cancer, and six subgroups were identified as high risk. The results indicated that increasing age, low age at menarche, single or divorced status, irregular menarche pattern, and family history of breast cancer are the important diagnostic factors in Iranian breast cancer patients. The sensitivity and specificity of the analysis were 66% and 86.9%, respectively. The high AUC (0.82) also indicated excellent classification and diagnostic performance of the model. Conclusions: A decision-tree-based model appears to be suitable for identifying risk factors and high- or low-risk subgroups. It can also assist clinicians in making decisions, since it can identify underlying prognostic relationships and the model is explicit and easy to understand.
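
The sensitivity, specificity, and AUC reported here can be computed from true labels and model scores as sketched below. The labels, scores, and 0.5 cut-off are hypothetical, and the pairwise AUC formula is the standard rank-based definition, not necessarily the procedure used by the CART 6.0 software.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity (true positive rate) and specificity (true negative rate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, scores):
    """Probability that a random case scores higher than a random control."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Ties between a case and a control count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = case, 0 = control) and predicted risk scores.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={area:.3f}")
```

Sensitivity and specificity depend on the chosen cut-off, whereas AUC summarizes ranking quality across all cut-offs, which is why the three figures are reported together.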

Comparison of Performance Measures for Credit-Card Delinquents Classification Models : Measured by Hit Ratio vs. by Utility (신용카드 연체자 분류모형의 성능평가 척도 비교 : 예측률과 유틸리티 중심으로)

  • Chung, Suk-Hoon;Suh, Yong-Moo
    • Journal of Information Technology Applications and Management / v.15 no.4 / pp.21-36 / 2008
  • As the turmoil caused by credit card abuse in Korea stabilizes, credit card companies need to interpret credit-card delinquent classification models from the viewpoint of profit. However, the hit ratio, which has been used to measure the goodness of classification models, only indicates how accurately a model classifies, not how much profit can be obtained by using it. In this research, we developed a new utility-based measure from the viewpoint of profit and used it to analyze two classification models (neural network and decision tree models). We found that the hit ratio of the neural network model is higher than that of the decision tree model, but the utility value of the decision tree model is higher than that of the neural network model. This experiment shows the importance of a utility-based measure for credit-card delinquent classification models. We expect this new measure to contribute to increasing the profits of credit card companies.
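
The gap between hit ratio and utility can be illustrated with a payoff-weighted confusion matrix. Everything below is a hypothetical sketch: the payoff values, the confusion-matrix counts, and the function names are invented for illustration, and the paper's actual utility definition is not given in the abstract.

```python
def hit_ratio(confusion):
    """Fraction of customers classified correctly."""
    total = sum(confusion.values())
    correct = confusion[("good", "good")] + confusion[("bad", "bad")]
    return correct / total

def total_utility(confusion, payoff):
    """Sum of per-customer payoffs over every (actual, predicted) cell."""
    return sum(count * payoff[cell] for cell, count in confusion.items())

# Keys are (actual, predicted). Assumed payoffs: a good customer who is kept
# earns 1 unit, a delinquent who is kept costs 10 units, rejections earn 0.
payoff = {("good", "good"): 1, ("good", "bad"): 0,
          ("bad", "good"): -10, ("bad", "bad"): 0}

# Hypothetical results for two models on the same 1,000 customers.
model_a = {("good", "good"): 900, ("good", "bad"): 20,
           ("bad", "good"): 40, ("bad", "bad"): 40}
model_b = {("good", "good"): 860, ("good", "bad"): 60,
           ("bad", "good"): 15, ("bad", "bad"): 65}

for name, cm in (("A", model_a), ("B", model_b)):
    print(name, hit_ratio(cm), total_utility(cm, payoff))
```

Under these assumed payoffs, model A has the higher hit ratio but model B has the higher utility, because missing a delinquent is far more costly than rejecting a good customer.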

Evaluation Method of College English Education Effect Based on Improved Decision Tree Algorithm

  • Dou, Fang
    • Journal of Information Processing Systems / v.18 no.4 / pp.500-509 / 2022
  • With the rapid development of educational informatization, teaching methods have become diversified, but the large volume of information data restricts the evaluation of teaching subjects and objects with respect to the effect of English education. Therefore, this study adopts the concept of incremental learning and an eigenvalue interval algorithm to improve the weighted decision tree, and builds an English education effect evaluation model based on association rules. According to the results, the improved decision tree algorithm achieves an average information classification accuracy of 96.18%, its classification error rate can be as low as 0.02%, and it resists overfitting well. The classification error rates of the improved and original decision tree algorithms differ by no more than 1%. The proposed educational evaluation method can provide effective early warning in the analysis of students' academic situations, accelerate the improvement of teachers' professional skills, and help perfect the education system.

Classification Method of Congestion Change Type for Efficient Traffic Management (효율적인 교통관리를 위한 혼잡상황변화 유형 분류기법 개발)

  • Shim, Sangwoo;Lee, Hwanpil;Lee, Kyujin;Choi, Keechoo
    • International Journal of Highway Engineering / v.16 no.4 / pp.127-134 / 2014
  • PURPOSES: To operate a more efficient traffic management system, it is of utmost importance to detect changes in the congestion level on a freeway segment rapidly and reliably. This study aims to develop a method for classifying congestion change types. METHODS: This research proposes two classification methods that capture changes in the congestion level on freeway segments using dedicated short range communication (DSRC) data and vehicle detection system (VDS) data. To develop the classification methods, decision tree models were employed in which the dependent variable is the change in congestion level and the covariates are the DSRC and VDS data collected from freeway segments in Korea. RESULTS: The comparison shows that the decision tree model with DSRC data performs better than the decision tree model with VDS data. Specifically, the better-fitting decision tree model using DSRC data achieves approximately 95% accuracy. CONCLUSIONS: The congestion change types classified by the decision tree models are expected to play an important role in future freeway traffic management strategies.