• Title/Summary/Keyword: Tree Compare

Optimization of Decision Tree for Classification Using a Particle Swarm

  • Cho, Yun-Ju;Lee, Hye-Seon;Jun, Chi-Hyuck
    • Industrial Engineering and Management Systems / v.10 no.4 / pp.272-278 / 2011
  • Decision trees are used successfully as a classification tool in many areas, such as medical diagnosis, customer churn prediction, and signal detection. The main advantage of decision tree classifiers is their ability to break down a complex structure into a collection of simpler structures, thus providing a solution that is easy to interpret. Because decision tree induction is a top-down, divide-and-conquer process, there is a risk of reaching a locally optimal solution. This paper proposes a procedure for optimally determining the thresholds of the chosen variables of a decision tree using adaptive particle swarm optimization (APSO). The proposed algorithm consists of two phases. First, we construct a decision tree and choose the relevant variables. Second, we find the optimum thresholds for those selected variables simultaneously using APSO. To validate the proposed algorithm, several artificial and real datasets are used. We compare our results with the original CART results and show that the proposed algorithm is promising for improving prediction accuracy.
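The second phase described in this abstract (searching for a split threshold with a particle swarm) can be illustrated with a minimal sketch. It assumes a plain, non-adaptive PSO, a single variable with a single threshold, and hand-made toy data; the function names, the inertia and acceleration values, and the data are all hypothetical and are not taken from the paper.

```python
import random

def accuracy(threshold, xs, ys):
    """Accuracy of the rule 'predict 1 if x > threshold, else 0'."""
    hits = sum((x > threshold) == bool(y) for x, y in zip(xs, ys))
    return hits / len(xs)

def pso_threshold(xs, ys, n_particles=10, n_iters=30, seed=0):
    """Search for one split threshold with a basic (non-adaptive) PSO."""
    rng = random.Random(seed)
    lo, hi = min(xs), max(xs)
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]                                  # personal best positions
    best_fit = [accuracy(p, xs, ys) for p in pos]      # personal best fitness
    g = max(range(n_particles), key=lambda i: best_fit[i])
    g_pos, g_fit = best_pos[g], best_fit[g]            # global best
    w, c1, c2 = 0.7, 1.5, 1.5                          # assumed PSO coefficients
    for _ in range(n_iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = (w * vel[i]
                      + c1 * r1 * (best_pos[i] - pos[i])
                      + c2 * r2 * (g_pos - pos[i]))
            # Keep the particle inside the observed range of the variable.
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))
            fit = accuracy(pos[i], xs, ys)
            if fit > best_fit[i]:
                best_pos[i], best_fit[i] = pos[i], fit
                if fit > g_fit:
                    g_pos, g_fit = pos[i], fit
    return g_pos, g_fit

# Toy data (hypothetical): class 1 tends to have larger x, with some overlap.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
threshold, fit = pso_threshold(xs, ys)
print(f"threshold={threshold:.3f}, training accuracy={fit:.3f}")
```

In the paper's setting the particle would presumably hold one threshold per selected variable and the fitness would be the accuracy of the whole tree; the one-dimensional case above only shows the update rule.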

Carbon and Nitrogen Distribution of Tree Components in Larix kaempferi Carriere and Quercus variabilis Blume Stands in Gyeongnam Province

  • Kim, Choonsig
    • Journal of Korean Society of Forest Science / v.108 no.2 / pp.139-146 / 2019
  • This study was conducted to determine the carbon (C) and nitrogen (N) distribution within tree components (i.e., stem, branches, leaves, and roots) of a Japanese larch (Larix kaempferi Carriere) plantation and natural oriental cork oak (Quercus variabilis Blume) stands. Fifteen Japanese larch and 15 oriental cork oak trees were destructively sampled to compare the C and N stocks in the tree components across three regions (Hadong-gun, Hamyang-gun, and Sancheong-gun) in Gyeongnam Province, South Korea. Species-specific allometric equations were developed to estimate the C and N contents in the tree components based on the diameter at breast height (DBH). There were differences in mean C and N concentrations between the Japanese larch and the oriental cork oak. The mean C concentrations of the tree components were significantly higher in Japanese larch than in oriental cork oak, whereas the N concentration in the stems was significantly lower in Japanese larch than in oriental cork oak. The allometric equations developed for C and N content were significant (p < 0.05), with a coefficient of determination ($R^2$) of 0.76 to 0.99. The C and N stocks in the tree components do not appear to be affected by species (Japanese larch plantations versus oriental cork oak stands). This study emphasizes the importance of C and N concentrations for estimating the C and N distribution among tree components in different tree species.
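The abstract does not state the form of its allometric equations. As a rough sketch, the following assumes the commonly used power-law form, content = a · DBH^b, fitted by ordinary least squares on log-transformed values. The function name and the numbers are hypothetical; the data are synthetic and not from the study.

```python
import math

def fit_allometric(dbh, content):
    """Least-squares fit of ln(content) = ln(a) + b * ln(dbh).

    Returns (a, b, r2), where r2 is computed in log space.
    """
    x = [math.log(d) for d in dbh]
    y = [math.log(c) for c in content]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    ln_a = my - b * mx
    ss_res = sum((yi - (ln_a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return math.exp(ln_a), b, r2

# Synthetic values generated from content = 0.05 * DBH**2.4 (assumed, for illustration).
dbh_cm = [10.0, 15.0, 20.0, 25.0, 30.0]
carbon_kg = [0.05 * d ** 2.4 for d in dbh_cm]
a, b, r2 = fit_allometric(dbh_cm, carbon_kg)
print(f"a={a:.4f}, b={b:.4f}, R^2={r2:.4f}")
```

Because the synthetic data follow the power law exactly, the fit recovers the generating coefficients; field data would scatter around the line and give a lower coefficient of determination.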

Comparative Study of Tokenizer Based on Learning for Sentiment Analysis (고객 감성 분석을 위한 학습 기반 토크나이저 비교 연구)

  • Kim, Wonjoon
    • Journal of Korean Society for Quality Management / v.48 no.3 / pp.421-431 / 2020
  • Purpose: The purpose of this study is to compare and analyze tokenizers used in natural language processing for sentiment analysis of customer satisfaction. Methods: A supervised learning-based tokenizer, Mecab-Ko, and an unsupervised learning-based tokenizer, SentencePiece, were compared. Three algorithms (Naïve Bayes, k-Nearest Neighbor, and Decision Tree) were selected to compare the performance of each tokenizer, using three metrics: accuracy, precision, and recall. Results: Performance evaluation and verification confirmed that SentencePiece shows better classification performance than Mecab-Ko. To confirm the robustness of the results, independent t-tests were conducted on the evaluation results for the two tokenizers. These confirmed that the classification performance of the SentencePiece tokenizer was high with the k-Nearest Neighbor and Decision Tree algorithms. In addition, the Decision Tree showed the highest accuracy of the three classification algorithms, by a slight margin. Conclusion: The SentencePiece tokenizer can be used to classify and interpret customer sentiment in Korean online reviews more accurately. It also appears possible to assign a specific meaning to short words or jargon that users often employ when evaluating products but that are not defined in advance.
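For reference, the three metrics named in this abstract can be computed from a confusion matrix as in the sketch below. It assumes a binary positive/negative sentiment label; the function name and the label vectors are made up for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when nothing is predicted or labeled positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Hypothetical labels: 1 = positive review, 0 = negative review.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 1, 1, 0]
acc, prec, rec = binary_metrics(y_true, y_pred)
print(acc, prec, rec)  # 0.625 0.6 0.75
```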

A Study on a car Insurance purchase Prediction Using Two-Class Logistic Regression and Two-Class Boosted Decision Tree

  • AN, Su Hyun;YEO, Seong Hee;KANG, Minsoo
    • Korean Journal of Artificial Intelligence / v.9 no.1 / pp.9-14 / 2021
  • This paper builds a model that predicts whether a customer will buy car insurance, based on primary health insurance customer data. Automobiles are now widely used for land transportation and daily life, and their scope of use and equipment is expanding. This rapid increase in automobiles has made automobile insurance an essential business target for insurance companies. If car insurance is sold on the basis of predictions made from existing health insurance customers' information, it can generate continuous profits for the insurance company. This paper therefore aims to analyze existing customer characteristics and implement a predictive model for targeting advertisements at customers interested in auto insurance. The goal of this study is to maximize the profits of insurance companies by devising communication strategies that can optimize business models and profits for customers. The study was conducted in Microsoft Azure, and an automobile insurance purchase prediction model was implemented using the Health Insurance Cross-sell Prediction data. Two-Class Logistic Regression and Two-Class Boosted Decision Tree were both trained, and their predictions were compared. According to the results, at a threshold of 0.3 the AUC is 0.837 and the accuracy is 0.833, indicating high accuracy. The results therefore suggest that customers with health insurance can be induced to respond positively to auto insurance purchases.
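The roles of the threshold and the AUC in this abstract can be shown with a small sketch. It assumes that a score at or above the threshold counts as a predicted purchase (the abstract does not say whether the comparison is inclusive) and computes AUC as the fraction of positive/negative pairs ranked correctly. The scores and labels are invented.

```python
def classify(scores, threshold=0.3):
    """Turn predicted probabilities into 0/1 labels (inclusive threshold assumed)."""
    return [1 if s >= threshold else 0 for s in scores]

def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # A correctly ordered pair counts 1, a tie counts 0.5, a misordered pair counts 0.
    wins = sum((1.0 if p > n else 0.5 if p == n else 0.0)
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical model outputs: 1 = customer bought car insurance.
y_true = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.35, 0.40, 0.20, 0.80, 0.25]
y_pred = classify(scores, threshold=0.3)
area = auc(y_true, scores)
print(y_pred, round(area, 4))  # [0, 1, 1, 0, 1, 0] 0.8889
```

Note that AUC depends only on the ranking of the scores, so it does not change with the threshold; accuracy does.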

Analysis of Bluetooth Scatternet Formation Algorithm over Mobile Networks (모바일 환경에서 블루투스 스캐터넷 형성 알고리즘 분석)

  • 이대봉;장종욱
    • Journal of the Korea Institute of Information and Communication Engineering / v.7 no.2 / pp.201-207 / 2003
  • A multi-hop ad hoc network can be formed out of Bluetooth-capable devices through the use of scatternets. Many topologies can be deployed for a scatternet; the most commonly proposed topologies in the literature are the tree structure and the ring structure. We analyze the performance of the tree formation algorithm and the ring formation algorithm. We then present quantitative results that compare the expected traffic performance of the Bluetooth ring scatternet and the Bluetooth tree scatternet. The simulation results show that the ring scatternet structure performs better than the tree structure in mobile environments.

Retargetable Intermediate Code Optimization System Using Tree Pattern Matching Techniques (트리패턴매칭기법의 재목적 가능한 중간코드 최적화 시스템)

  • Kim, Jeong-Suk;O, Se-Man
    • The Transactions of the Korea Information Processing Society / v.6 no.8 / pp.2253-2261 / 1999
  • ACK generates optimized code using a string pattern matching technique in its pattern table generator and peephole optimizer. However, string pattern matching is not effective because of the many comparisons required during pattern selection. We designed and implemented an EM intermediate code optimizer using a tree pattern matching algorithm, composed of an EM tree generator, an optimization pattern table generator, and a tree pattern matcher. The tree pattern matching algorithm traverses the EM tree top-down and, at each step, matches patterns centered on the root node by referring to the pattern table. Compared with the ACK string pattern matching method, we found that tree pattern matching improved the pattern selection time by about 10.8%.

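The top-down, root-centered matching described in the abstract above can be sketched as follows. This is a generic illustration, not the authors' optimizer: the tuple tree representation, the `?x` variable syntax, the opcodes, and the three rewrite rules are all hypothetical and are not real EM instructions.

```python
# Trees are tuples (operator, child, ...); leaves are strings or ints.
# Pattern variables are strings starting with "?".

def match(pattern, tree, env):
    """Match pattern against tree at the root, binding variables in env."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in env:
            return env[pattern] == tree
        env[pattern] = tree
        return True
    if isinstance(pattern, tuple):
        return (isinstance(tree, tuple) and len(tree) == len(pattern)
                and all(match(p, t, env) for p, t in zip(pattern, tree)))
    return pattern == tree

def substitute(template, env):
    """Build the replacement tree from a template and variable bindings."""
    if isinstance(template, str) and template.startswith("?"):
        return env[template]
    if isinstance(template, tuple):
        return tuple(substitute(t, env) for t in template)
    return template

# Hypothetical pattern table: (pattern, replacement) pairs.
PATTERN_TABLE = [
    (("add", "?x", 0), "?x"),               # x + 0  ->  x
    (("mul", "?x", 1), "?x"),               # x * 1  ->  x
    (("mul", "?x", 2), ("shl", "?x", 1)),   # x * 2  ->  x << 1
]

def optimize(tree):
    """Rewrite at the root until no pattern applies, then descend to children."""
    changed = True
    while changed:
        changed = False
        for pattern, template in PATTERN_TABLE:
            env = {}
            if match(pattern, tree, env):
                tree = substitute(template, env)
                changed = True
                break
    if isinstance(tree, tuple):
        return (tree[0],) + tuple(optimize(child) for child in tree[1:])
    return tree

print(optimize(("mul", ("add", "a", 0), 2)))  # ('shl', 'a', 1)
```

Because each pattern is compared against the operator at the current root first, most table entries are rejected after one comparison, which is the kind of saving over linear string matching that the abstract points to.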

The Design & Analysis of Time-Stamp Protocol with H-Binary Tree (H-이진트리체제의 타임스탬프 프로토콜 분석)

  • 정경원;도경화;전문석
    • The Journal of Society for e-Business Studies / v.7 no.2 / pp.157-171 / 2002
  • We seek a timestamping method that offers both efficient performance and a high level of security for sending secured messages under digital signatures and e-commerce law. This paper presents an H-binary tree of timestamps for use in a timestamp protocol with high security and performance when sending message packets. We implement and analyze the protocol and compare it with previous RSA methods. The proposed protocol has O(log n) time complexity and high performance.

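The abstract above does not define the H-binary tree, so the sketch below substitutes a generic Merkle hash tree, a different and well-known construction, purely to show why a binary-tree timestamp can be verified in O(log n) steps. The function names and the use of SHA-256 are assumptions, not details from the paper.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(leaves):
    """Return all tree levels, from the hashed leaves up to the single root."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                   # odd count: pair the last node with itself
            level = level + [level[-1]]
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def proof(levels, index):
    """Collect one sibling hash per level, so the proof has O(log n) entries."""
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        index //= 2
    return path

def verify(leaf, path, root):
    """Recompute the root from a leaf and its sibling path."""
    node = h(leaf)
    for sibling, sibling_is_left in path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

# Hypothetical batch of eight timestamp requests collected in one round.
requests = [b"request-%d" % i for i in range(8)]
levels = build_levels(requests)
root = levels[-1][0]
path = proof(levels, 5)
print(len(path), verify(requests[5], path, root))  # 3 True
```

With eight requests the proof holds three hashes (log2 of 8), and doubling the batch size adds only one more hash to each proof.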

Tree-based Deployment Algorithm in Mobile Sensor Networks (이동 센서 네트워크에서 트리 기반의 배치 알고리즘)

  • Moon, Chong-Chun;Park, Jae-Hyun
    • Journal of Institute of Control, Robotics and Systems / v.12 no.11 / pp.1138-1143 / 2006
  • Sensor deployment is an important issue in mobile wireless sensor networks. In this paper, we propose a deployment algorithm for mobile sensor networks that spreads mobile sensor nodes out widely as well as regularly. Since the proposed algorithm uses a tree topology in deploying the sensor nodes, computing power as well as spreading speed can be reduced compared to other deployment algorithms. The performance of the proposed algorithm is simulated and demonstrated using the NS-2 simulator.

Research on improving correctness of cardiac disorder data classifier by applying Best-First decision tree method (Best-First decision tree 기법을 적용한 심전도 데이터 분류기의 정확도 향상에 관한 연구)

  • Lee, Hyun-Ju;Shin, Dong-Kyoo;Park, Hee-Won;Kim, Soo-Han;Shin, Dong-Il
    • Journal of Internet Computing and Services / v.12 no.6 / pp.63-71 / 2011
  • Cardiac disorder data are generally tested using a classifier, and the QRS complex and R-R interval used in this experiment are commonly extracted from ECG (electrocardiogram) signals. Experiments on ECG data are generally performed with SVM (Support Vector Machine) and MLP (Multilayer Perceptron) classifiers, but to improve accuracy this study experimented with the Best-First Decision Tree (B-F Tree), derived from the Decision Tree, among Random Forest classifier algorithms. To compare and analyze accuracy, experiments with SVM, MLP, RBF (Radial Basis Function) Network, and Decision Tree classifiers were performed, and the results were also compared with those of published papers that used the same interval and data. Comparing the accuracy of the Random Forest classifier with the above four, Random Forest was the most accurate. Although the R-R interval was extracted using a band-pass filter in the pre-processing of this experiment, further study of filters is needed to extract the interval accurately.

Heart Disease Prediction Using Decision Tree With Kaggle Dataset

  • Noh, Young-Dan;Cho, Kyu-Cheol
    • Journal of the Korea Society of Computer and Information / v.27 no.5 / pp.21-28 / 2022
  • Cardiovascular illness refers to all health problems that occur in the circulatory system, such as heart and vascular diseases. Deaths from cardiovascular disorders accounted for one third of total deaths worldwide in 2019, and the number continues to rise. If diseases with high mortality rates can be predicted from patient data with an AI system, they can be detected and treated in advance. In this study, models are built to predict heart disease, one of the cardiovascular diseases, and their performance is compared in terms of accuracy, precision, and recall, with a description of how to improve the performance of the Decision Tree. Decision Tree, KNN (K-Nearest Neighbor), SVM (Support Vector Machine), and DNN (Deep Neural Network) are used. Experiments were conducted in Python with the scikit-learn, Keras, and TensorFlow libraries in Jupyter Notebook on macOS Big Sur. In the comparison, the Decision Tree demonstrated the highest performance; thus, this study recommends using the Decision Tree.