• Title/Summary/Keyword: C4.5 decision tree


A Comparative Study on The Effective Use of Decision Tree Algorithms (의사결정 트리의 효용성 제고 방안에 관한 비교 연구)

  • Sug, Hyon-Tai
    • Proceedings of the Korean Society of Computer Information Conference / 2009.01a / pp.321-324 / 2009
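  • We proposed sampling of an appropriate size as a way to generate decision trees that are relatively small yet have satisfactory predictive power. In general, the smaller the sample, the smaller the resulting decision tree. When a small tree with adequate prediction accuracy is desired, sampling at a suitable size is therefore preferable to performing additional computation to optimize the tree. This was shown experimentally using C4.5 and CART, the most representative decision tree generation algorithms currently known.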


Analysis of Leaf Node Ranking Methods for Spatial Event Prediction (의사결정트리에서 공간사건 예측을 위한 리프노드 등급 결정 방법 분석)

  • Yeon, Young-Kwang
    • Journal of the Korean Association of Geographic Information Studies / v.17 no.4 / pp.101-111 / 2014
  • Spatial events can be predicted using data mining classification algorithms, and decision trees are one of the representative ones. They have normally been used for classification tasks with labeled class values; with rule ranking methods, however, they can also be applied to spatial prediction problems. This paper compares rule ranking methods for spatial prediction using a decision tree. For the comparison experiment, the C4.5 decision tree algorithm and the rule ranking methods Laplace, M-estimate, and m-branch were implemented. Landslides, a representative spatial event occurring in the natural environment, served as the case study. In the accuracy evaluation, m-branch showed better accuracy than the other methods. However, m-branch and M-estimate required an additional time-consuming procedure to search for optimal parameter values, so the methods can be used selectively depending on the application area. Spatial prediction using a decision tree can be used not only for spatial prediction but also for causal analysis at specific event occurrence locations.
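
The Laplace and M-estimate rankings named in the abstract above can be sketched with their usual textbook formulas. This is a minimal illustration, not the paper's implementation: the leaf counts, the prior of 0.3, and m = 10 are made-up assumptions, and m-branch is omitted because it needs the whole path from root to leaf.

```python
def laplace_score(n_event: int, n_total: int, n_classes: int = 2) -> float:
    """Laplace-corrected probability of the event class at a leaf."""
    return (n_event + 1) / (n_total + n_classes)


def m_estimate_score(n_event: int, n_total: int, prior: float, m: float) -> float:
    """M-estimate: shrink the leaf frequency toward the class prior with weight m."""
    return (n_event + m * prior) / (n_total + m)


# Hypothetical leaves: name -> (event examples at the leaf, all examples at the leaf)
leaves = {"A": (3, 3), "B": (40, 50), "C": (1, 20)}
PRIOR = 0.3  # assumed overall event rate
M = 10.0     # assumed smoothing weight; in practice it has to be tuned

by_laplace = sorted(leaves, key=lambda k: laplace_score(*leaves[k]), reverse=True)
by_m = sorted(leaves, key=lambda k: m_estimate_score(*leaves[k], PRIOR, M), reverse=True)
print(by_laplace, by_m)
```

With these made-up counts the tiny pure leaf A ranks first under Laplace but drops behind the larger leaf B under the M-estimate, which is the kind of difference a rule ranking comparison looks at.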

Pridict of Liver cirrhosis susceptibility using Decision tree with SNP (Decision Tree와 SNP정보를 이용한 간경화 환자의 감수성 예측)

  • Kim, Dong-Hoi;Uhmn, Saang-Yong;Cho, Sung-Won;Ham, Ki-Baek;Kim, Jin
    • Proceedings of the Korean Information Science Society Conference / 2006.10a / pp.63-66 / 2006
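  • In this paper, we used decision trees to predict susceptibility to liver cirrhosis from SNP data. The data came from 116 subjects in total, liver cirrhosis patients and normal controls, and 28 SNPs closely associated with liver disease served as features. In the experiment, the classification rate of each SNP was first measured with a decision tree. SNPs were then combined starting from the one with the highest classification rate, and the accuracy of distinguishing cirrhosis from normal was measured with a C4.5 decision tree under leave-one-out cross validation. As a result, the combination of four liver-disease-related SNPs, IL1RN-S130S, IRNGR2-Q64R, IL-10(-592), and IL1B_S35S, achieved an accuracy of 65.52%.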
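
One reading of the procedure in this abstract is: score each feature alone with leave-one-out cross validation, then add features in that order and keep the best prefix. The sketch below follows that reading under stated assumptions. The classifier is a simple lookup-table vote standing in for C4.5, and the function names and toy data are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def table_predict(train_X: List[tuple], train_y: List[str], x: tuple) -> str:
    """Stand-in classifier (not C4.5): vote among identical training rows, else overall majority."""
    matches = [label for row, label in zip(train_X, train_y) if row == x]
    pool = matches if matches else train_y
    return Counter(pool).most_common(1)[0][0]


def loocv_accuracy(X: Sequence[tuple], y: Sequence[str], cols: Sequence[int]) -> float:
    """Leave-one-out accuracy using only the feature columns in cols."""
    proj = [tuple(row[c] for c in cols) for row in X]
    hits = 0
    for i in range(len(proj)):
        train_X = proj[:i] + proj[i + 1:]
        train_y = list(y[:i]) + list(y[i + 1:])
        hits += table_predict(train_X, train_y, proj[i]) == y[i]
    return hits / len(proj)


def greedy_prefix_selection(X: Sequence[tuple], y: Sequence[str]) -> Tuple[List[int], float]:
    """Rank features by single-feature accuracy, then keep the best-scoring prefix."""
    order = sorted(range(len(X[0])), key=lambda c: loocv_accuracy(X, y, [c]), reverse=True)
    best_cols: List[int] = []
    best_acc = 0.0
    for k in range(1, len(order) + 1):
        acc = loocv_accuracy(X, y, order[:k])
        if acc > best_acc:  # strict: ties keep the smaller subset
            best_cols, best_acc = order[:k], acc
    return best_cols, best_acc


# Toy genotype-like data: column 0 determines the label, column 1 is noise.
X = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")] * 2
y = ["case" if row[0] == "a" else "control" for row in X]
best_cols, best_acc = greedy_prefix_selection(X, y)
print(best_cols, best_acc)
```

Swapping `table_predict` for a real decision tree learner would bring the sketch closer to the described experiment; the selection loop itself would stay the same.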


Effective Diagnostic Method Of Breast Cancer Data Using Decision Tree (Decision Tree를 이용한 효과적인 유방암 진단)

  • Jung, Yong-Gyu;Lee, Seung-Ho;Sung, Ho-Joong
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.10 no.5 / pp.57-62 / 2010
  • Recently, decision tree techniques have been studied in the medical field for fast searching and extraction of massive data. Many techniques have been developed, such as CART, C4.5, and CHAID, which belong to the same family of decision tree classification algorithms, but these methods can put the remaining data at risk through binary splitting during the procedure. In brief, the C4.5 method builds a decision tree from entropy levels, whereas the CART method uses an entropy matrix on categorical or continuous data. We therefore compared the C4.5 and CART methods, which belong to the same family, on breast cancer data to evaluate their respective performance. To confirm the accuracy of the results, we cross-validated them.
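
As background for the split criteria mentioned here, the sketch below computes entropy and the gain ratio that C4.5 is generally described as using, plus the Gini impurity commonly associated with CART. The Gini part is an assumption on my side, since the abstract words the CART criterion differently. The toy labels are made up.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy in bits of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Information gain of splitting on `values`, divided by the split information."""
    n = len(labels)
    groups: dict = {}
    for v, label in zip(values, labels):
        groups.setdefault(v, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(values)  # entropy of the partition sizes
    return gain / split_info if split_info > 0 else 0.0


def gini(labels: Sequence[Hashable]) -> float:
    """Gini impurity of a label sequence (commonly associated with CART)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


labels = ["benign", "benign", "malignant", "malignant"]
perfect = ["low", "low", "high", "high"]    # attribute that separates the classes
useless = ["low", "high", "low", "high"]    # attribute unrelated to the classes
print(entropy(labels), gini(labels), gain_ratio(perfect, labels), gain_ratio(useless, labels))
```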

Development of Decision Tree Software and Protein Profiling using Surface Enhanced Laser Desorption/Ionization - Time of Flight - Mass Spectrometry (SELDI-TOF-MS) in Papillary Thyroid Cancer (의사결정트리 프로그램 개발 및 갑상선유두암에서 질량분석법을 이용한 단백질 패턴 분석)

  • Yoon, Joon-Kee;Lee, Jun;An, Young-Sil;Park, Bok-Nam;Yoon, Seok-Nam
    • Nuclear Medicine and Molecular Imaging / v.41 no.4 / pp.299-308 / 2007
  • Purpose: The aim of this study was to develop bioinformatics software and to test it on serum samples of papillary thyroid cancer using mass spectrometry (SELDI-TOF-MS). Materials and Methods: The 'Protein analysis' software, which performs decision tree analysis, was developed by customizing C4.5. Sixty-one serum samples from 27 patients with papillary thyroid cancer, 17 with autoimmune thyroiditis, and 17 controls were applied to 2 types of protein chips, CM10 (weak cation exchange) and IMAC3 (metal binding - Cu). Mass spectrometry was performed to reveal the protein expression profiles. Decision trees were generated with the 'Protein analysis' software, which automatically detected biomarker candidates. Validation analysis was performed for the CM10 chip by random sampling. Results: Decision tree software that can perform training and validation from profiling data was developed. For the CM10 and IMAC3 chips, 23 of 113 and 8 of 41 protein peaks, respectively, were significantly different among the 3 groups (p<0.05). The decision tree correctly classified the 3 groups with an error rate of 3.3% for CM10 and 2.0% for IMAC3, and 4 and 7 biomarker candidates were detected, respectively. In 2-group comparisons, all cancer samples were correctly discriminated from non-cancer samples (error rate = 0%), for CM10 by a single node and for IMAC3 by multiple nodes. Validation results from 5 test sets showed that SELDI-TOF-MS and the decision tree correctly differentiated cancers from non-cancers (54/55, 98%), while predictability was moderate in 3-group classification (36/55, 65%). Conclusion: Our in-house software successfully built decision trees and detected biomarker candidates, so it could be useful for biomarker discovery and clinical follow-up of papillary thyroid cancer.

Design of a Hopeful Career Forecasting Program for the Career Education (진로교육을 위한 희망진로 예측프로그램 설계)

  • Kim, Geun-Ho;Kim, Eui-Jeong
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.8 / pp.1055-1060 / 2018
  • In the wake of the 4th Industrial Revolution, career education in schools has become a major issue. While various studies are being conducted on services and technologies that handle artificial intelligence and big data effectively, in the field of education, data on students is still only processed in simple ways. In this paper, we therefore design and present a career prediction program for students using artificial intelligence and big data. Using observational data on students at an institute, a decision tree is constructed with the C4.5 algorithm, which is regarded as one of the most effective decision tree algorithms, and is used to predict students' desired career paths. As a result, the kappa coefficient exceeded 0.7 and the average error was fairly low, at about 0.1. Building on this study, further studies and data are expected to support student counseling and to offer guidance on classroom attitude and direction.
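
The kappa coefficient reported above measures agreement between predicted and actual classes beyond what chance would give. The sketch below assumes Cohen's kappa is the statistic meant, which the abstract does not state explicitly, and the career labels are made up for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(actual: Sequence[Hashable], predicted: Sequence[Hashable]) -> float:
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(actual)
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    # Chance agreement from the marginal class frequencies of both sequences.
    expected = sum(actual_counts[c] * predicted_counts[c] for c in actual_counts) / (n * n)
    if expected == 1.0:  # both sequences use a single identical class
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical desired-career labels for four students.
actual = ["engineer", "engineer", "teacher", "teacher"]
predicted = ["engineer", "engineer", "teacher", "engineer"]
kappa = cohen_kappa(actual, predicted)
print(kappa)
```

Here three of four predictions match (observed agreement 0.75) while chance agreement is 0.5, so kappa is 0.5, well below the accuracy figure.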

A Study on the Combined Decision Tree(C4.5) and Neural Network Algorithm for Classification of Mobile Telecommunication Customer (이동통신고객 분류를 위한 의사결정나무(C4.5)와 신경망 결합 알고리즘에 관한 연구)

  • 이극노;이홍철
    • Journal of Intelligence and Information Systems / v.9 no.1 / pp.139-155 / 2003
  • This paper presents a new methodology for analyzing and classifying customer patterns in the mobile telecommunication market, based on decision trees and neural networks, to improve the prediction of credit information. By applying the variable selection process of the decision tree, a systematic process for defining the input vector's values and for rule generation was developed. From a customer management standpoint, this research analyzes current customers and derives their patterns so that the company can maintain good customer relationships and give special management in advance to customers with a high potential of leaving their contracts. A real implementation of the proposed method shows that its prediction accuracy is higher than that of existing methods such as decision trees (CART, C4.5), regression, neural networks, and a combined model (CART and NN).


A Study on Factors of Education's Outcome using Decision Trees (의사결정트리를 이용한 교육성과 요인에 관한 연구)

  • Kim, Wan-Seop
    • Journal of Engineering Education Research / v.13 no.4 / pp.51-59 / 2010
  • To manage university lectures efficiently and improve educational outcomes, a process is needed that diagnoses the current educational outcome of each class and finds the factors behind it. Most studies seeking the factors of effective lectures use statistical methods such as association analysis and regression analysis, and decision tree analysis has recently been employed as well. Decision tree analysis has the merits that the resulting model is easy to understand and easy to apply to decision making, but it has the weakness of not being robust to characteristics of the input data such as multicollinearity. This paper points out these weaknesses and suggests an experimental solution that uses multiple decision tree algorithms to compensate for them. The experimental results show that the suggested method is more effective in finding reliable factors of educational outcome.


Analysis on the Enemy's Main Strike Direction Using Decision Tree (의사결정트리를 이용한 적 주타격 방향 분석)

  • Kim, Moo-Soo;Park, Gun-Woo;Lee, Sang-Hoon
    • Proceedings of the Korean Information Science Society Conference / 2012.06b / pp.66-68 / 2012
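  • The enemy's main strike direction is one of the key decisions of the enemy commander. If the factors that influence it can be analyzed and predicted, more favorable conditions can be created in war. At present, however, the military mainly relies on the experience of analysts and commanders, rather than on scientific analysis methods, to analyze the enemy's main strike direction. In this paper, we therefore analyzed the decision maps of North Korean commanders using the C4.5 decision tree algorithm, a representative data mining method. Using the derived classification rules, we also identified the factors influencing the enemy's main strike direction and predicted the relationships among these factors and their degree. The analysis yielded meaningful information similar to the information currently used by the military.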

Intelligent Service Reasoning Model Using Data Mining In Smart Home Environments (스마트 홈 환경에서 데이터 마이닝 기법을 이용한 지능형 서비스 추론 모델)

  • Kang, Myung-Seok;Kim, Hag-Bae
    • The Journal of Korean Institute of Communications and Information Sciences / v.32 no.12B / pp.767-778 / 2007
  • In this paper, we propose an Intelligent Service Reasoning (ISR) model using data mining in smart home environments. Our model creates a service tree for service reasoning on the basis of the C4.5 algorithm, one of the decision tree algorithms, and infers the service to be offered to users through a quantitative weight estimation algorithm that uses quantitative characteristic rules and quantitative discriminant rules. The effectiveness of the developed model is validated through a smart home-network simulation.