• Title/Abstract/Keywords: Decision Tree Algorithm

Search results: 452 items (processing time 0.021 s)

A Study of Improving on Test Costs in Decision Trees

  • 석현태
    • Korean Institute of Information Scientists and Engineers (KIISE): Conference Proceedings / Proceedings of the KIISE 2002 Fall Conference, Vol.29 No.2 (1) / pp.223-225 / 2002
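  • A decision tree helps in understanding data because it presents a hierarchical view of the target data, but the limits of greedy tree construction mean that it cannot be called an optimal predictor. To compensate for this weakness, we propose a method that applies a multidimensional association rule algorithm to a decision tree generated in the usual way in order to obtain an optimal partial rule set of short length, and we confirm this result experimentally.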


A Decision Tree Algorithm using Genetic Programming

  • Park, Chongsun;Ko, Young Kyong
    • Communications for Statistical Applications and Methods / Vol.10 No.3 / pp.845-857 / 2003
  • We explore the use of genetic programming to evolve decision trees directly for classification problems with both discrete and continuous predictors. We demonstrate that the hypotheses derived by standard algorithms can deviate substantially from the optimum. This deviation is partly due to their top-down procedures. The performance of the system is measured on a set of real and simulated data sets and compared with that of well-known algorithms such as CHAID, CART, C5.0, and QUEST. The proposed algorithm seems to be effective in handling problems caused by the top-down procedures of existing algorithms.

Interpretability Comparison of Popular Decision Tree Algorithms

  • 홍정식;황근성
    • Journal of Society of Korea Industrial and Systems Engineering / Vol.44 No.2 / pp.15-23 / 2021
  • Most open-source decision tree algorithms are based on three splitting criteria (Entropy, Gini Index, and Gain Ratio). The advantages and disadvantages of these three popular criteria therefore need to be studied more thoroughly. Previous comparisons were performed mainly with respect to predictive performance. In this work, we conducted a comparative experiment on the three splitting criteria, focusing on interpretability. Depth, homogeneity, coverage, lift, and stability were used as indicators of interpretability. To measure the stability of decision trees, we present measures of the stability of the root node and of the dominating rules, based on a measure of tree similarity. Using 10 data sets collected from UCI and Kaggle, we compare the interpretability of DT (Decision Tree) algorithms based on the three splitting criteria. The results show that the GR (Gain Ratio) based DT algorithm performs well in terms of lift and homogeneity, while the GINI (Gini Index) and ENT (Entropy) based DT algorithms perform well in terms of coverage. With respect to stability, whether measured by the similarity of the dominating rules or by the similarity of the root node, the DT algorithm using the ENT splitting criterion shows the best results.
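
The three splitting criteria named in this abstract can be illustrated with a minimal Python sketch. It uses the textbook definitions of entropy, Gini impurity, and gain ratio for a categorical split; the function names (`entropy`, `gini`, `split_scores`) are hypothetical and are not taken from the paper or from any particular library.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_scores(feature, labels):
    """Score a categorical split on `feature`.

    Returns (information gain, Gini decrease, gain ratio).
    Illustrative helper only; not the paper's implementation.
    """
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    parts = list(groups.values())
    weights = [len(p) / n for p in parts]

    info_gain = entropy(labels) - sum(w * entropy(p) for w, p in zip(weights, parts))
    gini_decrease = gini(labels) - sum(w * gini(p) for w, p in zip(weights, parts))
    # Split information penalizes splits with many small branches.
    split_info = -sum(w * math.log2(w) for w in weights)
    gain_ratio = info_gain / split_info if split_info > 0 else 0.0
    return info_gain, gini_decrease, gain_ratio
```

Calling `split_scores(["a", "a", "b", "b"], [0, 0, 1, 1])` scores a split that separates the two classes perfectly. Gain ratio divides information gain by the split information, which is why it tends to favor different attributes than plain entropy when an attribute has many distinct values.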

A Development for Short-term Stock Forecasting on Learning Agent System using Decision Tree Algorithm

  • 서장훈;장현수
    • Journal of the Korea Safety Management & Science / Vol.6 No.2 / pp.211-229 / 2004
  • The foundation for cyber trading has matured with innovative advances in Internet technology, and stock market investment has shifted from long-term investment, which estimates the value of enterprises, to short-term investment, which focuses on short-term trading margins. This research therefore presents a Short-term Stock Price Forecasting System built on a Learning Agent System using a DTA (Decision Tree Algorithm); it collects real-time information on issues of interest over the Internet using agent technology, forms a decision tree, and creates a rule-base database. Through this procedure, the Short-term Stock Price Forecasting System provides customers with predictions of near-future stock price fluctuations for each issue and with buy and sell points. Human analytic ability is limited, so by monitoring and analyzing stock price fluctuations, the Agent enables people to trace the external factors behind stock market fluctuations in real time. We can therefore track the ups and downs of several issues at the same time and identify the relationships among many issues using the Agent. The SPFA (Stock Price Forecasting System) has four basic phases: Data Collection, Data Processing, Learning, and Forecasting and Feedback.

Efficient Fuzzy Rule Generation Using Fuzzy Decision Tree

  • 민창우;김명원;김수광
    • Journal of the Korean Institute of Telematics and Electronics C / Vol.35C No.10 / pp.59-68 / 1998
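  • The goal of data mining is to understand data by finding useful patterns, so the patterns found must be not only accurate but also easy to understand. Research is therefore needed on data mining that extracts accurate and comprehensible patterns. This paper proposes an effective data mining algorithm using fuzzy decision trees. The proposed algorithm combines the comprehensibility of decision tree algorithms such as ID3 and C4.5 with the expressive power of fuzzy logic to generate concise, easily understood rules. It consists of two steps: generating fuzzy membership functions based on histograms, and constructing a fuzzy decision tree using the generated membership functions. To validate the proposed method, we present experimental results on the Iris data and the Wisconsin Breast Cancer data, which are standard pattern classification benchmarks.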


A customer credit Prediction Researched to Improve Credit Stability based on Artificial Intelligence

  • MUN, Ji-Hui;JUNG, Sang Woo
    • Korean Journal of Artificial Intelligence / Vol.9 No.1 / pp.21-27 / 2021
  • Since the 1990s, Korea's credit card industry has developed steadily. As a result, various problems have arisen, such as careless customer information management and loans to low-credit customers. These in turn led to a high delinquency rate across the card industry and had a negative impact on the economy. Therefore, in this paper, based on Azure, we analyze and predict the delinquency and delinquency periods of credit loans according to gender, car ownership, property, number of children, education level, marital status, and employment status, using linear regression analysis and a boosted decision tree algorithm. These predictions can reduce the likelihood of reckless credit lending and credit card issuance, reducing the number of bad creditors and the risk to banks. In addition, after the customer base is segmented according to the predicted results, the predictions can serve as a basis for reducing credit loan risk by developing a credit product suited to each customer. The Azure predictions showed that, of Linear Regression and the Boosted Decision Tree algorithm, the Boosted Decision Tree algorithm predicted more accurately. In future work, we intend to increase the accuracy of the analysis by assigning a number to each data item and predicting again.

Decision Tree Generation Algorithm for Image-based Video Conferencing

  • Yunsick Sung;Jeonghoon Kwak;Jong Hyuk Park
    • Journal of Internet Technology / Vol.20 No.5 / pp.1535-1545 / 2019
  • Recently, diverse multimedia computing applications have been developed for visual surveillance, healthcare, smart cities, and security. Video conferencing is one of the core multimedia applications. The Quality of Service of video conferencing is a major issue because network traffic is limited. Video conferencing allows a large number of users to converse with each other. However, a huge number of packets is generated in the process of transmitting and receiving the photographed images of users. Therefore, the number of packets in video conferencing needs to be reduced. Video conferencing can be conducted in virtual reality by sending only the control signals of virtual characters and showing virtual characters based on the received signals to represent the users in real time, instead of sending the photographed images of the users. This paper proposes a method that determines representative photographed images by analyzing the collected photographed images of users, using the KMedoids algorithm and a decision tree, and expresses the users based on the analyzed images. The decision tree used for video conferencing is generated automatically by the proposed method. Given that the behaviors in the decision tree are added or changed according to the photographed images, it is possible to reproduce the decision tree by photographing the behavior of the user in real time. In the experiment, 63 consecutively photographed images were collected and a decision tree was generated from the silhouette images of the photographed images. Indices of the silhouette images were used to express a subject, and one index was selected using the decision tree. The proposed method reduced the number of comparisons by a factor of 3.78 compared with the traditional method that uses the correlation coefficient. Further, each user's image could be output by using only the control image table of the image and the index.

Adaptive Decision Algorithm for an Improvement of RFID Anti-Collision

  • 고영은;오경욱;방성일
    • Journal of the Institute of Electronics Engineers of Korea TC / Vol.44 No.4 / pp.1-9 / 2007
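  • This paper studies an Adaptive Decision algorithm for RFID tag anti-collision. To this end, we first compare and analyze the existing RFID tag anti-collision schemes: ALOHA-based schemes and binary-search-based anti-collision algorithms. The existing algorithms are limited in how far they can reduce the number of searches and the amount of data transmitted to identify tags. To distinguish the tags within the read range, the proposed Adaptive Decision algorithm counts the number of '1's at each ID bit position across all tags that responded to the query, and identifies the tags in the smaller group first. The count of '1's for each tag ID bit is stored in the reader's memory, and the '1' counts of each identified tag's ID bits are subtracted from it. By repeating this process, all tags within the read range are identified. The active tag selection criterion and the simple addition and subtraction procedure proposed in the paper reduce unnecessary searches. Performance is evaluated by the number of reader iterations and the amount of transmitted data needed to identify the tags. The evaluation shows that, compared with the existing algorithms, the Adaptive Decision algorithm reduces the number of iterations by 16.8%, and the amount of transmitted data also decreases by a factor of ¼.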
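
The counting and subtraction steps described in this abstract can be sketched as a small Python simulation. This is one possible reading of the abstract, not the authors' protocol: it assumes unique, equal-length tag IDs given as bit strings, it assumes that "smaller group" means the smaller of the '1' and '0' groups at each bit position, and the names `bit_counts` and `identify_all` are hypothetical. A real reader would not see the full ID list up front.

```python
def bit_counts(tags):
    """Count the '1's at each ID bit position across all responding tags."""
    width = len(tags[0])
    return [sum(t[i] == "1" for t in tags) for i in range(width)]


def identify_all(tags):
    """Simulate identification, preferring the smaller group at each bit.

    Assumption: IDs are unique and of equal length. Returns the order in
    which tags were identified and the final per-bit counters.
    """
    remaining = list(tags)
    counts = bit_counts(remaining)  # per-bit '1' counts kept in reader memory
    order = []
    while remaining:
        group = remaining
        for i in range(len(counts)):
            ones = [t for t in group if t[i] == "1"]
            zeros = [t for t in group if t[i] == "0"]
            if ones and zeros:
                # Narrow to whichever group has fewer tags at this bit.
                group = ones if len(ones) <= len(zeros) else zeros
            if len(group) == 1:
                break
        tag = group[0]
        order.append(tag)
        remaining.remove(tag)
        # Subtract the identified tag's '1' bits from the stored counts.
        counts = [c - (b == "1") for c, b in zip(counts, tag)]
    return order, counts
```

For example, `identify_all(["0110", "0101", "1100"])` identifies every tag once, and the stored counters return to zero after the last subtraction, which is the termination condition implied by the abstract.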

Industrial Waste Database Analysis Using Data Mining Techniques

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / Vol.17 No.2 / pp.455-465 / 2006
  • Data mining is a method for finding useful information in large amounts of data in a database. It is used to discover hidden knowledge, unexpected patterns, and new rules from massive data. Data mining methods include decision trees, association rules, clustering, neural networks, and so on. We analyze an industrial waste database using data mining techniques. We use the k-means algorithm for clustering, the C5.0 algorithm for decision trees, and the Apriori algorithm for association rules. These outputs can be used for environmental preservation and environmental improvement.


Industrial Waste Database Analysis Using Data Mining

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Korean Data and Information Science Society: Conference Proceedings / Korean Data and Information Science Society, 2006 Proceedings of Joint Conference of KDISS and KDAS / pp.241-251 / 2006
  • Data mining is a method for finding useful information in large amounts of data in a database. It is used to discover hidden knowledge, unexpected patterns, and new rules from massive data. Data mining methods include decision trees, association rules, clustering, neural networks, and so on. We analyze an industrial waste database using data mining techniques. We use the k-means algorithm for clustering, the C5.0 algorithm for decision trees, and the Apriori algorithm for association rules. These analysis outputs can be used for environmental preservation and environmental improvement.
