• Title/Abstract/Keywords: Decision-tree


A Study of Improving on Test Costs in Decision Trees

  • 석현태
    • 한국정보과학회:학술대회논문집 / 한국정보과학회 2002년도 가을 학술발표논문집 Vol.29 No.2 (1) / pp.223-225 / 2002
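  • A decision tree helps in understanding data because it presents a hierarchical view of the target data, but the limitations of greedy tree construction mean that it cannot be regarded as an optimal predictor. To compensate for this weakness, we proposed a method that applies a multidimensional association rule algorithm to a conventionally generated decision tree to obtain an optimal subset of short rules, and confirmed the result through experiments.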


A review of tree-based Bayesian methods

  • Linero, Antonio R.
    • Communications for Statistical Applications and Methods / Vol. 24, No. 6 / pp.543-559 / 2017
  • Tree-based regression and classification ensembles form a standard part of the data-science toolkit. Many commonly used methods take an algorithmic view, proposing greedy methods for constructing decision trees; examples include the classification and regression trees algorithm, boosted decision trees, and random forests. Recent history has seen a surge of interest in Bayesian techniques for constructing decision tree ensembles, with these methods frequently outperforming their algorithmic counterparts. The goal of this article is to survey the landscape surrounding Bayesian decision tree methods, and to discuss recent modeling and computational developments. We provide connections between Bayesian tree-based methods and existing machine learning techniques, and outline several recent theoretical developments establishing frequentist consistency and rates of convergence for the posterior distribution. The methodology we present is applicable for a wide variety of statistical tasks including regression, classification, modeling of count data, and many others. We illustrate the methodology on both simulated and real datasets.

A methodology for Internet Customer segmentation using Decision Trees

  • Cho, Y.B.;Kim, S.H.
    • 한국지능정보시스템학회:학술대회논문집 / 한국지능정보시스템학회 2003년도 춘계학술대회 / pp.206-213 / 2003
  • Application of existing decision tree algorithms for Internet retail customer classification is apt to construct a bushy tree due to imprecise source data. Even excessive analysis may not guarantee the effectiveness of the business although the results are derived from fully detailed segments. Thus, it is necessary to determine the appropriate number of segments with a certain level of abstraction. In this study, we developed a stopping rule that considers the total amount of information gained while generating a rule tree. In addition to forwarding from root to intermediate nodes with a certain level of abstraction, the decision tree is investigated by the backtracking pruning method with misclassification loss information.
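
The idea of stopping tree growth when too little information is gained can be sketched as follows. This is a minimal illustration, not the authors' rule: the abstract describes a rule based on the total information gained while generating the tree, whereas this sketch uses a simple per-split threshold. The names `should_stop` and `min_gain`, the threshold value, and the toy labels are all assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(parent, children):
    """Entropy reduction obtained by splitting `parent` into `children`."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

def should_stop(parent, children, min_gain=0.05):
    """Hypothetical stopping rule: stop when the split gains too little information."""
    return information_gain(parent, children) < min_gain

# Toy customer labels (hypothetical data).
parent = ["buy"] * 5 + ["skip"] * 5
good_split = [["buy"] * 5, ["skip"] * 5]                              # separates the classes
weak_split = [["buy"] * 3 + ["skip"] * 2, ["buy"] * 2 + ["skip"] * 3]  # barely helps

print(should_stop(parent, good_split))  # the split is kept
print(should_stop(parent, weak_split))  # growth stops here
```

A larger `min_gain` yields fewer, more abstract segments; a smaller one yields a bushier tree.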


Investment, Export, and Exchange Rate on Prediction of Employment with Decision Tree, Random Forest, and Gradient Boosting Machine Learning Models

  • 이재득
    • 무역학회지 / Vol. 46, No. 2 / pp.281-299 / 2021
  • This paper analyzes the feasibility of using machine learning methods to forecast employment. Decision tree and artificial neural network models, together with ensemble models such as random forest and gradient boosting regression trees, were used to forecast employment in the Busan regional economy. A comparison of their predictive abilities gave the following main findings. First, machine learning methods can forecast employment well. Second, the employment forecasts of the decision tree models varied somewhat with the depth of the trees. Third, the artificial neural network model, however, did not show high predictive power. Fourth, the ensemble models, random forest and gradient boosting regression trees, showed higher predictive power. Thus, since machine learning methods can predict employment accurately, they should be used to improve the accuracy of employment forecasts.

Decision Tree based Scheduling for Static and Dynamic Flexible Job Shops with Multiple Process Plans

  • 유재민;도형호;권용주;신정훈;김형원;남성호;이동호
    • 한국정밀공학회지 / Vol. 32, No. 1 / pp.25-37 / 2015
  • This paper suggests a decision tree based approach for flexible job shop scheduling with multiple process plans. The problem is to determine the operation/machine pairs and the sequence of the jobs assigned to each machine. Two decision tree based scheduling mechanisms are developed for static and dynamic flexible job shops. In the static case, all jobs are given in advance and the decision tree is used to select a priority dispatching rule to process all the jobs. Also, in the dynamic case, the jobs arrive over time and the decision tree, updated regularly, is used to select a priority rule in real-time according to a rescheduling strategy. The two decision tree based mechanisms were applied to a flexible job shop case with reconfigurable manufacturing cells and a conventional job shop, and the results are reported for various system performance measures.
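
Using a decision tree to choose a priority dispatching rule can be sketched roughly as below. Everything here is assumed for illustration: the state features (`utilization`, `due_date_tightness`), the thresholds, the two candidate rules, and the hand-written tree, which stands in for a tree that the paper learns and updates from data.

```python
def spt(queue):
    """Shortest processing time first: pick the job with the smallest processing time."""
    return min(queue, key=lambda job: job["processing_time"])

def edd(queue):
    """Earliest due date first: pick the job with the smallest due date."""
    return min(queue, key=lambda job: job["due_date"])

def select_rule(state):
    """Hypothetical two-level decision tree mapping shop state to a dispatching rule."""
    if state["due_date_tightness"] > 0.6:   # assumed threshold
        return edd
    if state["utilization"] > 0.8:          # assumed threshold
        return spt
    return edd

# Jobs waiting at one machine (hypothetical data).
queue = [
    {"id": "J1", "processing_time": 7, "due_date": 20},
    {"id": "J2", "processing_time": 3, "due_date": 30},
    {"id": "J3", "processing_time": 5, "due_date": 12},
]

busy_state = {"utilization": 0.9, "due_date_tightness": 0.3}
tight_state = {"utilization": 0.5, "due_date_tightness": 0.8}

next_busy = select_rule(busy_state)(queue)["id"]    # SPT is chosen
next_tight = select_rule(tight_state)(queue)["id"]  # EDD is chosen
print(next_busy, next_tight)
```

In the static case the rule would be selected once for all jobs; in the dynamic case `select_rule` would be called again at each rescheduling point with the current state.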

A study on decision tree creation using intervening variable

  • 조광현;박희창
    • Journal of the Korean Data and Information Science Society / Vol. 22, No. 4 / pp.671-678 / 2011
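  • Data mining is a set of techniques for finding useful information that is not readily apparent in vast amounts of data, and includes decision trees, association rules, cluster analysis, and neural network analysis. Among these, the decision tree algorithm charts decision rules to classify a population of interest into several subgroups or to make predictions, and is used in many fields such as customer segmentation, customer classification, and problem prediction. In general, depending on the model-building criteria and the number of input variables, the resulting decision tree model can be complex; in particular, when there are many input variables, building and interpreting the model is often difficult. This paper therefore presents a method that identifies intervening relationships among the input variables and removes the input variables that are unnecessary for tree construction, and applies it to real data to assess its efficiency.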

NPC Control Model for Defense in Soccer Game Applying the Decision Tree Learning Algorithm

  • 조달호;이용호;김진형;박소영;이대웅
    • 한국게임학회 논문지 / Vol. 11, No. 6 / pp.61-70 / 2011
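  • This paper proposes a method for controlling defensive NPCs in a soccer game using a decision tree learning algorithm. The proposed method extracts movement-direction patterns and action patterns from real game users and applies them to the decision tree learning algorithm. It then determines the NPC's movement direction and action based on the learned decision tree. In experiments, learning the decision tree took some time, but deciding a movement direction or action from the learned tree took about 0.001-0.003 ms (milliseconds), so the NPC could be controlled in real time. In addition, because the proposed method uses not only current state information but also relational information derived from it and previous state information, it achieved higher accuracy in determining movement direction than the existing method (Letia98).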

Decision Tree Techniques with Feature Reduction for Network Anomaly Detection

  • 강구홍
    • 정보보호학회논문지 / Vol. 29, No. 4 / pp.795-805 / 2019
  • Interest in network anomaly detection as a defense against unknown attacks has recently grown. A wide range of research using data mining, machine learning, and deep learning is under way to develop such techniques. This paper uses the decision tree, one of the most traditional data mining methods for classification problems, to demonstrate the feasibility of network anomaly detection on the NSL-KDD dataset. To mitigate the decision tree's tendency to over-fit, feature selection is performed with a chi-square test, and a decision tree model using the 13 selected features achieved network anomaly detection accuracies of 84% on the NSL-KDD test set KDDTest+ and 70% on KDDTest-21. Compared with the accuracies of 81% and 64% previously reported for decision tree models on these test sets, this is an improvement of about 3 and 6 percentage points, respectively.
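
Chi-square feature selection of the kind mentioned above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the paper's pipeline: the helper names `chi_square` and `select_top_k`, the two toy features, and the six toy records are hypothetical, and the sketch ranks features by the raw statistic without a significance test.

```python
from collections import Counter

def chi_square(feature_values, labels):
    """Pearson chi-square statistic for a categorical feature against class labels."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))
    feature_totals = Counter(feature_values)
    label_totals = Counter(labels)
    stat = 0.0
    for f, f_count in feature_totals.items():
        for c, c_count in label_totals.items():
            expected = f_count * c_count / n   # count expected under independence
            stat += (observed[(f, c)] - expected) ** 2 / expected
    return stat

def select_top_k(features, labels, k):
    """Return the names of the k features with the largest chi-square statistic."""
    scores = {name: chi_square(values, labels) for name, values in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy records (hypothetical data, not taken from NSL-KDD).
labels = ["attack", "attack", "attack", "normal", "normal", "normal"]
features = {
    "protocol": ["tcp", "tcp", "tcp", "udp", "udp", "udp"],  # aligned with the label
    "flag":     ["S0", "SF", "S0", "SF", "S0", "SF"],        # only weakly related
}

print(select_top_k(features, labels, k=1))
```

The selected feature names would then be the only columns passed to the decision tree learner.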

Urban Sprawl prediction in 2030 using decision tree

  • 김근한;최희선;김동범;정예림;진대용
    • 한국환경복원기술학회지 / Vol. 23, No. 6 / pp.125-135 / 2020
  • Uncontrolled urban expansion causes various social and economic problems as well as natural and environmental ones. It is therefore necessary to forecast urban expansion by identifying the factors related to it. This study aims to forecast urban expansion using a decision tree, a method widely used in various areas. The study used land-use zoning data, geographical data such as elevation and slope, the environmental conservation value assessment map, and population density data for 2006 and 2018. It extracted the new urban expansion areas by comparing the residential, industrial, and commercial zones in the 2006 and 2018 zoning, and derived a decision tree using the 2006 data as independent variables. Urban expansion in 2030 is then forecast by applying the 2018 data to the derived decision tree. The analysis confirmed that the distance from green areas, the elevation, the grade on the environmental conservation value assessment map, and the distance from industrial areas were important factors in forecasting urban expansion. In the ROC analysis performed to verify accuracy, the AUC of 0.95051 indicated excellent explanatory power. However, the urban area forecast for 2018 using the decision tree was 15,459.98 km², which differed significantly from the actual 2018 urban area of 4,144.93 km². Decision trees, which many regions use to forecast urban expansion, can therefore be useful for identifying which factors affect urban expansion, although they are not suitable for forecasting the expansion of urban regions in detail. Identifying such factors is expected to provide information that can be used in future land, urban, and environmental planning.
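
The AUC reported above measures how well a model ranks positives above negatives; it does not by itself constrain how many cells are labelled positive at a given cut-off. The sketch below illustrates that general point on toy numbers. The function name `roc_auc`, the scores, the labels, and the 0.5 cut-off are assumptions, and the abstract does not say whether this effect accounts for the study's area gap.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy grid cells (hypothetical): 1 = became urban, 0 = did not.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.7, 0.55, 0.52, 0.51, 0.3]

auc = roc_auc(scores, labels)
predicted_positive = sum(s >= 0.5 for s in scores)  # cells labelled urban at an assumed 0.5 cut-off
actual_positive = sum(labels)

print(round(auc, 3), predicted_positive, actual_positive)
```

Here the ranking is nearly perfect, yet the cut-off labels more than twice as many cells urban as actually became urban.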

Decision Tree Generation Algorithm for Image-based Video Conferencing

  • Yunsick Sung;Jeonghoon Kwak;Jong Hyuk Park
    • Journal of Internet Technology / Vol. 20, No. 5 / pp.1535-1545 / 2019
  • Recently, diverse multimedia computing applications have been developed for visual surveillance, healthcare, smart cities, and security. Video conferencing is one of the core multimedia applications. Its Quality of Service is a major issue because network capacity is limited. Video conferencing allows a large number of users to converse with each other. However, a huge number of packets is generated in transmitting and receiving the photographed images of users, so the number of packets needs to be reduced. Video conferencing can instead be conducted in virtual reality by sending only the control signals of virtual characters and, in real time, showing virtual characters that represent the users based on the received signals, rather than the users' photographed images. This paper proposes a method that determines representative photographed images by analyzing the collected photographed images of users with the KMedoids algorithm and a decision tree, and expresses the users based on the analyzed images. The decision tree used for video conferencing is generated automatically by the proposed method. Because behaviors in the decision tree are added or changed according to the photographed images, the decision tree can be regenerated by photographing the user's behavior in real time. In an experiment, 63 consecutively photographed images were collected and a decision tree was generated from their silhouette images. Indices of the silhouette images were used to express a subject, and one index was selected using the decision tree. The proposed method reduced the number of comparisons by a factor of 3.78 compared with the traditional method, which uses the correlation coefficient. Further, each user's image could be output using only the control image table and the index.