• Title/Summary/Keyword: 의사결정나무알고리즘 (decision tree algorithm)

Evaluations of predicted models fitted for data mining - comparisons of classification accuracy and training time for 4 algorithms (데이터마이닝기법상에서 적합된 예측모형의 평가 -4개분류예측모형의 오분류율 및 훈련시간 비교평가 중심으로)

  • Lee, Sang-Bock
    • Journal of the Korean Data and Information Science Society / v.12 no.2 / pp.113-124 / 2001
  • CHAID, logistic regression, bagging trees, and bagging trees are compared on the SAS artificial data set HMEQ in terms of classification accuracy and training time. Bagging trees perform best on error rate, although their run time is slower than that of the other models. Logistic regression has the best run time among the models considered, but no model is uniformly efficient on both criteria.

A Study on the Data Fusion Method using Decision Rule for Data Enrichment (의사결정 규칙을 이용한 데이터 통합에 관한 연구)

  • Kim S.Y.;Chung S.S.
    • The Korean Journal of Applied Statistics / v.19 no.2 / pp.291-303 / 2006
  • Data mining is the work of extracting information from existing data files, so one of the most important factors in the data mining process is the quality of the data used. In this paper, we propose a data fusion technique using decision rules for data enrichment, which is one phase of improving data quality in the KDD process. Simulations were performed to compare the proposed data fusion technique with existing techniques. The results show that the proposed technique yields a low MSE or misclassification rate for the fusion variables.

Forecasting Export & Import Container Cargoes using a Decision Tree Analysis (의사결정나무분석을 이용한 컨테이너 수출입 물동량 예측)

  • Son, Yongjung;Kim, Hyunduk
    • Journal of Korea Port Economic Association / v.28 no.4 / pp.193-207 / 2012
  • The purpose of this study is to predict export and import container volumes using decision tree analysis. Factors that can influence the volume of container cargo are selected as independent variables: producer price index, consumer price index, index of export volume, index of import volume, index of industrial production, and exchange rate (won/dollar). The analysis covers monthly data from January 2002 to December 2011. The CRT (Classification and Regression Trees) algorithm is used. The main findings are as follows. First, when the index of export volume is larger than 152.35, the predicted monthly export volume is 858,19TEU, whereas when the index of export volume is between 115.90 and 152.35, the predicted monthly export volume is 716,582TEU. Second, when the index of import volume is larger than 134.60, the predicted monthly import volume is 869,227TEU, whereas when the index of import volume is between 116.20 and 134.60, the predicted monthly import volume is 738,724TEU.
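
The import-side finding in this abstract reads as two leaf rules of a regression tree, which can be sketched as a small lookup. This is only an illustration under stated assumptions: the function name `predict_import_teu` is hypothetical, both thresholds are assumed to apply to the index of import volume, the handling of the exact boundary values is a guess, and the abstract gives no prediction below 116.20, so the sketch returns `None` there.

```python
def predict_import_teu(import_volume_index):
    """Return the predicted monthly import volume in TEU, or None when
    the abstract reports no rule for the given index value."""
    # Rule 1: index of import volume larger than 134.60.
    if import_volume_index > 134.60:
        return 869_227
    # Rule 2: index between 116.20 and 134.60 (inclusive bounds are an assumption).
    if 116.20 <= import_volume_index <= 134.60:
        return 738_724
    # No rule is reported in the abstract for lower index values.
    return None


if __name__ == "__main__":
    for index in (140.0, 120.0, 100.0):
        print(index, predict_import_teu(index))
```

A fitted CRT model would contain further splits on the other independent variables; only the two branches quoted in the abstract are encoded here.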

Inflow and outflow analysis of double majors using social network analysis (사회 연결망 분석을 이용한 복수전공 유입 및 유출 분석)

  • Cho, Jang-Sik
    • Journal of the Korean Data and Information Science Society / v.23 no.4 / pp.693-701 / 2012
  • Recently, the number of students taking double majors has tended to increase in many universities. As a result, problems arise because an excessive inflow of double-major students is concentrated in specific popular departments. In this paper, we study the characteristics of the inflow and outflow of double majors using social network analysis and decision tree analysis. According to the results, SAT score affected the inflow of double majors the most, followed by department category, course evaluation score, and employment rate, in that order. On the other hand, department category affected the outflow of double majors the most, followed by SAT score, employment rate, and course evaluation score, in that order.

A recommendation system for assisting devices in long-term care insurance (의사결정나무기법을 활용한 장기요양 복지용구 권고모형 개발)

  • Han, Eun-Jeong;Park, Sanghee;Lee, JungSuk;Kim, Dong-Geon
    • The Korean Journal of Applied Statistics / v.31 no.6 / pp.693-706 / 2018
  • It is very important to support ageing in place for elderly people with disabilities. Assisting devices can help them live independently in their community; however, the devices have to be used appropriately to meet care needs. This study develops an assisting device recommendation system for the beneficiaries of long-term care insurance that includes algorithms to decide the most appropriate type of assisting device for each beneficiary. We used long-term care (LTC) insurance data for grade assessment covering 8,084 beneficiaries from July 2015 to June 2016. In addition, we collected standard care plans for assisting devices made by power-assessors, considering their performance and ability, which could subsequently be matched with the grade assessment data. We used a decision tree, a data mining technique, to develop the model. Finally, we developed 15 algorithms for recommending assisting devices. The findings might be useful in evidence-based care planning for assisting devices and can contribute to enhancing independence and safety in LTC.

Design and implementation of data mining tool using PHP and WEKA (피에이치피와 웨카를 이용한 데이터마이닝 도구의 설계 및 구현)

  • You, Young-Jae;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.20 no.2 / pp.425-433 / 2009
  • Data mining is a method for finding useful information in the large amounts of data held in a database. It is used to discover hidden knowledge, unexpected patterns, relations, and new rules from massive data. A data mining tool is needed to explore this much information. There are many data mining tools and solutions, such as E-Miner, Clementine, WEKA, and R. Most of them focus on diversity and general-purpose use, and they are not easy for laymen to use. In this paper we design and implement a web-based data mining tool using PHP and WEKA. Its results are easy to interpret, so general users are able to handle it. We implement the Apriori algorithm for association rules, the K-means algorithm for cluster analysis, and the J48 algorithm for decision trees.
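
For readers unfamiliar with the first of the three algorithms named in this abstract, the following is a minimal plain-Python sketch of level-wise Apriori frequent-itemset mining. It is not the paper's PHP/WEKA implementation; the function name `apriori`, the transaction data, and the support threshold are all made up for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset whose support
    (fraction of transactions containing it) is >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    items = {item for t in transactions for item in t}
    # Level 1: frequent single items.
    current = {frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support}
    frequent = {}
    size = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        size += 1
        # Join step: merge frequent itemsets into candidates one item larger.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == size}
        # Prune step: every (size - 1)-subset must already be frequent.
        current = {
            c for c in candidates
            if all(frozenset(sub) in frequent
                   for sub in combinations(c, size - 1))
            and support(c) >= min_support
        }
    return frequent


if __name__ == "__main__":
    # Hypothetical market-basket data.
    baskets = [
        {"milk", "bread"},
        {"milk", "bread", "eggs"},
        {"bread", "eggs"},
        {"milk", "eggs"},
    ]
    for itemset, sup in sorted(apriori(baskets, 0.5).items(),
                               key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), sup)
```

Association rules are then derived from the frequent itemsets by comparing the support of an itemset with that of its subsets; that step is omitted here.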

Factors affecting success and failure of Internet company business model using inductive learning based on ID3 algorithm (ID3 알고리즘 기반의 귀납적 추론을 활용한 인터넷 기업 비즈니스 모델의 성공과 실패에 영향을 미치는 요인에 관한 연구)

  • Jin, Dong-su
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.2 / pp.111-116 / 2019
  • New technologies such as the IoT, Big Data, and Artificial Intelligence, following on from the Web, mobile, and smart devices, enable business models that did not exist before, and various types of Internet companies based on these business models have emerged. In this research, we examine the factors that influence the success and failure of Internet companies. To do this, we review recent studies on business models and examine the variables affecting the success of Internet companies in terms of network effect, user interface, cooperation with actors, and creating value for users. Using the five derived variables, we select 14 Internet companies that succeeded or failed in seven commercial business model categories. We derive a decision tree by applying inductive learning based on the ID3 algorithm to the analysis results, and from the derived decision tree we derive rules that affect success and failure. With these rules, we present strategic implications for actors seeking to succeed as Internet companies.
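
ID3 grows a tree by repeatedly splitting on the attribute with the highest information gain. The sketch below shows that selection step only, assuming categorical attributes. The attribute names echo two variables from the abstract, but the four example rows, their values, and the function names are hypothetical and are not the paper's data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting `rows` on `attribute`."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def best_attribute(rows, attributes, target):
    """ID3 split choice: the attribute with the largest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical toy data, for illustration only.
COMPANIES = [
    {"network_effect": "high", "user_interface": "good", "outcome": "success"},
    {"network_effect": "high", "user_interface": "poor", "outcome": "success"},
    {"network_effect": "low", "user_interface": "good", "outcome": "failure"},
    {"network_effect": "low", "user_interface": "poor", "outcome": "failure"},
]

if __name__ == "__main__":
    print(best_attribute(COMPANIES, ["network_effect", "user_interface"], "outcome"))
```

A full ID3 implementation would recurse on each subset until the labels are pure or no attributes remain, and the root-to-leaf paths of the resulting tree give the success and failure rules.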

Enhancing Workers' Job Tenure Using Directions Derived from Data Mining Techniques (데이터 마이닝 기법을 활용한 근로자의 고용유지 강화 방안 개발)

  • An, Minuk;Kim, Taeun;Yoo, Donghee
    • The Journal of the Korea Contents Association / v.18 no.5 / pp.265-279 / 2018
  • This study conducted an experiment using data mining techniques to develop prediction models of worker job turnover. The experiment used data from the '2015 Graduate Occupational Mobility Survey' by the Korea Employment Information Service. We developed the prediction models using a decision tree, Bayes net, and artificial neural network. We found that the decision tree-based prediction model reported the best accuracy. We also found that the six influential factors affecting employees' turnover intention are type of working time, job status, full-time or not full-time, regular working hours per week, regular working days per week, and personal development opportunities. From the decision tree-based prediction model, we derived 12 rules of employee turnover for all job types. Using the derived rules, we proposed helpful directions for enhancing workers' job tenure. In addition, we analyzed the influential factors affecting employees' job turnover intention according to four job types and derived rules for each: office (ten rules), culture and art (nine rules), construction (four rules), and information technology (six rules). Using the derived rules, we proposed customized directions for improving the job tenure for each group.

Building a Model for Estimate the Soil Organic Carbon Using Decision Tree Algorithm (의사결정나무를 이용한 토양유기탄소 추정 모델 제작)

  • Yoo, Su-Hong;Heo, Joon;Jung, Jae-Hoon;Han, Su-Hee
    • Journal of Korean Society for Geospatial Information Science / v.18 no.3 / pp.29-35 / 2010
  • Soil organic carbon (SOC), which aids forest formation and the control of atmospheric carbon dioxide, is an important factor influencing global warming. Excavating samples across an entire area is a very inefficient way to determine the distribution of SOC, so a suitable model for estimating the relative amount of SOC is needed. In the present study, a model based on a decision tree algorithm is introduced to estimate the amount of SOC while assessing influencing factors such as altitude, aspect, slope, and type of trees. The model was applied to a real site and validated by 10-fold cross-validation using two software packages, See 5 and Weka. The See 5 results indicate that the amount of SOC in surface layers is strongly related to the type of trees, while in middle-depth layers it is dominated by both type of trees and altitude. The estimation accuracy was 70.8% in surface layers and 64.7% in middle-depth layers. Weka gave a similar result for surface layers, but in middle-depth layers aspect was also found to be a meaningful factor along with type of trees and altitude. Its estimation accuracy was 68.87% and 60.65% in surface and middle-depth layers, respectively. The tests suggest that the model is useful for estimating SOC amounts and for producing SOC maps over wide areas.
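
The 10-fold cross-validation mentioned in this abstract can be sketched in plain Python as follows. This is a generic illustration, not the procedure See 5 or Weka runs internally: the function names, the `fit`/`predict` callables, and the fixed shuffle seed are assumptions, and the sketch expects at least `k` samples so that no test fold is empty.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.
    Each sample appears in exactly one test fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validated_accuracy(samples, labels, fit, predict, k=10):
    """Average test-fold accuracy of a classifier given as two callables:
    fit(train_samples, train_labels) -> model, predict(model, sample) -> label."""
    scores = []
    for train, test in k_fold_indices(len(samples), k):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(model, samples[i]) == labels[i] for i in test)
        scores.append(hits / len(test))
    return sum(scores) / len(scores)
```

Any decision tree learner can be plugged in through `fit` and `predict`; the accuracies quoted in the abstract are averages of this kind over the ten held-out folds.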

Ananlyzing Customer Management Data by Datamining (Focused on Apartment Customer Classification) (데이터마이닝을 통한 고객관리데이터의 분석 (아파트고객 세분화를 중심으로))

  • Baek, Shin Jung
    • Proceedings of the Korea Information Processing Society Conference / 2004.05a / pp.69-72 / 2004
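  • As competition among companies intensifies and awareness of the importance of information grows, CRM data mining, which extracts valuable data from large volumes of data, has become a matter of great interest. Among the many application areas of data mining, this study focuses on customer segmentation: it compares logistic regression, decision trees, and neural network algorithms, data mining techniques that have recently been widely used for this purpose, and validates them using data from actual apartment customers. The purpose of the study is therefore to present criteria for selecting a technique and for comparative evaluation when performing data mining for apartment customer segmentation.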
