• Title/Summary/Keyword: decision tree model (의사결정나무 모형)

A Comparison of Predicting Movie Success between Artificial Neural Network and Decision Tree (기계학습 기반의 영화흥행예측 방법 비교: 인공신경망과 의사결정나무를 중심으로)

  • Kwon, Shin-Hye;Park, Kyung-Woo;Chang, Byeng-Hee
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.4 / pp.593-601 / 2017
  • In this paper, we constructed production/investment, distribution, and screening models using the variables that can be considered at each stage of the movie industry's value chain. To increase the predictive power of the models, regression analysis was used to derive meaningful variables. Based on the given variables, we compared the predictive power of an artificial neural network (ANN), a machine learning method, with that of a decision tree. When all variables were included, the ANN was more accurate than the decision tree in the production/investment and distribution models. However, the decision tree was more accurate when only the variables selected by the regression analysis were used. In the screening model, the ANN was more accurate than the decision tree regardless of whether the regression results were reflected. The implication of this paper is that it attempts to improve the performance of movie prediction models by using machine learning. In addition, we tried to overcome a limitation of the linear approach by reflecting the results of the regression analysis in the ANN and decision tree models.

Analysis of Personal Communications Service Churners Using Decision Trees (의사결정나무를 이용한 개인휴대통신 해지자 분석)

  • 최종후;서두성
    • Proceedings of the Korean Operations and Management Science Society Conference / 1998.10a / pp.377-380 / 1998
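  • In this paper, we analyze personal communications service subscribers who cancel their service, using decision tree analysis, which has recently been actively introduced as a data mining tool. We also use a logistic regression model to score each subscriber's likelihood of cancellation.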

A study on the comparison of descriptive variables reduction methods in decision tree induction: A case of prediction models of pension insurance in life insurance company (생명보험사의 개인연금 보험예측 사례를 통해서 본 의사결정나무 분석의 설명변수 축소에 관한 비교 연구)

  • Lee, Yong-Goo;Hur, Joon
    • Journal of the Korean Data and Information Science Society / v.20 no.1 / pp.179-190 / 2009
  • In the financial industry, the decision tree algorithm has been widely used for classification analysis. One of the major difficulties in this setting is the large number of explanatory variables that must be considered in modeling. An effective method is therefore needed to reduce the number of explanatory variables without seriously affecting the modeling results. In this research, we compare various variable reduction methods and identify the best one based on the modeling accuracy of the tree algorithm. We applied the methods to the pension insurance data of an insurance company to obtain empirical results. We found that selecting variables by the sensitivity analysis of a neural network is the most effective way to reduce the number of variables while maintaining accuracy.

Comparative Analysis of Predictors of Depression for Residents in a Metropolitan City using Logistic Regression and Decision Making Tree (로지스틱 회귀분석과 의사결정나무 분석을 이용한 일 대도시 주민의 우울 예측요인 비교 연구)

  • Kim, Soo-Jin;Kim, Bo-Young
    • The Journal of the Korea Contents Association / v.13 no.12 / pp.829-839 / 2013
  • This descriptive study predicts and compares the factors of depression affecting residents of a metropolitan city using logistic regression analysis and decision tree analysis. The subjects were 462 residents ($20 \leq \text{age} < 65$) of a metropolitan city. Data were collected between October 7, 2011 and October 21, 2011 and analyzed with frequency analysis, percentages, means and standard deviations, the ${\chi}^2$-test, the t-test, logistic regression analysis, the ROC curve, and a decision tree, using SPSS 18.0. The common predictors of depression in community residents were social dysfunction, perceived physical symptoms, and family support. The specificity and sensitivity of the logistic regression were 93.8% and 42.5%, respectively. The receiver operating characteristic (ROC) curve was used to determine an optimal model. The AUC (area under the curve) was .84, and the ROC curve was statistically significant (p < .001). The specificity and sensitivity of the decision tree analysis were 98.3% and 20.8%, respectively. The overall classification accuracy was 82.0% for the logistic regression and 80.5% for the decision tree analysis. These results suggest that the logistic regression analysis, which showed higher sensitivity and classification accuracy, may be useful for establishing a depression prediction model for community residents.
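
The sensitivity, specificity, and overall accuracy quoted in the abstract above are all derived from a 2x2 confusion matrix. The following is a minimal Python sketch of those standard definitions. The function name and the counts are hypothetical and are not the study's data.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Return sensitivity, specificity, and accuracy from confusion-matrix counts.

    tp/fn: actual positives classified correctly/incorrectly
    tn/fp: actual negatives classified correctly/incorrectly
    Assumes at least one actual positive and one actual negative.
    """
    return {
        "sensitivity": tp / (tp + fn),               # true positive rate
        "specificity": tn / (tn + fp),               # true negative rate
        "accuracy": (tp + tn) / (tp + fn + tn + fp), # share of all cases correct
    }


# Hypothetical counts (assumption, for illustration only)
metrics = classification_metrics(tp=20, fn=30, tn=190, fp=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With an uncommon positive class, as in these hypothetical counts, a classifier can combine high specificity and accuracy with low sensitivity, which is why the abstract reports all three figures rather than accuracy alone.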

The impact of the change in the splitting method of decision trees on the prediction power (의사결정나무의 분기법 변화가 예측력에 미치는 영향)

  • Chang, Youngjae
    • The Korean Journal of Applied Statistics / v.35 no.4 / pp.517-525 / 2022
  • In the era of big data, various data mining techniques have been proposed as major analysis methodologies. As complex and diverse data are mass-produced, data mining techniques have attracted attention as a foundation of data science. In this paper, we focus on the decision tree, a representative data mining method that is frequently used in practice and easy to understand. Specifically, we analyze the effect of the splitting method of decision trees on model performance. We compare the prediction power and structure of decision tree models with different split methods on various simulated data. The results show that the linear combination split method can improve the prediction accuracy of decision trees for data simulated from nonlinear models with complex structure.
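
To make the distinction in the abstract above concrete, the sketch below contrasts an axis-parallel split (a threshold on one variable) with a linear combination split (a threshold on a weighted sum of variables), scoring each by weighted Gini impurity. It is a minimal illustration under stated assumptions: the toy data, weights, thresholds, and function names are hypothetical, and Gini impurity is a common splitting criterion that is not necessarily the one used in the paper.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(points, labels, weights, threshold):
    """Weighted Gini impurity after sending samples with weights . x <= threshold left."""
    left, right = [], []
    for x, y in zip(points, labels):
        score = sum(w * xi for w, xi in zip(weights, x))
        (left if score <= threshold else right).append(y)
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Hypothetical toy data: class 1 whenever x1 + x2 > 1 (a diagonal boundary)
points = [(0.1, 0.2), (0.4, 0.3), (0.2, 0.7), (0.7, 0.1),
          (0.9, 0.4), (0.3, 0.9), (0.8, 0.8), (0.6, 0.6)]
labels = [1 if x1 + x2 > 1 else 0 for x1, x2 in points]

# Axis-parallel split: uses x1 only (weight 0 on x2)
axis_parallel = split_impurity(points, labels, weights=(1, 0), threshold=0.5)
# Linear combination split: uses x1 + x2
linear_combo = split_impurity(points, labels, weights=(1, 1), threshold=1.0)

print(axis_parallel, linear_combo)
```

On this toy data the single linear combination split separates the classes completely, while the axis-parallel split on x1 leaves both child nodes mixed and would need further splits.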

An Application of Data-Mining Tool in Fraud Pension Payment Prediction (데이터마이닝을 이용한 국민연금 부정수급 예측모형 개발 - 손해배상금 불성실 신고를 대상으로 -)

  • Cha, Kyung-Yup
    • Communications for Statistical Applications and Methods / v.17 no.1 / pp.1-8 / 2010
  • This study tested the applicability of a data mining tool to the analysis of massive National Pension data in order to develop a prediction model for fraudulent pension payments. It identified significant variables for fraudulent pension payments through statistical analysis and developed prediction models using data mining methodology.

A Development of a Tailored Follow up Management Model Using the Data Mining Technique on Hypertension (데이터마이닝 기법을 활용한 맞춤형 고혈압 사후관리 모형 개발)

  • Park, Il-Su;Yong, Wang-Sik;Kim, Yu-Mi;Kang, Sung-Hong;Han, Jun-Tae
    • The Korean Journal of Applied Statistics / v.21 no.4 / pp.639-647 / 2008
  • This study applied knowledge discovery and data mining algorithms to the Korea National Health Insurance Corporation database (the insureds' screening and health care benefit data) to develop a tailored hypertension follow-up management model, consisting of a hypertension care predictive model and a hypertension care compliance segmentation model. The predictive power of the data mining algorithms was validated by comparing the performance of logistic regression, a decision tree, and an ensemble technique. Internal and external validation showed that logistic regression performed best of the three techniques for the hypertension care predictive model, while the hypertension care compliance segmentation model was developed by decision tree analysis. The study identified several factors affecting the onset of hypertension from the screening data. By providing representative results on the onset and care of hypertension, it is expected to contribute to building a national hypertension follow-up management system in the near future.

A Case Study on segmentation of Department Store using Decision Tree Analysis (의사결정나무 기법을 활용한 백화점의 고객세분화 사례연구)

  • Chae, Kyung-Hee;Kim, Sang-Cheol
    • Journal of Distribution Science / v.8 no.1 / pp.13-19 / 2010
  • Segmentation, targeting, and positioning are marketing tools that a company uses to gain competitive advantage in the market. Various statistical models and data mining techniques are used for accurate segmentation. Data mining techniques in particular were introduced in the early 1980s and have solved several marketing problems effectively. In this paper, we study data mining techniques for segmentation and analyze the customer transaction data of a department store using decision tree analysis, one such technique. We then discuss the effects and advantages of segmentation using decision trees.

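A Machine Learning-Based Study on Predicting Administrative Issue Designation in the KOSDAQ Market (머신러닝 기반 KOSDAQ 시장의 관리종목 지정 예측 연구)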

  • Yun, Yang-Hyeon;Kim, Tae-Gyeong;Kim, Su-Yeong;Park, Yong-Gyun
    • 한국벤처창업학회:학술대회논문집 / 2021.11a / pp.185-187 / 2021
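  • The administrative issue designation system is a market regulation that warns of financial distress among listed companies, giving the companies an opportunity to recover and alerting investors to investment risk. This study investigates the prediction of administrative issue designation using a sample of financial data from companies designated as administrative issues and companies not so designated. The methods used were logistic regression, decision tree, support vector machine, soft voting, random forest, and LightGBM. LightGBM was the best predictive model, with a classification accuracy of 82.73%, and the decision tree had the lowest classification accuracy at 71.94%. In general, ensemble-based learning models showed higher predictive performance than single learning models.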
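
Soft voting, one of the ensemble methods listed in the abstract above, averages the class probabilities predicted by several models and picks the class with the highest mean. The sketch below illustrates the idea in plain Python. The function names and the probabilities are hypothetical and do not reproduce the study's models or data.

```python
from collections import Counter


def soft_vote(model_probs):
    """Average per-class probabilities across models; return (label, averages)."""
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    averages = [sum(p[c] for p in model_probs) / n_models for c in range(n_classes)]
    label = max(range(n_classes), key=averages.__getitem__)
    return label, averages


def hard_vote(model_probs):
    """Majority vote over each model's most probable class, for comparison."""
    votes = [max(range(len(p)), key=p.__getitem__) for p in model_probs]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical probabilities for one company from three models:
# [P(not designated), P(designated)]
model_probs = [
    [0.90, 0.10],  # model A is confident in class 0
    [0.45, 0.55],  # model B leans slightly toward class 1
    [0.40, 0.60],  # model C leans slightly toward class 1
]

soft_label, averages = soft_vote(model_probs)
hard_label = hard_vote(model_probs)
print(soft_label, hard_label, averages)
```

In this example the two schemes disagree: two of the three models favor class 1, but soft voting selects class 0 because it weighs how confident each model is.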

Evaluations of predicted models fitted for data mining - comparisons of classification accuracy and training time for 4 algorithms (데이터마이닝기법상에서 적합된 예측모형의 평가 -4개분류예측모형의 오분류율 및 훈련시간 비교평가 중심으로)

  • Lee, Sang-Bock
    • Journal of the Korean Data and Information Science Society / v.12 no.2 / pp.113-124 / 2001
  • CHAID, logistic regression, bagging trees, and bagging trees are compared on HMEQ, an artificial SAS data set, in terms of classification accuracy and training time. In terms of error rate, bagging trees perform best, although their run time is slower than that of the other models. Logistic regression has the best run time among the given models, but no model is uniformly best on both criteria.
