• Title/Abstract/Keywords: decision-tree analysis

Search results: 721 items (processing time: 0.034 s)

의사결정나무를 활용한 신경망 모형의 입력특성 선택: 주택가격 추정 사례 (Decision Tree-Based Feature-Selective Neural Network Model: Case of House Price Estimation)

  • 윤한성
    • 디지털산업정보학회논문지 / Vol. 19, No. 1 / pp.109-118 / 2023
  • Data-based analysis methods are increasingly used to estimate or predict housing prices, and neural network models and decision trees from the big data field are also increasingly common. Neural network models are often judged superior to conventional statistical models in estimation or prediction accuracy. However, determining the input layer of a neural network model, that is, the type and number of input features, is ambiguous, and decision trees are sometimes used to address this weakness. In this paper, we evaluate existing ways of using decision trees and propose using decision trees to prioritize input features for selection in neural network models. The approach can serve as a complementary or combined analysis method for neural networks and decision trees, and its validity was confirmed by applying it to house price estimation. Several comparisons indicate that selecting appropriate input features according to priority can increase the estimation power of the model.
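
The abstract does not spell out how the tree-based priority is computed, so the following is only a minimal sketch of one plausible reading: rank each candidate feature by the largest squared-error reduction a single regression-tree split on it can achieve, then offer the features to the neural network in that order. The function names, the toy data, and the single-split criterion are all assumptions made for illustration, not the paper's method.

```python
from statistics import fmean


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = fmean(values)
    return sum((v - mean) ** 2 for v in values)


def best_split_gain(xs, ys):
    """Largest SSE reduction from one threshold split on a single feature."""
    pairs = sorted(zip(xs, ys))
    total = sse([y for _, y in pairs])
    best = 0.0
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        best = max(best, total - sse(left) - sse(right))
    return best


def rank_features(rows, targets, names):
    """Order feature names from most to least useful for a first split."""
    gains = {
        name: best_split_gain([row[j] for row in rows], targets)
        for j, name in enumerate(names)
    }
    return sorted(names, key=lambda name: gains[name], reverse=True)


# Hypothetical toy data: (floor area in m2, building age in years) -> price.
rows = [(50, 10), (60, 3), (100, 8), (110, 1)]
prices = [100, 110, 200, 210]
priority = rank_features(rows, prices, ["area_m2", "age_years"])
print(priority)  # features to try first as neural network inputs
```

A full decision tree would aggregate gains over many splits; the single-split version is used here only to keep the idea of a priority ordering visible.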

매개 변수를 이용한 의사결정나무 생성에 관한 연구 (A study on decision tree creation using intervening variable)

  • 조광현;박희창
    • Journal of the Korean Data and Information Science Society / Vol. 22, No. 4 / pp.671-678 / 2011
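  • Data mining is a technique for finding useful information that is not readily apparent in large volumes of data; its methods include decision trees, association rules, cluster analysis, and neural network analysis. Among these, the decision tree algorithm charts decision rules to classify a population of interest into several subgroups or to make predictions, and it is widely used in areas such as customer segmentation, customer classification, and problem prediction. In general, when a decision tree model is built, a complex model may result depending on the model-building criteria and the number of input variables; in particular, when there are many input variables, building and interpreting the model is often difficult. This paper therefore presents a method that identifies intervening relationships among the input variables and removes those that are unnecessary for growing the tree, and applies it to real data to assess its efficiency.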

Development of Discriminant Analysis System by Graphical User Interface of Visual Basic

  • Lee, Yong-Kyun;Shin, Young-Jae;Cha, Kyung-Joon
    • Journal of the Korean Data and Information Science Society / Vol. 18, No. 2 / pp.447-456 / 2007
  • Recently, multivariate statistical analysis has been used to extract meaningful information from various kinds of data. In this paper, we develop a multivariate statistical analysis system that combines Fisher discriminant analysis, logistic regression, neural networks, and decision trees, using Visual Basic 6.0.

CAE와 Decision-tree를 이용한 사출성형 공정개선에 관한 연구 (A Study on the Improvement of Injection Molding Process Using CAE and Decision-tree)

  • 황순환;한성렬;이후진
    • 한국산학기술학회논문지 / Vol. 22, No. 4 / pp.580-586 / 2021
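  • Numerical analysis using CAE (Computer Aided Engineering) is currently the dominant Computer Aided Testing (CAT) methodology in injection molding. Recently, however, methods that apply artificial intelligence techniques on top of simulation have been studied. In our previous study, we compared deformation results across injection molding process conditions using various machine learning techniques, ultimately built an MLP (Multi-Layer Perceptron) prediction model, and obtained optimization results using HMA (Hybrid Metaheuristic Algorithm). However, although the MLP predicts well, it behaves like a black box and offers little explanation of its decision process. In this study, for a radiator tank part, we generated data by numerical analysis with the injection molding analysis software Autodesk Moldflow 2018, built several machine learning models with the machine learning software RapidMiner Studio version 9.5, and compared their root mean square errors. Judged by Root Mean Square Error (RMSE), the decision tree showed good predictive performance compared with the other machine learning techniques. Increasing the Maximal Depth, which determines the size of the decision tree, can refine the classification criteria but also increases complexity. Using the decision tree to select intermediate values that satisfy the constraints and then running the simulation yielded a 7.7% improvement over running the existing simulation alone.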

패턴의 변화를 가지는 연속성 데이터를 위한 스트리밍 의사결정나무 (Streaming Decision Tree for Continuity Data with Changed Pattern)

  • 윤태복;심학준;이지형;최영미
    • 한국지능시스템학회논문지 / Vol. 20, No. 1 / pp.94-100 / 2010
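  • Data mining is mainly used to extract patterns and discover meaningful information from data collected from an environment. However, existing methods assume that analysis takes place after data collection is complete, which makes it difficult to reflect changes in patterns over time. This paper introduces the Streaming Decision Tree (SDT) method for analyzing stream data, which is characterized by continuity, large scale, and changing patterns. SDT defines continuously arriving data as blocks and extracts rules from each block using decision tree learning. The extracted rules are combined by considering their time of occurrence, frequency, and contradictions. In the experiments, time-series data were analyzed and appropriate results were confirmed.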
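
The block-then-merge idea can be sketched as follows. The abstract does not give SDT's actual merging rule, so the scoring used here (support multiplied by an exponential recency weight, with the highest-scoring label winning a contradiction) is an assumption, as are the function names, the `decay` parameter, and the toy data. A majority vote per condition stands in for real decision tree learning.

```python
from collections import Counter, defaultdict


def extract_rules(block):
    """Stand-in for tree learning: map each condition to (majority label, support)."""
    by_condition = defaultdict(Counter)
    for condition, label in block:
        by_condition[condition][label] += 1
    return {c: counts.most_common(1)[0] for c, counts in by_condition.items()}


def merge_rules(blocks, decay=0.5):
    """Combine per-block rules, weighting each by frequency and recency."""
    scores = defaultdict(Counter)
    newest = len(blocks) - 1
    for t, block in enumerate(blocks):
        weight = decay ** (newest - t)  # older blocks count for less
        for condition, (label, support) in extract_rules(block).items():
            scores[condition][label] += weight * support
    # A contradiction (same condition, different labels) goes to the top score.
    return {c: s.most_common(1)[0][0] for c, s in scores.items()}


# Hypothetical stream split into two blocks; the "hot" pattern changes over time.
blocks = [
    [("hot", "on"), ("hot", "on"), ("cold", "off")],
    [("hot", "off"), ("hot", "off"), ("hot", "off"), ("cold", "off")],
]
rules = merge_rules(blocks)
print(rules)
```

With this weighting, a recent rule overrides an older contradictory one unless the older rule was observed far more often.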

Inter-Process Correlation Model based Hybrid Framework for Fault Diagnosis in Wireless Sensor Networks

  • Zafar, Amna;Akbar, Ali Hammad;Akram, Beenish Ayesha
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 13, No. 2 / pp.536-564 / 2019
  • Soft faults are inherent in wireless sensor networks (WSNs) due to external and internal errors. Process failures in a protocol stack are caused by errors on various layers. In this work, the impact of errors and channel misbehavior on process execution is investigated to provide an error classification mechanism. Considering the implementation of the WSN protocol stack, inter-process correlations of stacked and peer layer processes are modeled. The proposed model is realized through local and global decision trees for fault diagnosis. A hybrid framework is proposed that implements the local decision tree on sensor nodes and the global decision tree on the diagnostic cluster head. The local decision tree diagnoses critical failures due to errors in stacked processes at the node level. The global decision tree diagnoses critical failures due to errors in peer layer processes at the network level. The proposed model has been analyzed using fault tree analysis. The framework has been implemented in Castalia. Simulation results validate the inter-process correlation model-based fault diagnosis. The hybrid framework distributes the processing load across sensor nodes and the diagnostic cluster head in a decentralized way, reducing communication overhead.

연결강도분석을 이용한 통합된 부도예측용 신경망모형 (An Integrated Neural Network Model for Bankruptcy Prediction Using Link Weight Analysis)

  • 이웅규;임영하
    • 한국정보시스템학회:학술대회논문집 / 한국정보시스템학회 2002년도 추계학술대회 / pp.289-312 / 2002
  • This study proposes a link weight analysis approach for choosing input variables and an integrated model for more accurate bankruptcy prediction. The link weight analysis approach chooses input variables by analyzing each input node's link weight, defined as the absolute value of the link weights between an input node and the hidden layer. It comprises two methods: weak-linked neurons elimination and strong-linked neurons selection. The integrated model is a combined model that adapts the bagging method, using the average value of four models: the optimal weak-linked neurons elimination method, the optimal strong-linked neurons selection method, the decision tree model, and MDA. The methods proposed in this study (the optimal strong-linked neurons selection method, the optimal weak-linked neurons elimination method, and the integrated model) show much higher accuracy than MDA and the decision tree model. In particular, the integrated model shows much higher accuracy than MDA and the decision tree model, and slightly higher accuracy than the optimal weak-linked neurons elimination method and the optimal strong-linked neurons selection method.
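
As a rough illustration of link weight analysis, the sketch below scores each input node by the sum of the absolute weights connecting it to the hidden layer, then either keeps the strongest inputs or drops the weakest. Summing over hidden units, the cut-off values, and the variable names are assumptions; the abstract only says that absolute link weights between an input node and the hidden layer are analyzed.

```python
def link_strengths(weights):
    """weights[i][h] is the trained weight from input node i to hidden unit h."""
    return [sum(abs(w) for w in row) for row in weights]


def select_strong(weights, names, k):
    """Strong-linked selection: keep the k inputs with the largest strength."""
    strengths = link_strengths(weights)
    ranked = sorted(range(len(names)), key=lambda i: strengths[i], reverse=True)
    return [names[i] for i in sorted(ranked[:k])]  # preserve original order


def eliminate_weak(weights, names, threshold):
    """Weak-linked elimination: drop inputs whose strength is below threshold."""
    strengths = link_strengths(weights)
    return [name for name, s in zip(names, strengths) if s >= threshold]


# Hypothetical weights for three financial ratios and two hidden units.
weights = [[0.5, -1.5], [0.25, 0.25], [-1.0, 0.5]]
names = ["debt_ratio", "firm_age", "cash_flow"]
print(link_strengths(weights))
print(select_strong(weights, names, k=2))
print(eliminate_weak(weights, names, threshold=1.0))
```

In practice the network would be retrained on the reduced input set and its accuracy compared with the full model.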

국민의료비 결정요인 및 영향력 분석 (The Determinants of National Health Expenditure: A Decision Tree Analysis)

  • 이견직;정영호
    • 보건행정학회지 / Vol. 12, No. 3 / pp.99-111 / 2002
  • This paper identifies the determinants of National Health Expenditure (NHE) and groups OECD countries positioned under the same conditions, using decision tree analysis. The major findings are as follows. The influence of income level on NHE was 58.35% in 1985, 65.37% in 1990, 66.90% in 1995, and 66.47% in 1997. The influence of the public share in NHE rose over that period: 19.50% in 1985, 19.91% in 1990, 22.81% in 1995, and 26.88% in 1997. Together, the two factors (income level and public share) account for most of NHE: 77.85% in 1985, 85.28% in 1990, 89.71% in 1995, and 93.35% in 1997. Our results support the hypothesis that NHE can be explained mostly by income level and show that public share is negatively correlated with the growth of NHE.

CANCER CLASSIFICATION AND PREDICTION USING MULTIVARIATE ANALYSIS

  • Shon, Ho-Sun;Lee, Heon-Gyu;Ryu, Keun-Ho
    • 대한원격탐사학회:학술대회논문집 / 대한원격탐사학회 2006년도 Proceedings of ISRS 2006 PORSEC Volume II / pp.706-709 / 2006
  • Cancer is one of the major causes of death; however, the survival rate can be increased if it is discovered at an early stage and treated in time. According to 2002 World Health Organization statistics, breast cancer was the most prevalent of all cancers occurring in women worldwide, and it accounts for 16.8% of all cancers affecting Korean women today. To classify breast cancer as benign or malignant, this study applied discriminant analysis and the decision tree method of data mining to breast cancer data published on the web. Discriminant analysis is a statistical method that derives discriminant criteria and a discriminant function separating population groups on the basis of observations obtained from two or more groups, and then uses them to assign a given observation to one of those groups. The decision tree analyzes records of data collected in the past and represents the patterns among them, namely the combinations of attributes that characterize each class, as a tree-shaped classification model. This type of analysis may provide systematic information in advance on the factors that cause breast cancer and help prevent the risk of recurrence after surgery.

의사결정트리를 활용한 황사예보의 경제적 가치 분석-의약품 재고관리문제를 중심으로 (Economic Value Analysis of Asian Dust Forecasts Using Decision Tree-Focused on Medicine Inventory Management)

  • 윤승철;이기광
    • 산업경영시스템학회지 / Vol. 37, No. 1 / pp.120-126 / 2014
  • This paper deals with the economic value analysis of meteorological forecasts for a hypothetical inventory decision-making situation in the pharmaceutical industry. The value of Asian dust (AD) forecasts is assessed in terms of expected profit by using a decision tree derived from a specific payoff structure. The forecast user is assumed to determine the inventory level by considering base profit, inventory cost, and lost sales cost. We estimate the information value of AD forecasts by comparing decision-making with and without the AD forecast. The proposed method is verified on real data of AD forecasts and events in Seoul during the period 2004-2008. The results indicate that AD forecasts can benefit forecast users, with the size of the benefit varying according to the relative rate of inventory and lost sales cost.
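
The comparison of decisions with and without a forecast can be illustrated with a small expected-value calculation over a two-branch decision tree. The payoff table, the event probability, and the forecast hit and false-alarm rates below are hypothetical numbers chosen for the example; the paper's own payoff structure (base profit, inventory cost, lost sales cost) is not given in the abstract.

```python
def best_value(payoff, w_event, w_no_event):
    """Best action's payoff, weighted by the probabilities of event / no event.

    payoff maps an action name to (profit if AD occurs, profit if it does not).
    The weights may be joint probabilities for one forecast branch.
    """
    return max(w_event * pe + w_no_event * pn for pe, pn in payoff.values())


def forecast_value(payoff, p_event, hit_rate, false_alarm_rate):
    """Expected profit with the forecast minus expected profit without it."""
    without = best_value(payoff, p_event, 1 - p_event)
    # Branch 1: a warning is issued; branch 2: no warning is issued.
    warned = best_value(payoff, p_event * hit_rate,
                        (1 - p_event) * false_alarm_rate)
    quiet = best_value(payoff, p_event * (1 - hit_rate),
                       (1 - p_event) * (1 - false_alarm_rate))
    return warned + quiet - without


# Hypothetical payoffs: (profit on an AD day, profit on a normal day).
payoff = {"extra_stock": (100, 60), "normal_stock": (40, 80)}
perfect = forecast_value(payoff, p_event=0.2, hit_rate=1.0, false_alarm_rate=0.0)
useless = forecast_value(payoff, p_event=0.2, hit_rate=0.5, false_alarm_rate=0.5)
print(perfect, useless)
```

Using joint probabilities in each branch avoids dividing by the probability of a warning, so the calculation stays valid even when one branch never occurs.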