• Title/Summary/Keyword: Decision Tree analysis

Decision Tree-Based Feature-Selective Neural Network Model: Case of House Price Estimation (의사결정나무를 활용한 신경망 모형의 입력특성 선택: 주택가격 추정 사례)

  • Yoon Han-Seong
    • Journal of Korea Society of Digital Industry and Information Management / v.19 no.1 / pp.109-118 / 2023
  • Data-driven methods are increasingly used to estimate or predict housing prices, and neural network models and decision trees from the big data field are widely used for this purpose. Neural network models are often judged superior to conventional statistical models in estimation or prediction accuracy. However, choosing the input layer of a neural network model, that is, the type and number of input features, is ambiguous, and decision trees are sometimes used to address this weakness. In this paper, we evaluate existing ways of using decision trees and propose a method that uses decision trees to prioritize input features for neural network models. The method can complement or be combined with neural network and decision tree analysis, and its validity was confirmed by applying it to house price estimation. Several comparisons indicate that selecting appropriate input features according to priority can increase the estimation power of the model.
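
One way to picture decision-tree-based feature prioritization is the following minimal Python sketch. It is not the paper's procedure: it assumes a depth-1 regression tree (a single threshold split) per feature and ranks features by the variance reduction of their best split. The feature names, the toy data, and the helpers `best_split_gain` and `rank_features` are all hypothetical.

```python
from statistics import pvariance

def best_split_gain(xs, ys):
    """Largest per-sample variance reduction from one threshold split on xs."""
    n = len(ys)
    total = pvariance(ys) * n  # total sum of squared deviations
    best = 0.0
    # Candidate thresholds: every distinct value except the largest (split is x <= t).
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        within = pvariance(left) * len(left) + pvariance(right) * len(right)
        best = max(best, total - within)
    return best / n

def rank_features(rows, ys, names):
    """Order feature names from most to least informative split."""
    gains = {
        name: best_split_gain([row[i] for row in rows], ys)
        for i, name in enumerate(names)
    }
    return sorted(names, key=lambda name: gains[name], reverse=True)

# Hypothetical toy data: (area in m^2, building age in years, floor) and a price.
names = ["area", "age", "floor"]
rows = [(60, 20, 3), (85, 5, 10), (110, 15, 7), (135, 2, 1), (70, 30, 12), (120, 10, 5)]
prices = [300, 450, 560, 700, 330, 610]

ranking = rank_features(rows, prices, names)
print(ranking)
```

Under this assumption, the top-ranked names would be fed to the neural network first, and lower-ranked ones added only if they improve validation error.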

A study on decision tree creation using intervening variable (매개 변수를 이용한 의사결정나무 생성에 관한 연구)

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.22 no.4 / pp.671-678 / 2011
  • Data mining searches for interesting relationships among items in a given database. Data mining methods include decision trees, association rules, clustering, neural networks, and so on. The decision tree approach is most useful in classification problems and divides the search space into rectangular regions. Decision tree algorithms are used extensively for data mining in many domains, such as retail target marketing and customer classification. Depending on the model-building criteria and the number of input variables, the resulting decision tree model can be complicated. In particular, building and analyzing a model is difficult when there are many input variables. In this study, we examine decision tree creation using an intervening variable. We apply the approach to real data to suggest a method for removing unnecessary input variables from the created model and to examine its efficiency.

Development of Discriminant Analysis System by Graphical User Interface of Visual Basic

  • Lee, Yong-Kyun;Shin, Young-Jae;Cha, Kyung-Joon
    • Journal of the Korean Data and Information Science Society / v.18 no.2 / pp.447-456 / 2007
  • Recently, multivariate statistical analysis has been used to extract meaningful information from various data. In this paper, we develop a multivariate statistical analysis system that combines Fisher discriminant analysis, logistic regression, neural networks, and decision trees using Visual Basic 6.0.

A Study on the Improvement of Injection Molding Process Using CAE and Decision-tree (CAE와 Decision-tree를 이용한 사출성형 공정개선에 관한 연구)

  • Hwang, Soonhwan;Han, Seong-Ryeol;Lee, Hoojin
    • Journal of the Korea Academia-Industrial cooperation Society / v.22 no.4 / pp.580-586 / 2021
  • The CAT methodology is a numerical analysis technique using CAE. Recently, methods that apply artificial intelligence techniques to simulation have been studied. A previous study compared deformation results across injection molding processes using a machine learning technique. Although the multilayer perceptron (MLP) has excellent prediction performance, it does not explain its decision process and behaves like a black box. In this study, data was generated using Autodesk Moldflow 2018, an injection molding analysis package. Several machine learning models were developed using RapidMiner version 9.5, a machine learning platform, and their root mean square errors (RMSE) were compared. The decision tree showed better prediction performance, in terms of RMSE, than the other machine learning techniques. The classification criteria can be increased through the Maximal Depth parameter, which determines the size of the decision tree, but the complexity also increases. The simulation showed that selecting an intermediate value that satisfies the constraint based on the changed position gave a 7.7% improvement over the previous simulation.

Streaming Decision Tree for Continuity Data with Changed Pattern (패턴의 변화를 가지는 연속성 데이터를 위한 스트리밍 의사결정나무)

  • Yoon, Tae-Bok;Sim, Hak-Joon;Lee, Jee-Hyong;Choi, Young-Mee
    • Journal of the Korean Institute of Intelligent Systems / v.20 no.1 / pp.94-100 / 2010
  • Data mining is mainly used for extracting patterns and discovering information from collected data. However, previous methods have difficulty reflecting patterns that change over time. In this paper, we introduce the Streaming Decision Tree (SDT), which analyzes data characterized by continuity, large scale, and changing patterns. SDT divides continuous data into blocks and extracts rules using a decision tree learning method. The extracted rules are combined by considering their time of occurrence, frequency, and contradictions. In experiments, we applied SDT to time series data and confirmed reasonable results.

Inter-Process Correlation Model based Hybrid Framework for Fault Diagnosis in Wireless Sensor Networks

  • Zafar, Amna;Akbar, Ali Hammad;Akram, Beenish Ayesha
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.2 / pp.536-564 / 2019
  • Soft faults are inherent in wireless sensor networks (WSNs) due to external and internal errors. The failure of processes in a protocol stack is caused by errors on various layers. In this work, the impact of errors and channel misbehavior on process execution is investigated to provide an error classification mechanism. Considering the implementation of the WSN protocol stack, inter-process correlations of stacked and peer layer processes are modeled. The proposed model is realized through local and global decision trees for fault diagnosis. A hybrid framework is proposed that implements the local decision tree on sensor nodes and the global decision tree on the diagnostic cluster head. The local decision tree diagnoses critical failures due to errors in stacked processes at the node level. The global decision tree diagnoses critical failures due to errors in peer layer processes at the network level. The proposed model has been analyzed using fault tree analysis, and the framework has been implemented in Castalia. Simulation results validate the inter-process correlation model-based fault diagnosis. The hybrid framework distributes the processing load across sensor nodes and the diagnostic cluster head in a decentralized way, reducing communication overhead.

An Integrated Neural Network Model for Bankruptcy Prediction Using Link Weight Analysis (연결강도분석을 이용한 통합된 부도예측용 신경망모형)

  • Lee Woongkyu;Lim Young Ha
    • Proceedings of the Korea Association of Information Systems Conference / 2002.11a / pp.289-312 / 2002
  • This study proposes a link weight analysis approach for choosing input variables and an integrated model for more accurate bankruptcy prediction. The link weight analysis approach chooses input variables by analyzing each input node's link weight, defined as the absolute value of the link weights between an input node and the hidden layer. The approach comprises two methods: weak-linked neuron elimination and strong-linked neuron selection. The integrated model is a bagging-style combination that uses the average value of four models: the optimal weak-linked neuron elimination method, the optimal strong-linked neuron selection method, a decision tree model, and MDA. The methods proposed in this study (the optimal strong-linked neuron selection method, the optimal weak-linked neuron elimination method, and the integrated model) show much higher accuracy than MDA and the decision tree model. In particular, the integrated model shows slightly higher accuracy than the optimal weak-linked neuron elimination method and the optimal strong-linked neuron selection method.
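
The two selection rules described in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes that an input node's link weight is the sum of the absolute input-to-hidden weights, and that `keep` and `drop` are fixed counts. The variable names and weight values are hypothetical.

```python
def link_strengths(weights):
    """weights[i][j] is the weight from input node i to hidden node j.

    Assumption: an input's strength is the sum of its absolute link weights.
    """
    return [sum(abs(w) for w in row) for row in weights]

def select_strong(names, weights, keep):
    """Strong-linked selection: keep the `keep` inputs with the largest strength."""
    strengths = link_strengths(weights)
    order = sorted(range(len(names)), key=lambda i: strengths[i], reverse=True)
    return [names[i] for i in sorted(order[:keep])]  # preserve original order

def eliminate_weak(names, weights, drop):
    """Weak-linked elimination: remove the `drop` inputs with the smallest strength."""
    strengths = link_strengths(weights)
    weakest = set(sorted(range(len(names)), key=lambda i: strengths[i])[:drop])
    return [name for i, name in enumerate(names) if i not in weakest]

# Hypothetical trained weights for 4 inputs and 3 hidden nodes.
names = ["debt_ratio", "current_ratio", "roa", "firm_age"]
weights = [
    [0.90, -1.20, 0.40],
    [0.10, -0.05, 0.20],
    [-0.70, 0.80, -0.60],
    [0.02, 0.10, -0.03],
]

strong = select_strong(names, weights, keep=2)
kept = eliminate_weak(names, weights, drop=1)
print(strong, kept)
```

The two rules differ only in which end of the ranking they cut from, so on the same network they can retain different numbers of inputs.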

The Determinants of National Health Expenditure: A Decision Tree Analysis (국민의료비 결정요인 및 영향력 분석)

  • 이견직;정영호
    • Health Policy and Management / v.12 no.3 / pp.99-111 / 2002
  • This paper identifies the determinants of National Health Expenditure (NHE) and groups OECD countries that share similar conditions using decision tree analysis. The major findings are as follows. The influence of income level on NHE was 58.35% in 1985, 65.37% in 1990, 66.90% in 1995, and 66.47% in 1997. The influence of the public share in NHE increased over that period: 19.50% in 1985, 19.91% in 1990, 22.81% in 1995, and 26.88% in 1997. Together, the two factors (income level and public share) account for most of NHE: 77.85% in 1985, 85.28% in 1990, 89.71% in 1995, and 93.35% in 1997. Our results support the hypothesis that NHE can be explained mostly by income level and show that public share is negatively correlated with the growth of NHE.
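
The combined figures in this abstract are the sums of the two reported influence values for each year. A short Python check, using only the numbers quoted above (the dictionary names are illustrative), makes the arithmetic explicit:

```python
# Influence values (%) as quoted in the abstract.
income = {1985: 58.35, 1990: 65.37, 1995: 66.90, 1997: 66.47}
public = {1985: 19.50, 1990: 19.91, 1995: 22.81, 1997: 26.88}

# Combined influence of the two factors per year, rounded to two decimals.
combined = {year: round(income[year] + public[year], 2) for year in income}
print(combined)
```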

CANCER CLASSIFICATION AND PREDICTION USING MULTIVARIATE ANALYSIS

  • Shon, Ho-Sun;Lee, Heon-Gyu;Ryu, Keun-Ho
    • Proceedings of the KSRS Conference / v.2 / pp.706-709 / 2006
  • Cancer is one of the major causes of death; however, the survival rate can be increased if it is discovered at an early stage and treated in time. According to 2002 statistics from the World Health Organization, breast cancer was the most prevalent of all cancers occurring in women worldwide, and it accounts for 16.8% of all cancers affecting Korean women today. To classify breast cancer as benign or malignant, this study applied discriminant analysis and the decision tree method from data mining to breast cancer data published on the web. Discriminant analysis is a statistical method that derives discriminant criteria and a discriminant function separating population groups, based on observations obtained from two or more population groups, and uses them to assign an observation to the appropriate population group. The decision tree analyzes records of data collected in the past and expresses the patterns among them, namely the combinations of attributes that characterize each class, as a classification tree. This type of analysis may provide systematic information in advance on the factors that cause breast cancer and help prevent the risk of recurrence after surgery.

Economic Value Analysis of Asian Dust Forecasts Using Decision Tree-Focused on Medicine Inventory Management (의사결정트리를 활용한 황사예보의 경제적 가치 분석-의약품 재고관리문제를 중심으로)

  • Yoon, Seung-Chul;Lee, Ki-Kwang
    • Journal of Korean Society of Industrial and Systems Engineering / v.37 no.1 / pp.120-126 / 2014
  • This paper deals with the economic value analysis of meteorological forecasts for a hypothetical inventory decision-making situation in the pharmaceutical industry. The value of Asian dust (AD) forecasts is assessed in terms of the expected value of profits by using a decision tree, which is derived from the specific payoff structure. The forecast user is assumed to determine the inventory level by considering base profit, inventory cost, and lost sales cost. We estimate the information value of AD forecasts by comparing decision-making with and without the AD forecast. The proposed method is verified on real data of AD forecasts and events in Seoul during the period 2004-2008. The results indicate that AD forecasts can benefit forecast users, with the size of the benefit varying according to the relative rate of inventory cost and lost sales cost.
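
The with/without comparison described in this abstract can be illustrated with a small expected-value calculation in Python. This is only a sketch under assumed numbers: the payoff table, the probabilities, and the two inventory actions ("high", "low") are hypothetical and are not taken from the paper.

```python
def expected_profit(action, p_event, payoff):
    """Expected profit of an inventory action given the probability of an AD event."""
    return p_event * payoff[(action, True)] + (1 - p_event) * payoff[(action, False)]

def best_expected(p_event, payoff, actions):
    """Expected profit of the best action at one decision node."""
    return max(expected_profit(a, p_event, payoff) for a in actions)

def forecast_value(p_warn, p_event_given_warn, p_event_given_clear, payoff, actions):
    """Expected profit with the forecast minus expected profit without it."""
    # Overall event probability implied by the forecast probabilities.
    p_event = p_warn * p_event_given_warn + (1 - p_warn) * p_event_given_clear
    without = best_expected(p_event, payoff, actions)
    with_forecast = (
        p_warn * best_expected(p_event_given_warn, payoff, actions)
        + (1 - p_warn) * best_expected(p_event_given_clear, payoff, actions)
    )
    return with_forecast - without

# Hypothetical payoffs: base profit 100, inventory cost 20 for high stock,
# lost sales cost 50 when stock is low and an AD event occurs.
actions = ["high", "low"]
payoff = {
    ("high", True): 80, ("high", False): 80,
    ("low", True): 50, ("low", False): 100,
}

value = forecast_value(
    p_warn=0.2, p_event_given_warn=0.8, p_event_given_clear=0.05,
    payoff=payoff, actions=actions,
)
print(round(value, 2))
```

Changing the inventory cost relative to the lost sales cost in the payoff table changes which action wins at each node, which is why the forecast value depends on their relative rate.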