• Title/Summary/Keyword: Model Tree

A study on decision tree creation using intervening variable (매개 변수를 이용한 의사결정나무 생성에 관한 연구)

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.22 no.4 / pp.671-678 / 2011
  • Data mining searches for interesting relationships among items in a given database. Its methods include decision trees, association rules, clustering, and neural networks. The decision tree approach is most useful for classification problems and divides the search space into rectangular regions. Decision tree algorithms are used extensively for data mining in domains such as retail target marketing and customer classification. Depending on the model-building criteria and the number of input variables, however, the resulting decision tree can be complicated; in particular, building and analyzing the model becomes difficult when there are many input variables. In this study, we examine decision tree creation using an intervening variable. We apply the approach to real data, propose a method that removes unnecessary input variables from the created model, and examine its efficiency.

A Development of Suicidal Ideation Prediction Model and Decision Rules for the Elderly: Decision Tree Approach (의사결정나무 기법을 이용한 노인들의 자살생각 예측모형 및 의사결정 규칙 개발)

  • Kim, Deok Hyun;Yoo, Dong Hee;Jeong, Dae Yul
    • The Journal of Information Systems / v.28 no.3 / pp.249-276 / 2019
  • Purpose: The purpose of this study is to develop a prediction model and decision rules for suicidal ideation among the elderly, based on the Korean Welfare Panel survey data. Using this data, we obtained many decision rules for predicting suicidal ideation in the elderly. Design/methodology/approach: This study used classification analysis based on the decision tree technique to derive the decision rules. Weka 3.8 was used as the data mining tool, with the J48 decision tree algorithm, also known as C4.5. Of the total data, 66.6% was used as training data and the remainder as validation data. We considered all candidate variables identified in previous studies on predicting suicidal ideation in the elderly; in the end, 99 variables including the target variable were used. Classification analysis was performed after applying backward elimination and a sampling technique for data balancing. Findings: There were significant differences between the data sets. The selected data sets yielded different decision trees and several rules. Based on the decision tree method, we derived rules for suicide prevention. The decision tree yields rules for suicidal ideation not only in the depressed group but also in the non-depressed group. In addition, while developing the predictive model, we directly identified the over-fitting problem caused by data imbalance by applying data balancing. We conclude that the data must be balanced on the target variable in order to perform a correct classification analysis without over-fitting. Moreover, even with data balancing applied, the prediction rate was not inferior to that of the biased prediction model.
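
The data balancing and percentage split mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not name the balancing method, so random undersampling of the majority class is assumed here. The function names `undersample` and `percentage_split`, the toy records, and the seed are hypothetical, and the study itself used Weka 3.8 rather than Python.

```python
import random
from collections import defaultdict


def undersample(rows, label_key, seed=0):
    """Balance classes by randomly dropping rows from the larger classes.

    Assumption: random undersampling; the abstract does not say which
    balancing technique was actually used.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)
    smallest = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced


def percentage_split(rows, train_fraction=0.666):
    """Split rows, in order, into a training part and a validation part."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Hypothetical toy data: 90 negative and 10 positive records.
records = [{"id": i, "ideation": "no"} for i in range(90)]
records += [{"id": 90 + i, "ideation": "yes"} for i in range(10)]

balanced = undersample(records, "ideation")
train, valid = percentage_split(balanced)
print(len(balanced), len(train), len(valid))
```

Undersampling discards majority-class records, which is simple but wasteful; oversampling the minority class would be an equally plausible reading of "data balancing".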

The Education Program Model for the Thinking Extension Ability of the Gifted in Information Based on Game Tree (게임 트리에 기반한 정보영재의 사고력 신장을 위한 교육 프로그램 모형)

  • Jung, Deok-Gil;Kim, Byung-Joe
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2007.06a / pp.310-314 / 2007
  • In this paper, we develop a thinking-extension education program for information-gifted students and demonstrate the validity and effectiveness of the proposed model using the Tic-tac-toe problem as a practical example. The model consists of four phases, with the game tree as the data structure and game tree search as the control structure; game tree search forms the basis of the program. The model helps students learn to represent a problem as a tree structure and to solve it using game tree search. The abilities this program aims to develop in information-gifted students include fluency, perceptiveness, originality, concentration, imagination, analytical skills, pattern recognition, spatial sense, synthesis, and problem-solving.
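
As an illustration of the game tree idea in the abstract above, the sketch below represents Tic-tac-toe positions as tree nodes and searches them exhaustively. The abstract does not say which search method the program teaches, so plain minimax is an assumption, and the board encoding (a 9-character string) and all function names are hypothetical.

```python
from functools import lru_cache

# Index triples for the eight winning lines of a 3x3 board.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if a line is complete, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def minimax(board, player):
    """Value of `board` with `player` to move: +1 X wins, -1 O wins, 0 draw."""
    won = winner(board)
    if won is not None:
        return 1 if won == "X" else -1
    if " " not in board:
        return 0
    other = "O" if player == "X" else "X"
    # Each empty cell is an edge to a child node of the game tree.
    values = [minimax(board[:i] + player + board[i + 1:], other)
              for i, cell in enumerate(board) if cell == " "]
    return max(values) if player == "X" else min(values)


def best_move(board, player):
    """Pick the empty cell whose child node has the best minimax value."""
    other = "O" if player == "X" else "X"
    moves = [i for i, cell in enumerate(board) if cell == " "]

    def score(i):
        return minimax(board[:i] + player + board[i + 1:], other)

    return max(moves, key=score) if player == "X" else min(moves, key=score)


print(minimax(" " * 9, "X"))        # 0: perfect play from the empty board is a draw
print(best_move("XX OO    ", "X"))  # 2: X completes the top row
```

Here the string is the data structure (one node of the tree) and the recursion is the control structure (the search), mirroring the split the abstract describes.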

An Integrated Neural Network Model for Bankruptcy Prediction Using Link Weight Analysis (연결강도분석을 이용한 통합된 부도예측용 신경망모형)

  • Lee Woongkyu;Lim Young Ha
    • Proceedings of the Korea Association of Information Systems Conference / 2002.11a / pp.289-312 / 2002
  • This study proposes a link weight analysis approach for choosing input variables and an integrated model for more accurate bankruptcy prediction. The link weight analysis approach chooses input variables by analyzing each input node's link weight, defined as the absolute value of the link weights between an input node and the hidden layer. The approach comprises two methods: weak-linked neuron elimination and strong-linked neuron selection. The integrated model is a combined model that adapts the bagging method, using the average value of four models: the optimal weak-linked neuron elimination method, the optimal strong-linked neuron selection method, the decision tree model, and MDA. The methods proposed in this study (the optimal strong-linked neuron selection method, the optimal weak-linked neuron elimination method, and the integrated model) show much higher accuracy than MDA and the decision tree model. In particular, the integrated model shows slightly higher accuracy than the optimal weak-linked neuron elimination and optimal strong-linked neuron selection methods.
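
The link weight idea in the abstract above can be sketched as follows. This is an illustration only: it assumes that an input node's score is the sum of the absolute weights on its links to all hidden nodes, and that selection and elimination use a top-k rule and a threshold respectively. The abstract fixes none of these details, and the function names and weight values are hypothetical.

```python
def link_weights(input_to_hidden):
    """Score each input node by the sum of |w| on its links to hidden nodes.

    input_to_hidden[i][j] is the weight from input i to hidden node j.
    Assumption: absolute weights are summed over the hidden layer.
    """
    return [sum(abs(w) for w in row) for row in input_to_hidden]


def select_strong(scores, k):
    """Strong-linked selection: keep the k inputs with the largest scores."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def eliminate_weak(scores, threshold):
    """Weak-linked elimination: drop inputs whose score is below threshold."""
    return [i for i, s in enumerate(scores) if s >= threshold]


# Hypothetical weights of a trained network: 4 inputs, 3 hidden nodes.
weights = [
    [0.5, -1.5, 0.25],     # input 0
    [0.125, 0.0, -0.125],  # input 1
    [-2.0, 1.0, 0.5],      # input 2
    [0.25, -0.25, 0.5],    # input 3
]

scores = link_weights(weights)
print(scores)                                  # [2.25, 0.25, 3.5, 1.0]
print(select_strong(scores, k=2))              # [0, 2]
print(eliminate_weak(scores, threshold=0.5))   # [0, 2, 3]
```

The two rules differ in what they fix in advance: selection fixes how many inputs survive, while elimination fixes how weak an input may be.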

A Study on Analysis Method of Warranty Data Using Multivariate Model (다변량 모형을 이용한 보증데이터 분석 방법 연구)

  • Kim, Jong-Gurl;Sung, Ki-Woo
    • Journal of the Korea Safety Management & Science / v.17 no.2 / pp.241-247 / 2015
  • Warranty data analysis serves two purposes: failure cause analysis and life prediction. In this paper, we first apply multivariate analysis methods, which can produce estimates that account for various factors, to the failure cause analysis of warranty data; in particular, we apply the tree model and the Cox model. The advantage of the tree model is that its results are easier to interpret than those of other models, while the Cox model can express risk quantitatively. Second, we propose a multivariate life prediction model (AFT) that considers a variety of factors. We confirm its usability by applying it to actual warranty data.

Study on the Prediction Model for Employment of University Graduates Using Machine Learning Classification (머신러닝 기법을 활용한 대졸 구직자 취업 예측모델에 관한 연구)

  • Lee, Dong Hun;Kim, Tae Hyung
    • The Journal of Information Systems / v.29 no.2 / pp.287-306 / 2020
  • Purpose: Youth unemployment is a persistent social problem in Korea. In this study, we build models that predict the employment of college graduates using three machine learning techniques (decision tree, random forest, and artificial neural network) and compare the models' performance on the prediction results. Design/methodology/approach: We first acquired the college graduates' vocational path survey data, then processed it by selecting the independent variables and setting up the dependent variable. We used R to create decision tree, random forest, and artificial neural network models and predicted with each model whether college graduates were employed. Finally, the performance of each model was compared and evaluated. Findings: The random forest model had the highest performance, and the artificial neural network model differed only narrowly in performance from the decision tree model. In the decision tree model, the key nodes were whether graduates receive financial support from their families, their major field, the route by which they obtain job information at university, the importance of earned income when choosing a job, and the location of the university they graduated from. In the random forest model, the variables identified as important were whether graduates receive financial support from their families, their major, the route by which they obtain job information, the degree of irritability felt over the past month, and the location of the university they graduated from.

A Study on the Categorization of Context-dependent Phoneme using Decision Tree Modeling (결정 트리 모델링에 의한 한국어 문맥 종속 음소 분류 연구)

  • 이선정
    • Journal of the Korea Computer Industry Society / v.2 no.2 / pp.195-202 / 2001
  • In this paper, we study how to model a phoneme whose acoustic features change depending on both its left and right neighboring phonemes. To this end, we compare two algorithms: a unit reduction algorithm and decision tree modeling. The unit reduction algorithm uses only statistical information, whereas decision tree modeling uses statistical information and Korean acoustic information together. We focus in particular on how to model context-dependent phonemes with decision tree modeling. Finally, we report the recognition rate obtained when context-dependent phonemes are derived by decision tree modeling.

Simulation-Based Risk Analysis of Integrated Power System (시뮬레이션을 이용한 통합전력시스템의 위험도 분석)

  • Lee, Ji Young;Han, Young Jin;Yun, Won Young;Bin, Jae Goo
    • Journal of Korean Institute of Industrial Engineers / v.42 no.2 / pp.151-164 / 2016
  • In this paper, we address risk analysis for an integrated power system (IPS) and propose a simulation model that combines the fault tree and the event tree in order to estimate system availability and risk level together. First, basic information such as operational scenarios, physical structure, and safety systems is explained in order to construct the fault tree and event tree of the IPS. Next, we propose a discrete-event simulation model that uses the next-event time-advance technique to advance the simulation clock. State transition and activity diagrams are also presented to represent the relationships between the objects. In numerical examples, redundancy allocation is considered as a way to decrease the risk level of the IPS.
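
The next-event time-advance technique named in the abstract above can be sketched in a few lines. This is not the paper's model: it assumes a series system (in fault tree terms, a single OR gate over component failures), uses fixed failure and repair durations so that the result is deterministic, and omits the event tree entirely. The function name `simulate_availability` and the component data are hypothetical.

```python
import heapq


def simulate_availability(components, horizon):
    """Next-event time-advance simulation of a series system.

    components maps a name to (time_to_failure, time_to_repair). Fixed,
    positive durations are a simplifying assumption. The system is up only
    while every component is up.
    """
    up = {name: True for name in components}
    events = [(ttf, name) for name, (ttf, _) in components.items()]
    heapq.heapify(events)
    clock, uptime = 0.0, 0.0
    while events:
        time, name = heapq.heappop(events)
        time = min(time, horizon)
        if all(up.values()):
            uptime += time - clock   # system was up since the last event
        clock = time                 # jump straight to the next event time
        if clock >= horizon:
            break
        ttf, ttr = components[name]
        up[name] = not up[name]      # a failure, or a completed repair
        delay = ttf if up[name] else ttr
        heapq.heappush(events, (clock + delay, name))
    return uptime / horizon


# One component that runs 8 time units and is repaired in 2.
print(simulate_availability({"generator": (8, 2)}, horizon=100))
# Two components in series with different failure and repair times.
print(simulate_availability({"A": (8, 2), "B": (15, 5)}, horizon=20))
```

The clock never ticks in fixed steps; it jumps from one scheduled event to the next, which is what distinguishes next-event time advance from fixed-increment simulation. A redundancy study would replace the `all(...)` test with the system's actual fault tree logic.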

Cluster Based Fuzzy Model Tree Using Node Information (상호 노드 정보를 이용한 클러스터 기반 퍼지 모델트리)

  • Park, Jin-Il;Lee, Dae-Jong;Kim, Yong-Sam;Cho, Young-Im;Chun, Myung-Geun
    • Journal of the Korean Institute of Intelligent Systems / v.18 no.1 / pp.41-47 / 2008
  • The cluster-based fuzzy model tree has the drawback that performance on testing data decreases when the training data are over-fitted. To reduce this sensitivity to over-fitting, we propose a modified cluster-based fuzzy model tree that uses node information. To construct the model tree, cluster centers are first calculated by a fuzzy clustering method using all input and output attributes. Linear models are then constructed at the internal nodes using the fuzzy membership values between the centers and the input attributes. In the prediction step, membership values are calculated from the fuzzy distance between the input attributes and all centers on the path from the root to the leaf nodes. Finally, prediction is performed by a weighted average of the linear models using the fuzzy membership values. To show the effectiveness of the proposed method, we applied it to various datasets. Across these experiments, the proposed method performs better than the conventional cluster-based fuzzy model tree.
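
The prediction step in the abstract above (membership values, then a weighted average of linear models) can be sketched under stated assumptions. The membership formula used here is the standard fuzzy c-means one with fuzzifier `m`, which may differ from the paper's "fuzzy distance"; the tree path is flattened into a single list of centers; and the function names, centers, and linear models are all hypothetical.

```python
import math


def memberships(x, centers, m=2.0):
    """Fuzzy c-means style memberships of point x to each center.

    Assumption: standard FCM formula with fuzzifier m > 1.
    """
    dists = [math.dist(x, c) for c in centers]
    zeros = [d == 0.0 for d in dists]
    if any(zeros):  # x sits exactly on one or more centers
        return [z / sum(zeros) for z in zeros]
    power = 2.0 / (m - 1.0)
    inv = [d ** -power for d in dists]
    total = sum(inv)
    return [v / total for v in inv]


def predict(x, centers, models, m=2.0):
    """Membership-weighted average of the linear models' outputs.

    Each model is (coefficients, intercept), paired with one center.
    """
    u = memberships(x, centers, m)
    outputs = [sum(w * xi for w, xi in zip(coef, x)) + b
               for coef, b in models]
    # Memberships sum to 1, so the weighted sum is a weighted average.
    return sum(ui * yi for ui, yi in zip(u, outputs))


# Hypothetical one-dimensional example with two centers and two models.
centers = [(0.0,), (4.0,)]
models = [((1.0,), 0.0), ((-1.0,), 8.0)]  # y = x and y = 8 - x

print(memberships((1.0,), centers))       # approximately [0.9, 0.1]
print(predict((1.0,), centers, models))   # approximately 1.6
```

Because every model contributes in proportion to its membership, no single leaf model dominates a prediction, which is one way to read the abstract's claim of reduced sensitivity to over-fitting.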

Classification Method of Congestion Change Type for Efficient Traffic Management (효율적인 교통관리를 위한 혼잡상황변화 유형 분류기법 개발)

  • Shim, Sangwoo;Lee, Hwanpil;Lee, Kyujin;Choi, Keechoo
    • International Journal of Highway Engineering / v.16 no.4 / pp.127-134 / 2014
  • PURPOSES: To operate a more efficient traffic management system, it is of the utmost importance to detect changes in the congestion level on a freeway segment rapidly and reliably. This study aims to develop a classification method for congestion change types. METHODS: This research proposes two classification methods to capture the change in congestion level on freeway segments, using dedicated short-range communication (DSRC) data and vehicle detection system (VDS) data. To develop the classification methods, decision tree models were employed in which the dependent variable is the change in congestion level and the covariates are the DSRC and VDS data collected from freeway segments in Korea. RESULTS: The comparison shows that the decision tree model with DSRC data performs better than the decision tree model with VDS data. Specifically, the better-fitting decision tree model using DSRC data shows approximately 95% accuracy. CONCLUSIONS: The congestion change types classified by the decision tree models are expected to play an important role in future freeway traffic management strategies.