• Title/Summary/Keyword: tree-based models

Search results: 437

Developing a regional fog prediction model using tree-based machine-learning techniques and automated visibility observations (시정계 자료와 기계학습 기법을 이용한 지역 안개예측 모형 개발)

  • Kim, Daeha
    • Journal of Korea Water Resources Association / v.54 no.12 / pp.1255-1263 / 2021
  • Although fog could become an alternative water resource, it can undermine traffic safety and the operational performance of infrastructure. Reducing such adverse impacts requires spatially continuous fog risk information. In this work, tree-based machine-learning models were developed to quantify fog risks from routine meteorological observations alone. Extreme Gradient Boosting (XGB), Light Gradient Boosting (LGB), and Random Forests (RF) were chosen for the regional fog models, using operational weather and visibility observations within Jeollabuk-do province. Results showed that RF appeared to give the most robust performance in distinguishing fog from non-fog situations during the training and evaluation period of 2017-2019. While LGB predicted fog occurrences better than the others, its false alarm ratio was the highest (0.695) among the three models. The predictability of the three models declined considerably when they were applied to an independent period in 2020, potentially due to the distinctly improved air quality that year under the global lockdown. Nonetheless, even in 2020, all three models produced fog risk information consistent with the spatial variation of observed fog occurrences. This work suggests that tree-based machine-learning models could be used as tools to find locations with relatively high fog risks.
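
The false alarm ratio quoted in the abstract above can be illustrated with a minimal sketch. It assumes the standard forecast-verification definitions (FAR = false alarms / (hits + false alarms), POD = hits / (hits + misses)); the abstract does not state which formula the study used. The function name `fog_scores` and the sample labels are hypothetical and are not data from the paper.

```python
def fog_scores(observed, predicted):
    """Return (POD, FAR) for paired binary fog labels (True = fog).

    Assumes the usual contingency-table definitions; this is an
    illustrative sketch, not the study's own evaluation code.
    """
    pairs = list(zip(observed, predicted))
    hits = sum(o and p for o, p in pairs)
    misses = sum(o and not p for o, p in pairs)
    false_alarms = sum(p and not o for o, p in pairs)
    # Guard against empty denominators (no fog observed / none predicted).
    pod = hits / (hits + misses) if hits + misses else float("nan")
    far = (false_alarms / (hits + false_alarms)
           if hits + false_alarms else float("nan"))
    return pod, far


# Made-up labels for illustration only.
observed = [True, True, False, False, False, True, False, False]
predicted = [True, True, True, True, False, False, True, False]
pod, far = fog_scores(observed, predicted)
print(f"POD={pod:.3f} FAR={far:.3f}")
```

A model can detect more fog events (higher POD) while also raising more false alarms (higher FAR), which matches the trade-off the abstract reports for LGB.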

Simplification of Boundary Representation Models Based on Stepwise Volume Decomposition (단계적 볼륨분해에 기반한 경계표현 모델의 단순화)

  • Kim, Byung Chul;Mun, Duhwan
    • Transactions of the Korean Society of Mechanical Engineers A / v.37 no.10 / pp.1305-1313 / 2013
  • In this study, a method to apply feature-based simplification to boundary representation models is proposed. For feature-based simplification, a volume decomposition tree is created from a boundary representation model. The volume decomposition tree is represented by regularized Boolean operations of additive volumes, subtractive volumes, and fillet/round/chamfer volumes, and it is generated by stepwise volume decomposition, which consists of fillet/round/chamfer decomposition, wrap-around decomposition, volume split decomposition, and cell-based decomposition. After the volume decomposition tree is transformed to an infix expression, the CAD model can be simplified by reordering the volumes. To verify the proposed method, a prototype system was implemented, and experiments on test cases were conducted. From the results of the experiments, it is verified that the proposed method is useful for simplifying CAD models based on boundary representation.
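
As a rough illustration of turning a Boolean tree of volumes into an infix expression, the sketch below uses a generic in-order traversal. The abstract does not give the paper's data structures or operator notation, so the classes `Volume` and `BooleanOp`, the `+`/`-` symbols for union and difference, and the volume names are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Volume:
    """Leaf node: a named additive or subtractive volume (hypothetical)."""
    name: str


@dataclass
class BooleanOp:
    """Inner node: a Boolean operation, "+" for union, "-" for difference."""
    op: str
    left: "Node"
    right: "Node"


Node = Union[Volume, BooleanOp]


def to_infix(node):
    """Render the tree as a fully parenthesized infix expression."""
    if isinstance(node, Volume):
        return node.name
    return f"({to_infix(node.left)} {node.op} {to_infix(node.right)})"


# Hypothetical part: a base block plus a boss, with a hole subtracted.
tree = BooleanOp("-", BooleanOp("+", Volume("base"), Volume("boss")), Volume("hole"))
print(to_infix(tree))  # ((base + boss) - hole)
```

Once the model is in this flat form, the volumes appear as an ordered sequence of operands. That is presumably what makes reordering them possible, though how the paper reorders them is not described in the abstract.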

Prediction Models of Conflict and Intimacy in Teacher-Child Relationships: Investigation of Child Variables Based on Decision Tree Analysis (교사-유아 관계의 갈등 및 친밀감에 대한 예측 모형: 의사결정나무분석을 적용한 유아변인의 탐색)

  • Shin, Yoolim
    • Korean Journal of Childcare and Education / v.16 no.5 / pp.69-86 / 2020
  • Objective: The purpose of this research was to examine prediction models of conflict and intimacy in teacher-child relationships based on decision tree analysis. Methods: The participants were 297 preschool children aged three to five, including 166 boys and 131 girls. Teacher-child relationships were measured with the Student-Teacher Relationship Scale (STRS). Physical aggression, relational aggression, social withdrawal, and prosocial behaviors were measured by teacher ratings. In addition, the Attention Deficit Hyperactivity Disorder Rating Scale (ADHD-RS) was used to measure ADHD. The data were analyzed with decision tree analysis. Results: According to the prediction model for teacher-child conflict, the significant predictors were physical aggression and social withdrawal. According to the prediction model for teacher-child intimacy, the significant predictors were prosocial behaviors and relational aggression. However, children's age, gender, and ADHD were not significant predictors. Conclusion/Implications: The findings suggest that social behaviors may be closely related to teacher-child relationships for preschool children. Based on the results of this study, intervention suggestions were made.

LAT System for Fault Tree Generation (PLC로 제어되는 기계에서 Fault Tree를 효과적으로 생성하기 위한 LAT(Ladder Analysis Tool)개발)

  • 김선호;김동훈;김도연;한기상;김주한
    • Proceedings of the Korean Society of Precision Engineering Conference / 1997.10a / pp.442-445 / 1997
  • A challenging activity in the manufacturing industry is to perform, in real time, continuous monitoring of the process state, online assessment of the situation and identification of the problem, and diagnosis of the cause and importance of the problem if the process does not work properly. This paper describes the LAT (Ladder Analysis Tool) system for fault tree generation, which aims to improve the fault diagnosis of CNC machine tools. The system consists of four steps that automatically carry out ladder analysis, from the ladder diagram to two diagnostic function models. The two diagnostic models based on the ladder diagram are the switching function model and the step switching function model. The system tries to overcome the diagnostic deficiencies of present machine tools.

A Study of Data Mining Optimization Model for the Credit Evaluation

  • Kim, Kap-Sik;Lee, Chang-Soon
    • Journal of the Korean Data and Information Science Society / v.14 no.4 / pp.825-836 / 2003
  • Based on customer information and financing processes in the capital market, we derived individual models by applying multi-layered perceptrons, MDA, and a decision tree. Further, the results from the existing single models were compared with the results from an integrated model developed using a genetic algorithm. This study contributes not only to verifying the existing individual models but also to overcoming the limitations of the existing approaches. Previous work has depended on approaches that compare individual models and search for the best-fit model. This study, however, presents a methodology for building an integrated data mining model using a genetic algorithm.

Development of Medical Cost Prediction Model Based on the Machine Learning Algorithm (머신러닝 알고리즘 기반의 의료비 예측 모델 개발)

  • Han Bi KIM;Dong Hoon HAN
    • Journal of Korea Artificial Intelligence Association / v.1 no.1 / pp.11-16 / 2023
  • Accurate hospital case modeling and prediction are crucial for efficient healthcare. In this study, we demonstrate the implementation of regression analysis methods in machine learning systems using mathematical statistics and machine learning techniques. The developed machine learning model includes Bayesian linear, artificial neural network, decision tree, decision forest, and linear regression analysis models. Through the application of these algorithms, corresponding regression models were constructed and analyzed. The results suggest the potential of leveraging machine learning systems for medical research. The experiment aimed to create an Azure Machine Learning Studio tool for the speedy evaluation of multiple regression models. The tool facilitates the comparison of five types of regression models in a unified experiment and presents assessment results with performance metrics. Evaluation of the regression machine learning models highlighted the advantages of boosted decision tree regression and decision forest regression in hospital case prediction. These findings could lay the groundwork for the deliberate development of new directions in medical data processing and decision making. Furthermore, potential avenues for future research may include exploring methods such as clustering, classification, and anomaly detection in healthcare systems.

Default Prediction of Automobile Credit Based on Support Vector Machine

  • Chen, Ying;Zhang, Ruirui
    • Journal of Information Processing Systems / v.17 no.1 / pp.75-88 / 2021
  • The automobile credit business has developed rapidly in recent years, and defaults occur frequently. Credit default brings great losses to automobile financial institutions, so the successful prediction of automobile credit default is of great significance. First, records with missing values are deleted; then a random forest is used for feature selection; and then the sample data are randomly grouped. Finally, six prediction models are constructed: support vector machine (SVM), random forest, k-nearest neighbor (KNN), logistic regression, decision tree, and artificial neural network (ANN). The results show that all six machine learning models can be used to predict automobile credit default. Among the six models, the decision tree has the highest accuracy (0.79), but SVM has the best overall performance. Random grouping can also improve the efficiency of model operation to a certain extent, especially for SVM.

Machine Learning Based Automatic Categorization Model for Text Lines in Invoice Documents

  • Shin, Hyun-Kyung
    • Journal of Korea Multimedia Society / v.13 no.12 / pp.1786-1797 / 2010
  • Automatic understanding of the contents of a document image is a very hard problem because it involves mathematically challenging problems that originate mainly from the over-determined system induced by the document segmentation process. In both academia and industry, there have been continual and varied efforts to improve the core parts of content retrieval technologies by separating out segmentation-related issues using semi-structured documents, e.g., invoices. In this paper, we proposed classification models for text lines in invoice documents, in which text lines were grouped into five categories according to their contents: purchase order header, invoice header, summary header, surcharge header, and purchase items. Our investigation concentrated on the performance of machine-learning-based models in two groups: linear-discriminant-analysis (LDA) and non-LDA (logic based). In the LDA group, naïve Bayesian, k-nearest neighbor, and SVM classifiers were used; in the non-LDA group, decision tree, random forest, and boosting were used. We described the details of feature vector construction and the selection of the model and its parameters, including training and validation. We also presented experimental results comparing the training and classification error levels of the models employed.

Prediction Model for the Risk of Scapular Winging in Young Women Based on the Decision Tree

  • Gwak, Gyeong-tae;Ahn, Sun-hee;Kim, Jun-hee;Weon, Young-soo;Kwon, Oh-yun
    • Physical Therapy Korea / v.27 no.2 / pp.140-148 / 2020
  • Background: Scapular winging (SW) can be caused by tightness or weakness of the periscapular muscles. Although data mining techniques are useful for classifying or predicting the risk of musculoskeletal disorders, predictive models for that risk based on clinical test results or quantitative data are scarce. Objectives: This study aimed to (1) investigate the difference between young women with and without SW, (2) establish a predictive model for the presence of SW, and (3) determine the cutoff value of each variable for predicting the risk of SW using the decision tree method. Methods: Fifty young female subjects participated in this study. To classify the presence of SW as the outcome variable, scapular protractor strength, elbow flexor strength, shoulder internal rotation, and whether the scapula is on the dominant or nondominant side were determined. Results: The classification tree selected scapular protractor strength, shoulder internal rotation range of motion, and whether the scapula is on the dominant or nondominant side as predictor variables. The classification tree model correctly classified 78.79% (p = 0.02) of the training data set. The accuracy obtained by the classification tree on the test data set was 82.35% (p = 0.04). Conclusion: The classification tree showed acceptable accuracy (82.35%) and high specificity (95.65%) but low sensitivity (54.55%). Based on the predictive model in this study, we suggest that 20% of body weight in scapular protractor strength is a meaningful cutoff value for the presence of SW.
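
To make the relationship between accuracy, sensitivity, and specificity concrete, the following sketch computes all three from a 2x2 confusion matrix. The abstract does not publish the actual confusion matrix. The counts below (TP=6, FN=5, TN=22, FP=1) are hypothetical values chosen only because they reproduce the reported percentages, and the function name `classification_metrics` is likewise an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, and specificity from confusion counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # all correct / all cases
        "sensitivity": tp / (tp + fn),      # true positive rate
        "specificity": tn / (tn + fp),      # true negative rate
    }


# Hypothetical counts, NOT taken from the paper: picked so that the
# resulting percentages match those quoted in the abstract.
metrics = classification_metrics(tp=6, fn=5, tn=22, fp=1)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With these counts the classifier misses nearly half of the positive cases while rarely mislabeling a negative one, which is the pattern the abstract describes as high specificity but low sensitivity.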

Analysis Performance Characteristics of Dynamic Signature File Methods

  • Yoo, Jae-Soo;Choi, Kil-Seong;Kim, Myoung-Ho
    • Journal of Electrical Engineering and information Science / v.2 no.4 / pp.37-45 / 1997
  • With the rapid increase of information requirements from various application areas, there has been much research on dynamic information storage structures that effectively support insertions, deletions, and updates. In this paper, we evaluate the performance of the existing dynamic signature file methods, such as the S-tree, Quick Filter, and HS file, and provide guidelines for their most effective usage in a given operational environment. We derive analytic performance evaluation models of the storage structures based on retrieval time, storage overhead, and insertion time. We also perform extensive experiments with various data distributions, such as uniform, normal, and exponential distributions. The relationships among various performance parameters are thoroughly investigated. We show, through performance comparison based on analytic models and experiments, that regardless of data distribution, the HS file significantly improves performance in both retrieval time and storage overhead over the S-tree and Quick Filter.