• Title/Summary/Keyword: Decision-trees


Analysis of Predictive Factors for Suicidal Ideation of Adolescents Using Decision Tree Analysis (의사결정나무 분석을 이용한 청소년의 자살 생각 예측 요인 분석: 2019년 아동·청소년 인권실태조사를 중심으로)

  • Han, Myeunghee
    • Journal of Korean Public Health Nursing / v.36 no.2 / pp.157-169 / 2022
  • Purpose: This study aimed to build a model for predicting the presence or absence of suicidal ideation in adolescents using decision tree analysis. Methods: This study is a secondary data analysis of the 2019 Child and Adolescent Human Rights Survey, the most recent data published by the Korea Youth Policy Institute. To identify the variables predicting suicidal ideation, a decision tree analysis with suicidal ideation as the dependent variable was performed. Results: Life satisfaction, insults from parents, sex, and cyber-bullying experience were selected as significant predictors of suicidal ideation. The model predicted that 58.2% of subjects with low life satisfaction would have suicidal ideation. Among them, the probability increased to 72.7% for those who were unhappy, and for female respondents it increased to 82.9%. Conclusions: The family, school, and social environments need to be considered to prevent suicidal ideation in adolescents.

An Analysis of the Determinants of Government-Funded Defense Companies using a Decision Tree (의사결정나무를 활용한 방산육성지원 수혜기업 결정요인 분석)

  • Gowoon Jeon;Seulah Baek;Jeonghwan Jeon;Donghee Yoo
    • Journal of the Korea Institute of Military Science and Technology / v.27 no.1 / pp.80-93 / 2024
  • This study analyzed the factors that influence whether companies become beneficiaries of the government's defense industry promotion support project. To this end, experimental data were analyzed by building a prediction model with the decision tree model, one of the data mining techniques, from the company attributes that are most important in determining beneficiary companies. In addition, rules for identifying the beneficiary companies of the government's support project were derived from the resulting decision trees. Three policy measures were presented, based on the important rules that appear repeatedly across different predictive models, to increase the effect of the government's industrial development efforts. The analysis methods and the determinants of beneficiary companies presented in this study will help create an environment for sustainable growth of the future defense industry.

A study on decision tree creation using marginally conditional variables (주변조건부 변수를 이용한 의사결정나무모형 생성에 관한 연구)

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.23 no.2 / pp.299-307 / 2012
  • Data mining is a method of searching for interesting relationships among items in a given database. The decision tree is a typical data mining algorithm that classifies or predicts by dividing a group into subgroups. In general, the generated model can become complicated depending on the model creation criteria and the number of input variables. In particular, a decision tree with a large number of input variables can be complex and difficult to analyze. If the input variables include marginally conditional variables (intervening variables, external variables), those variables are not directly relevant to the model. In this study, we suggest a method of creating a decision tree using marginally conditional variables and apply it to actual data to evaluate its efficiency.

A Study on Factors of Education's Outcome using Decision Trees (의사결정트리를 이용한 교육성과 요인에 관한 연구)

  • Kim, Wan-Seop
    • Journal of Engineering Education Research / v.13 no.4 / pp.51-59 / 2010
  • To manage university lectures efficiently and improve educational outcomes, a process is needed that diagnoses the current educational outcome of each class and identifies the factors behind it. Most studies that look for the factors of effective lectures use statistical methods such as association analysis and regression analysis, and decision tree analysis has recently been employed as well. Decision tree analysis has the merits that the resulting model is easy to understand and easy to apply to decision making, but it is not robust to characteristics of the input data such as multicollinearity. This paper points out these weaknesses of decision tree analysis and suggests an experimental solution that uses a multiple decision tree algorithm to compensate for them. The experimental results show that the suggested method is more effective in finding reliable factors of educational outcomes.

Comparative Study of Machine learning Techniques for Spammer Detection in Social Bookmarking Systems (소셜 복마킹 시스템의 스패머 탐지를 위한 기계학습 기술의 성능 비교)

  • Kim, Chan-Ju;Hwang, Kyu-Baek
    • Journal of KIISE:Computing Practices and Letters / v.15 no.5 / pp.345-349 / 2009
  • Social bookmarking systems are a typical web 2.0 service based on folksonomy, providing a platform for storing and sharing bookmarking information. Spammers in social bookmarking systems are users who abuse the system improperly for their own interests. They can make the entire set of resources in a social bookmarking system useless by posting large amounts of wrong information. Hence, it is important to detect spammers as early as possible and protect social bookmarking systems from their attacks. In this paper, we applied a diverse set of machine learning approaches to this task: decision tables, decision trees (ID3), naïve Bayes classifiers, TAN (tree-augmented naïve Bayes) classifiers, and artificial neural networks. In our experiments, naïve Bayes classifiers performed significantly better than the other methods with respect to the AUC (area under the ROC curve) score as well as the model building time. Plausible explanations for this result are as follows. First, naïve Bayes classifiers are known to usually perform better than decision trees in terms of the AUC score. Second, the spammer detection problem in our experiments is likely to be linearly separable.

Calculation of Top Event Probability of Fault Tree using BDD (BDD를 이용한 사고수목 정상사상확률 계산)

  • Cho, Byeong Ho;Yum, Byeoungsoo;Kim, Sangahm
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.3 / pp.654-662 / 2016
  • As the number of gates and basic events in a fault tree increases, it becomes difficult to calculate the exact probability of the top event. To overcome this difficulty, the BDD methodology can be used to calculate the exact top event probability for small and medium-sized fault trees in a short time. Fault trees are converted to BDDs using CUDD library functions, and a failure path search algorithm is proposed to calculate the exact top event probability. The backward search algorithm is more efficient than the forward one in finding failure paths and in calculating the top event probability. This backward search algorithm can reduce the search time needed to identify disjoint failure paths from the BDD and can be considered an effective tool for finding the cut sets and the minimal cut sets of a given fault tree.
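
The idea of computing an exact top event probability can be illustrated with a minimal sketch. This is not the authors' CUDD-based implementation or their backward search algorithm. It only applies Shannon decomposition, the recursion that BDD evaluation builds on, to a hypothetical fault tree. The tree `TREE`, the basic events `A`, `B`, `C`, their probabilities, and the function names are all assumptions made for illustration, and the basic events are assumed to be independent.

```python
# Hypothetical fault tree: TOP = (A AND B) OR C, with independent basic events.
TREE = ("or", ("and", "A", "B"), "C")
PROB = {"A": 0.1, "B": 0.2, "C": 0.05}  # assumed failure probabilities
ORDER = ("A", "B", "C")  # variable ordering, as a BDD would use


def evaluate(node, assignment):
    """Evaluate the fault tree under a full truth assignment of basic events."""
    if isinstance(node, str):
        return assignment[node]
    gate, *children = node
    values = [evaluate(child, assignment) for child in children]
    return all(values) if gate == "and" else any(values)


def top_event_probability(tree, prob, order):
    """Exact P(top event) by Shannon decomposition along the variable order.

    Each level splits on one basic event x:
        P(f) = p_x * P(f | x = 1) + (1 - p_x) * P(f | x = 0)
    Unlike a real BDD, this sketch does not share identical subgraphs,
    so it visits all 2^n assignments and only suits tiny trees.
    """

    def recurse(index, fixed):
        if index == len(order):
            return 1.0 if evaluate(tree, dict(zip(order, fixed))) else 0.0
        p = prob[order[index]]
        high = recurse(index + 1, fixed + (True,))
        low = recurse(index + 1, fixed + (False,))
        return p * high + (1.0 - p) * low

    return recurse(0, ())


print(top_event_probability(TREE, PROB, ORDER))
```

For this hypothetical tree the result can be checked by hand: 1 - (1 - 0.1 × 0.2)(1 - 0.05) = 0.069. A BDD reaches the same value far faster on larger trees because equivalent sub-functions are stored once.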

A Study on Classification of Crown Classes and Selection of Thinned Trees for Major Conifers Using Machine Learning Techniques (머신러닝 기법을 활용한 주요 침엽수종의 수관급 분류와 간벌목 선정 연구)

  • Lee, Yong-Kyu;Lee, Jung-Soo;Park, Jin-Woo
    • Journal of Korean Society of Forest Science / v.111 no.2 / pp.302-310 / 2022
  • Here we aimed to classify the major coniferous tree species (Pinus densiflora, Pinus koraiensis, and Larix kaempferi) using tree measurement information and machine learning algorithms in order to establish an efficient forest management plan. We used national forest monitoring information amassed over nine years as the tree measurement information, and random forest (RF), XGBoost (XGB), and LightGBM (LGBM) as the machine learning algorithms. We compared the algorithms using accuracy, precision, recall, and F1 score. The RF algorithm had the highest performance evaluation score for all tree species, with the highest scores for Pinus densiflora: an accuracy of about 65%, a precision of about 72%, a recall of about 60%, and an F1 score of about 66%. Among the crown classes, the classification accuracy for the dominant trees was higher than about 80%, but that for the co-dominant, intermediate, and overtopped trees was evaluated as low. We consider that the results of this study can be used as reference data for decision-making in the selection of trees to thin for forest management.
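
The four evaluation measures named in this abstract can be written out as a short sketch. The function name and the confusion-matrix counts below are hypothetical and are not taken from the study. The study classifies several crown classes, so a binary computation like this one would presumably be applied per class (one-vs-rest), which is an assumption here.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for one class treated as the positive class.
metrics = classification_metrics(tp=60, fp=20, fn=40, tn=80)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because F1 is a harmonic mean, it always lies between precision and recall and sits closer to the lower of the two.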

Research on Optimization Strategies for Random Forest Algorithms in Federated Learning Environments (연합 학습 환경에서의 랜덤 포레스트 알고리즘 최적화 전략 연구)

  • InSeo Song;KangYoon Lee
    • The Journal of Bigdata / v.9 no.1 / pp.101-113 / 2024
  • Federated learning has garnered attention as an efficient method for training machine learning models in a distributed environment while maintaining data privacy and security. This study proposes a novel FedRFBagging algorithm to optimize the performance of random forest models in such federated learning environments. By dynamically adjusting the trees of local random forest models based on client-specific data characteristics, the proposed approach reduces communication costs and achieves high prediction accuracy even in environments with numerous clients. This method adapts to various data conditions, significantly enhancing model stability and training speed. While random forest models consist of multiple decision trees, transmitting all trees to the server in a federated learning environment results in exponentially increasing communication overhead, making their use impractical. Additionally, differences in data distribution among clients can lead to quality imbalances in the trees. To address this, the FedRFBagging algorithm selects only the highest-performing trees from each client for transmission to the server, which then reselects trees based on impurity values to construct the optimal global model. This reduces communication overhead and maintains high prediction performance across diverse data distributions. Although the global model reflects data from various clients, the data characteristics of each client may differ. To compensate for this, clients further train additional trees on the global model to perform local optimizations tailored to their data. This improves the overall model's prediction accuracy and adapts to changing data distributions. Our study demonstrates that the FedRFBagging algorithm effectively addresses the communication cost and performance issues associated with random forest models in federated learning environments, suggesting its applicability in such settings.
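
The two-stage selection described in this abstract can be sketched as follows. This is an illustration only, not the FedRFBagging implementation. The `TreeSummary` record, the use of a local validation score to rank trees on the client, the assumption that lower impurity is preferred on the server, and the parameters `k` and `m` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TreeSummary:
    client: str
    tree_id: int
    score: float     # hypothetical local validation accuracy
    impurity: float  # hypothetical impurity value (lower assumed better)


def select_local(trees, k):
    """Client side: keep only the k highest-scoring trees for upload."""
    return sorted(trees, key=lambda t: t.score, reverse=True)[:k]


def select_global(candidates, m):
    """Server side: reselect the m trees with the lowest impurity."""
    return sorted(candidates, key=lambda t: t.impurity)[:m]


def build_global_model(clients, k, m):
    """Collect each client's top-k trees, then reselect m of them on the server."""
    uploaded = [t for trees in clients.values() for t in select_local(trees, k)]
    return select_global(uploaded, m)


# Hypothetical clients, each holding three locally trained trees.
CLIENTS = {
    "a": [TreeSummary("a", 0, 0.90, 0.30), TreeSummary("a", 1, 0.70, 0.10),
          TreeSummary("a", 2, 0.80, 0.20)],
    "b": [TreeSummary("b", 0, 0.60, 0.05), TreeSummary("b", 1, 0.95, 0.25),
          TreeSummary("b", 2, 0.85, 0.15)],
}

global_model = build_global_model(CLIENTS, k=2, m=3)
print([(t.client, t.tree_id) for t in global_model])
```

In this sketch each client uploads `k` trees rather than its whole forest, so the upload size grows with `k` times the number of clients, which is the communication saving the abstract refers to.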

Directed Association Rules Mining and Classification (목표 속성을 고려한 연관규칙과 분류 기법)

  • 한경록;김재련
    • Journal of Korean Society of Industrial and Systems Engineering / v.24 no.63 / pp.23-31 / 2001
  • Data mining can be either directed or undirected. One way of thinking about it is that we use undirected data mining to recognize relationships in the data and directed data mining to explain those relationships once they have been found. Several data mining techniques have received considerable research attention. In this paper, we propose an algorithm that discovers association rules as directed data mining and applies them to classification. In the first phase, we find frequent closed itemsets and association rules. After this phase, we construct decision trees using the discovered association rules. The algorithm is applicable to customer relationship management.

Unsupervised segmentation of Multi-Source Remotely Sensed images using Binary Decision Trees and Canonical Transform

  • Mohammad, Rahmati;Kim, Jung-Ha
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 2001.10a / pp.23.4-23 / 2001
  • This paper proposes a new approach to unsupervised classification of remotely sensed images. Fusion of optical images (Landsat TM) and radar data (SAR) has been used to increase the accuracy of classification. The number of clusters is estimated using the generalized Dunn's measure. The performance of the proposed method is best observed by comparing the classified images with classified aerial images.
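
How a Dunn-type measure can guide the choice of the number of clusters is sketched below. The abstract uses a generalized Dunn's measure whose exact form is not given here, so this sketch swaps in the classic Dunn index (smallest between-cluster distance divided by largest cluster diameter). The points, the candidate partitions, and the function name are hypothetical.

```python
import math
from itertools import combinations


def dunn_index(clusters):
    """Classic Dunn index: min inter-cluster distance / max cluster diameter.

    `clusters` is a list of clusters, each a list of coordinate tuples.
    Higher values indicate compact, well-separated clusters.
    """
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    min_separation = min(
        math.dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    max_diameter = max(
        (math.dist(p, q) for c in clusters for p, q in combinations(c, 2)),
        default=0.0,
    )
    if max_diameter == 0.0:
        raise ValueError("every cluster is a single point; index is undefined")
    return min_separation / max_diameter


# Hypothetical candidate partitions of the same four points, keyed by cluster count.
PARTITIONS = {
    2: [[(0, 0), (0, 1)], [(3, 0), (3, 1)]],
    3: [[(0, 0), (0, 1)], [(3, 0)], [(3, 1)]],
}

# Pick the cluster count whose partition scores highest.
best_k = max(PARTITIONS, key=lambda k: dunn_index(PARTITIONS[k]))
print(best_k, dunn_index(PARTITIONS[best_k]))
```

Generalized variants replace the single-linkage separation and the diameter with other set distances, which makes the measure less sensitive to outliers than the classic form used here.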
