• Title/Summary/Keyword: CLASSIFICATION TREE MODEL


Improved Decision Tree Classification (IDT) Algorithm For Social Media Data

  • Anu Sharma;M.K Sharma;R.K Dwivedi
    • International Journal of Computer Science & Network Security
    • /
    • v.24 no.6
    • /
    • pp.83-88
    • /
    • 2024
  • In this paper, we apply classification algorithms to social networking data. We propose a new classification algorithm called the Improved Decision Tree (IDT). Our model provides better classification accuracy than existing systems for classifying social network data. We compare the accuracy of several familiar classification algorithms with that of our proposed algorithm. We used Support Vector Machines, Naïve Bayes, k-Nearest Neighbors, and a decision tree in our research and performed analyses on a social media dataset. Matlab was used to perform the experiments. The results show that the proposed algorithm achieves the best results, with an accuracy of 84.66%.

Prediction Model for the Risk of Scapular Winging in Young Women Based on the Decision Tree

  • Gwak, Gyeong-tae;Ahn, Sun-hee;Kim, Jun-hee;Weon, Young-soo;Kwon, Oh-yun
    • Physical Therapy Korea
    • /
    • v.27 no.2
    • /
    • pp.140-148
    • /
    • 2020
  • Background: Scapular winging (SW) can be caused by tightness or weakness of the periscapular muscles. Although data mining techniques are useful for classifying or predicting the risk of musculoskeletal disorders, predictive models that use the results of clinical tests or quantitative data are scarce. Objectives: This study aimed to (1) investigate the differences between young women with and without SW, (2) establish a predictive model for the presence of SW, and (3) determine the cutoff value of each variable for predicting the risk of SW using the decision tree method. Methods: Fifty young female subjects participated in this study. To classify the presence of SW as the outcome variable, scapular protractor strength, elbow flexor strength, shoulder internal rotation range of motion, and whether the scapula is on the dominant or nondominant side were determined. Results: The classification tree selected scapular protractor strength, shoulder internal rotation range of motion, and whether the scapula is on the dominant or nondominant side as predictor variables. The classification tree model correctly classified 78.79% (p = 0.02) of the training data set. The accuracy obtained by the classification tree on the test data set was 82.35% (p = 0.04). Conclusion: The classification tree showed acceptable accuracy (82.35%) and high specificity (95.65%) but low sensitivity (54.55%). Based on the predictive model in this study, we suggest that a scapular protractor strength of 20% of body weight is a meaningful cutoff value for the presence of SW.
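
The accuracy, sensitivity, and specificity figures in the abstract above are all ratios taken from a confusion matrix. The following is a minimal sketch of those definitions. The function name is hypothetical, and the counts are illustrative values chosen only because their ratios match the reported percentages; they are not the study's actual confusion matrix, which the abstract does not give.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),      # share of true positives that were detected
        "specificity": tn / (tn + fp),      # share of true negatives that were detected
    }


# Illustrative counts (an assumption, not data from the paper).
metrics = classification_metrics(tp=6, fn=5, tn=22, fp=1)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With these counts the model misses 5 of 11 positive cases, which is how a classifier can look acceptable on accuracy while still having low sensitivity.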

A Study on the Deep Learning-based Tree Species Classification by using High-resolution Orthophoto Images (고해상도 정사영상을 이용한 딥러닝 기반의 산림수종 분류에 관한 연구)

  • JANG, Kwangmin
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.24 no.3
    • /
    • pp.1-9
    • /
    • 2021
  • In this study, we evaluated the accuracy of a deep learning-based tree species classification model trained on high-resolution images. We selected five species classes for classification: pine, birch, larch, Korean pine, and Mongolian oak. We created 5,000 datasets using high-resolution orthophotos and a forest type map. A CNN deep learning model was used for tree species classification. We divided the datasets into training, verification, and test data at a 5:3:2 ratio and used them to train and evaluate the model. The overall accuracy of the model was 89%. The per-species accuracies were 95% for pine, 89% for birch, 80% for larch, 86% for Korean pine, and 98% for Mongolian oak.
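
The 5:3:2 division described above can be sketched as follows. This is an illustration only: the function name, the shuffling step, and the fixed seed are assumptions, and the abstract does not say whether the split was random or stratified by species.

```python
import random


def split_5_3_2(items, seed=0):
    """Shuffle items and split them into training, verification, and test sets (5:3:2)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is reproducible
    n = len(shuffled)
    n_train = n * 5 // 10
    n_val = n * 3 // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # the remainder goes to the test set
    return train, val, test


# Integer IDs stand in for the 5,000 image datasets.
train, val, test = split_5_3_2(range(5000))
print(len(train), len(val), len(test))  # 2500 1500 1000
```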

Feature Based Decision Tree Model for Fault Detection and Classification of Semiconductor Process (반도체 공정의 이상 탐지와 분류를 위한 특징 기반 의사결정 트리)

  • Son, Ji-Hun;Ko, Jong-Myoung;Kim, Chang-Ouk
    • IE interfaces
    • /
    • v.22 no.2
    • /
    • pp.126-134
    • /
    • 2009
  • As product quality and yield are essential factors in semiconductor manufacturing, monitoring the main manufacturing steps is a critical task. For this purpose, fault detection and classification (FDC) is used to diagnose fault states in the processes by monitoring the data streams collected by equipment sensors. This paper proposes an FDC model based on a decision tree, which provides if-then classification rules for causal analysis of the processing results. Unlike previous decision tree approaches, we reflect the structural aspect of the data stream in FDC. To do this, we segment the data stream into multiple subregions, define structural features for each subregion, and select the features that have high relevance to the process results and low redundancy with other features. As a result, we can construct a simple but highly accurate FDC model. Experiments using a data stream collected from an etching process show that the proposed method classifies normal and abnormal states with high accuracy.
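
The segmentation and feature-definition steps described above might look like the sketch below. Everything specific here is an assumption made for illustration: the abstract does not say how subregions are delimited or which structural features are used, so the sketch uses equal-length subregions and three simple features (mean, range, slope). The relevance and redundancy based feature selection step is not shown.

```python
from statistics import mean


def segment(stream, n_regions):
    """Split a sensor trace into n_regions contiguous, near-equal subregions."""
    size, extra = divmod(len(stream), n_regions)
    regions, start = [], 0
    for i in range(n_regions):
        end = start + size + (1 if i < extra else 0)  # spread any leftover samples
        regions.append(stream[start:end])
        start = end
    return regions


def structural_features(region):
    """Hypothetical structural features describing the shape of one subregion."""
    return {
        "mean": mean(region),
        "range": max(region) - min(region),
        "slope": (region[-1] - region[0]) / max(len(region) - 1, 1),
    }


# A made-up trace with a ramp-up, a plateau, and a ramp-down.
trace = [0.0, 0.5, 1.0, 1.0, 1.0, 1.1, 0.6, 0.2, 0.0]
features = [structural_features(region) for region in segment(trace, 3)]
for index, feature in enumerate(features):
    print(index, feature)
```

Each subregion contributes its own feature values, so a tree rule can refer to, for example, the slope of the first subregion rather than to raw sensor samples.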

Adopting and Implementation of Decision Tree Classification Method for Image Interpolation (이미지 보간을 위한 의사결정나무 분류 기법의 적용 및 구현)

  • Kim, Donghyung
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.16 no.1
    • /
    • pp.55-65
    • /
    • 2020
  • With the development of display hardware, image interpolation techniques have been used in various fields such as image zooming and medical imaging. Traditional image interpolation methods, such as bi-linear interpolation, bi-cubic interpolation, and edge direction-based interpolation, perform interpolation in the spatial domain. Recently, interpolation techniques in the discrete cosine transform or wavelet domain have also been proposed. Using these existing interpolation methods and machine learning, we propose decision tree classification-based image interpolation methods. In other words, this paper is about adaptively applying various existing interpolation methods, not about a new interpolation method itself. To obtain the decision model, we used Weka's J48 library with the C4.5 decision tree algorithm. The proposed method first constructs an attribute set and selects the classes, each of which corresponds to an interpolation method, for the classification model. After training, interpolation is performed with different interpolation methods according to the attribute characteristics. Simulation results show that the proposed method yields reasonable performance.
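
The idea of choosing an interpolation method per location from local attributes can be illustrated with a toy one-dimensional example. This is only a sketch under stated assumptions: the hand-written threshold rule in `choose_method` stands in for a trained J48 tree, the single attribute (absolute difference between neighbors), the threshold value, and all function names are hypothetical, and the paper's actual attributes and candidate methods are not given in the abstract.

```python
def nearest(left, right, t):
    """Nearest-neighbor interpolation between two samples at position t in [0, 1]."""
    return left if t < 0.5 else right


def linear(left, right, t):
    """Linear interpolation between two samples at position t in [0, 1]."""
    return left + (right - left) * t


def choose_method(left, right, edge_threshold=50):
    """Stand-in for a learned decision rule: avoid blurring across strong edges."""
    return nearest if abs(right - left) > edge_threshold else linear


def upsample_2x(row):
    """Insert one interpolated sample between each pair of neighbors."""
    out = []
    for left, right in zip(row, row[1:]):
        method = choose_method(left, right)
        out.extend([left, method(left, right, 0.5)])
    out.append(row[-1])
    return out


result = upsample_2x([10, 20, 200, 210])
print(result)  # [10, 15.0, 20, 200, 200, 205.0, 210]
```

The smooth pairs are filled by linear interpolation, while the jump from 20 to 200 is filled by nearest-neighbor interpolation so the edge stays sharp.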

Selection of Important Variables in the Classification Model for Successful Flight Training (조종사 비행훈련 성패예측모형 구축을 위한 중요변수 선정)

  • Lee, Sang-Heon;Lee, Sun-Doo
    • IE interfaces
    • /
    • v.20 no.1
    • /
    • pp.41-48
    • /
    • 2007
  • The main purpose of this paper is to reduce unreasonable pilot training expenses and to prevent human-caused accidents that originate in the pilot selection process. We use classification models such as logistic regression, decision tree, and neural network based on the aptitude test results of 505 ROK Air Force applicants from 2001 to 2004. First, we assess the reliability and appropriateness of the improved aptitude test system. On this basis, the conference flight simulator test item was compared with the new aptitude test item in order to make an additional yes-or-no decision from the different models in terms of classification accuracy, ROC, and response threshold. The decision tree was selected as the most efficient model for each sequential flight training result, and it predicted the final flight training result well. Therefore, we propose that the decision tree be adopted as the standard for pilot selection and that it be applied to the new aptitude test item, the conference flight simulator test.

A Development of Suicidal Ideation Prediction Model and Decision Rules for the Elderly: Decision Tree Approach (의사결정나무 기법을 이용한 노인들의 자살생각 예측모형 및 의사결정 규칙 개발)

  • Kim, Deok Hyun;Yoo, Dong Hee;Jeong, Dae Yul
    • The Journal of Information Systems
    • /
    • v.28 no.3
    • /
    • pp.249-276
    • /
    • 2019
  • Purpose: The purpose of this study is to develop a prediction model and decision rules for suicidal ideation among the elderly based on Korean Welfare Panel survey data. Using these data, we obtained many decision rules for predicting suicidal ideation among the elderly. Design/methodology/approach: This study used classification analysis based on the decision tree technique to derive predictive decision rules. Weka 3.8 was used as the data mining tool. The decision tree algorithm was J48, Weka's implementation of C4.5. The data were split so that 66.6% served as learning data and the remainder as verification data. We considered all possible variables identified in previous studies on predicting suicidal ideation among the elderly. In total, 99 variables, including the target variable, were used. Classification analysis was performed with backward elimination and with a sampling technique for data balancing. Findings: There were significant differences between the data sets. The selected data sets yielded different decision trees and several rules. Based on the decision tree method, we derived rules for suicide prevention. The decision tree derives rules for suicidal ideation not only in the depressed group but also in the non-depressed group. In addition, in developing the predictive model, the over-fitting problem caused by data imbalance was identified directly through the application of data balancing. We conclude that the data must be balanced on the target variable in order to perform correct classification analysis without over-fitting. Moreover, even when data balancing is applied, the prediction rate is not inferior to that of a biased prediction model.
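
Data balancing on the target variable, as discussed above, can be done in several ways. The sketch below shows random undersampling of the majority class, which is one common option and an assumption here: the abstract does not state which balancing technique the authors used. The function name and the toy rows are hypothetical.

```python
import random
from collections import Counter, defaultdict


def undersample(rows, label_of, seed=0):
    """Randomly drop rows so that every class keeps as many rows as the rarest class."""
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    n_min = min(len(group) for group in by_class.values())
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, n_min))
    rng.shuffle(balanced)
    return balanced


# Toy imbalanced data: 10 "yes" rows and 90 "no" rows.
rows = [("yes", i) for i in range(10)] + [("no", i) for i in range(90)]
balanced = undersample(rows, label_of=lambda row: row[0])
counts = Counter(label for label, _ in balanced)
print(counts["yes"], counts["no"])  # 10 10
```

On the unbalanced toy data, a model that always predicts "no" would already be 90% accurate, which is the kind of biased result that balancing is meant to expose.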

Evaluation Method of College English Education Effect Based on Improved Decision Tree Algorithm

  • Dou, Fang
    • Journal of Information Processing Systems
    • /
    • v.18 no.4
    • /
    • pp.500-509
    • /
    • 2022
  • With the rapid development of educational informatization, teaching methods have become diversified, but the large volume of information data restricts the evaluation of teaching subjects and objects in terms of the effect of English education. Therefore, this study adopts the concept of incremental learning and an eigenvalue interval algorithm to improve the weighted decision tree, and builds an English education effect evaluation model based on association rules. According to the results, the average information classification accuracy of the improved decision tree algorithm is 96.18%, the classification error rate can be as low as 0.02%, and the anti-fitting performance is good. The difference in classification error rate between the improved decision tree algorithm and the original decision tree does not exceed 1%. The proposed educational evaluation method can effectively provide early warning through academic situation analysis, help teachers improve their professional skills more quickly, and improve the education system.

Decision Tree Learning Algorithms for Learning Model Classification in the Vocabulary Recognition System (어휘 인식 시스템에서 학습 모델 분류를 위한 결정 트리 학습 알고리즘)

  • Oh, Sang-Yeob
    • Journal of Digital Convergence
    • /
    • v.11 no.9
    • /
    • pp.153-158
    • /
    • 2013
  • When the target learning model is not recognized or is not clearly classified into its category, vocabulary recognition performance declines. In addition, when the form of a classified learning model changes or a new learning model is added, the structure of the recognition decision tree has to change, which is a structural problem. To solve these problems, a decision tree learning algorithm for classifying learning models is proposed. A database that sufficiently reflects phonological phenomena was configured, and a decision tree learning method was used to classify the learning models. In experiments in an indoor environment, vocabulary-dependent recognition performance was 98.3% and vocabulary-independent recognition performance was 98.4%.

Classification Tree Analysis to Assess Contributing Factors Influencing Biosecurity Level on Farrow-to-Finish Pig Farms in Korea (분류 트리 기법을 이용한 국내 일괄사육 양돈장의 차단방역 수준에 영향을 미치는 기여 요인 평가)

  • Kim, Kyu-Wook;Pak, Son-Il
    • Journal of Veterinary Clinics
    • /
    • v.33 no.2
    • /
    • pp.107-112
    • /
    • 2016
  • The objective of this study was to determine potential contributing factors associated with the biosecurity level of farrow-to-finish pig farms and to develop a classification tree model to explore how these factors relate to each other in a prediction model. To this end, the author analyzed data (n = 193) extracted from a cross-sectional study of 344 farrow-to-finish farms, conducted between March and September 2014, that aimed to explore swine disease status at the farm level. Standardized questionnaires covering basic demographic data and management practices were collected at each farm during on-site visits by trained veterinarians. To classify the data sets with biosecurity level as the dependent variable, the Chi-squared Automatic Interaction Detection (CHAID) algorithm was applied to build the classification tree. The misclassification risk statistic was used to evaluate the fitness of the model in terms of prediction results. Categorical multivariate input data (40 variables) were used to construct the classification tree, and the target variable was biosecurity level dichotomized into low versus high. In general, the level of biosecurity was low in the majority of farms studied, mainly due to the limited implementation of basic on-farm biosecurity measures aimed at controlling the potential introduction and transmission of swine diseases. The CHAID model illustrated the relative importance of the significant predictors in explaining the level of biosecurity: maintenance of medical records of treatment and vaccination, use of dedicated clothing to enter the farm, installation of a fence surrounding the farm perimeter, and periodic monitoring of the herd using a written biosecurity plan. The misclassification risk estimate of the prediction model was 0.145 with a standard error of 0.025, indicating that 85.5% of the cases could be classified correctly by using the decision rule based on the current tree. Although the CHAID approach could provide detailed information and insight about interactions among factors associated with biosecurity level, future studies should further evaluate potential bias introduced during data collection. In addition, there is still a need to validate the findings on an external dataset with a larger sample size to improve the external validity of the current model.
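
The step from the misclassification risk estimate to the share of correctly classified cases is simple arithmetic, sketched below. The standard error formula used here is the usual binomial one and is an assumption about how the reported value was obtained; the abstract does not state the formula. The function name is hypothetical.

```python
import math


def risk_summary(risk, n):
    """Return the correctly classified share and a binomial standard error of the risk."""
    accuracy = 1.0 - risk                            # share of cases classified correctly
    std_error = math.sqrt(risk * (1.0 - risk) / n)   # assumed binomial formula
    return accuracy, std_error


accuracy, std_error = risk_summary(risk=0.145, n=193)
print(f"accuracy = {accuracy:.1%}, standard error = {std_error:.3f}")
```

With a risk of 0.145 and n = 193, this gives 85.5% correctly classified and a standard error of about 0.025, in line with the figures reported in the abstract.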