• Title/Summary/Keyword: Decision tree method


Implementation of Fatigue Identification System using C4.5 Algorithm (C4.5 알고리즘을 이용한 피로도 식별 시스템 구현)

  • Jin, You Zhen;Lee, Deok-Jin
    • Journal of the Korea Convergence Society / v.10 no.8 / pp.21-26 / 2019
  • This paper proposes a fatigue recognition method using the C4.5 algorithm. Drawing on domestic and international studies of fatigue evaluation, we developed a fatigue self-assessment scale adapted to the lifestyle and cultural characteristics of Chinese people. The scale comprises 58 sub-items used to assess the type and extent of fatigue. These items fall into four categories: physical fatigue, mental fatigue, personal habits, and fatigue outcomes. The purpose of this study is to analyze the leading causes of fatigue and to recognize the degree of fatigue, thereby increasing individuals' attention to fatigue and reducing the risk of cerebrovascular disease due to excessive fatigue. The fatigue recognition system using the C4.5 algorithm achieved an average recognition rate of 85%, confirming the usefulness of the proposed method.
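
C4.5 chooses split attributes by gain ratio, that is, information gain divided by split information. The following is a minimal sketch of that criterion only, not the paper's system; the attribute names (`sleep`, `exercise`, `fatigue`) and the toy rows are hypothetical and do not come from the 58-item scale.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of splitting `rows` (a list of dicts) on a categorical attribute."""
    total = len(rows)
    # Group the target labels by the value of the candidate attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Information gain: entropy before the split minus weighted entropy after it.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy([row[target] for row in rows]) - remainder
    # Split information penalizes attributes that split into many small groups.
    split_info = -sum(len(g) / total * math.log2(len(g) / total) for g in groups.values())
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data, for illustration only.
rows = [
    {"sleep": "short", "exercise": "no", "fatigue": "high"},
    {"sleep": "short", "exercise": "yes", "fatigue": "high"},
    {"sleep": "long", "exercise": "no", "fatigue": "low"},
    {"sleep": "long", "exercise": "yes", "fatigue": "low"},
]
best = max(["sleep", "exercise"], key=lambda a: gain_ratio(rows, a, "fatigue"))
```

In this toy data `sleep` separates the classes perfectly while `exercise` carries no information, so `sleep` is selected. A full C4.5 implementation would also handle continuous attributes, missing values, and pruning, which are omitted here.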

A Study on Detection of Small Size Malicious Code using Data Mining Method (데이터 마이닝 기법을 이용한 소규모 악성코드 탐지에 관한 연구)

  • Lee, Taek-Hyun;Kook, Kwang-Ho
    • Convergence Security Journal / v.19 no.1 / pp.11-17 / 2019
  • Recently, the abuse of Internet technology has caused economic and psychological harm to society as a whole. In particular, newly created or modified malicious code bypasses existing information protection systems and serves as a basic means for various application hacking attacks and cyber security threats. However, research on small executable files, which account for a large portion of actual malicious code, is rather limited. In this paper, we propose a model that uses data mining techniques to analyze the characteristics of known small executable files and applies them to detecting unknown malicious code. Several data mining techniques were applied, including naive Bayes, SVM, decision tree, random forest, and artificial neural network, and their accuracy was compared according to the detection level reported by VirusTotal. As a result, a classification accuracy of more than 80% was verified on 34,646 analyzed files.

Extracting characteristics of underachievers learning using artificial intelligence and researching a prediction model (인공지능을 이용한 학습부진 특성 추출 및 예측 모델 연구)

  • Yang, Ja-Young;Moon, Kyong-Hi;Park, Seong-Ho
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.4 / pp.510-518 / 2022
  • The diagnostic evaluation conducted at the national level is very important for detecting underachievers in school early. This study used an artificial intelligence method to find the characteristics of underachievers that affect the learning development of middle school students. An artificial intelligence model was constructed that takes data from the first year of middle school in 2019 as input and was analyzed against the 2020 Busan Education Longitudinal Data. A predictive model for basic middle school Korean, English, and mathematics achievement was developed with machine learning algorithms, and its accuracy in predicting the next school year was confirmed to be 78%, 82%, and 83%, respectively. In addition, we drew an achievement prediction decision tree for each middle school subject to analyze the prediction process. Finally, we examined which characteristics affect achievement prediction.

A Comparative Study on Game-Score Prediction Models Using Computational Thinking Education Game Data (컴퓨팅 사고 교육 게임 데이터를 사용한 게임 점수 예측 모델 성능 비교 연구)

  • Yang, Yeongwook
    • KIPS Transactions on Software and Data Engineering / v.10 no.11 / pp.529-534 / 2021
  • Computational thinking is regarded as one of the important skills required in the 21st century, and many countries have introduced and implemented computational thinking training courses. Among computational thinking education methods, educational game-based methods increase student participation and motivation and improve access to computational thinking. Autothinking is an educational game developed to provide computational thinking education to learners. It is an adaptive system that dynamically provides feedback to learners and automatically adjusts the difficulty according to the learner's computational thinking ability. However, because the game was designed around rules, it cannot intelligently consider learners' computational thinking or give feedback. In this study, game data collected through Autothinking is introduced and used to predict game scores that reflect computational thinking, in order to increase the adaptability of the game. To this end, a comparative study was conducted on linear regression, decision tree, random forest, and support vector machine algorithms, which are commonly used in regression problems. The linear regression method showed the best performance in predicting game scores.

An effective automated ontology construction based on the agriculture domain

  • Deepa, Rajendran;Vigneshwari, Srinivasan
    • ETRI Journal / v.44 no.4 / pp.573-587 / 2022
  • The agricultural sector differs from other sectors because it relies entirely on various natural and climatic factors. Climate change has many effects on land and agriculture, including lack of annual rainfall, pests, heat waves, changes in sea level, and fluctuations in global ozone and atmospheric CO2. Climate change also affects the environment. Based on these factors, farmers choose their crops to increase productivity in their fields. Many existing agricultural ontologies are either domain-specific or have been created with minimal vocabulary, and no proper evaluation framework has been implemented. A new agricultural ontology focused on subdomains is designed to assist farmers using the Jaccard relative extractor (JRE) and the Naïve Bayes algorithm. The JRE is used to find the similarity between two sentences and words in the agricultural documents, and the relationship between two terms is identified via the Naïve Bayes algorithm. In the proposed method, data are preprocessed through natural language processing techniques, and the dimension-reduced tags are subjected to rule-based formal concept analysis and mapping. The subdomain ontologies of weather, pest, and soil are built separately, and the overall agricultural ontology is built around them. The gold standard for the lexical layer is used to evaluate the proposed technique, and its performance is analyzed by comparing it with different state-of-the-art systems. The performance metrics are precision, recall, F-measure, Matthews correlation coefficient, receiver operating characteristic curve area, and precision-recall curve area. The proposed methodology achieves a precision of 94.40%, compared with 83.94% for the decision tree and 86.89% for the K-nearest neighbor algorithm, for agricultural ontology construction.
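
The abstract does not specify how the JRE computes similarity. As a rough illustration of the underlying idea, the sketch below computes the standard Jaccard index between two sentences treated as sets of lowercase, whitespace-separated tokens. The function name, the tokenization, and the example sentences are assumptions made for illustration, not the authors' extractor.

```python
def jaccard_similarity(sentence_a, sentence_b):
    """Jaccard index |A ∩ B| / |A ∪ B| over lowercase whitespace tokens."""
    tokens_a = set(sentence_a.lower().split())
    tokens_b = set(sentence_b.lower().split())
    union = tokens_a | tokens_b
    if not union:
        # Two empty sentences: define the similarity as 0 to avoid dividing by zero.
        return 0.0
    return len(tokens_a & tokens_b) / len(union)


# Hypothetical example sentences: 2 shared tokens out of 6 distinct tokens.
score = jaccard_similarity(
    "Rice needs heavy rainfall",
    "Wheat needs moderate rainfall",
)
```

A production extractor would likely add stemming, stop-word removal, and punctuation handling before comparing token sets.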

A Prediction-Based Data Read Ahead Policy using Decision Tree for improving the performance of NAND flash memory based storage devices (낸드 플래시 메모리 기반 저장 장치의 성능 향상을 위해 결정트리를 이용한 예측 기반 데이터 미리 읽기 정책)

  • Lee, Hyun-Seob
    • Journal of Internet of Things and Convergence / v.8 no.4 / pp.9-15 / 2022
  • NAND flash memory is used as a medium for various storage devices due to its high data processing speed and low power consumption. However, since its read speed is about 10 times faster than its write speed, various studies are being conducted to reduce the speed difference. In particular, flash-dedicated buffer management policies have been studied to improve write speed. However, SSDs (solid state disks), which have recently been used for various purposes, are more vulnerable in terms of read performance than write performance. In this paper, we find out why read performance is slower than write performance in SSDs composed of NAND flash memory and study buffer management policies to improve it. The proposed buffer management policy improves the speed of a flash-based storage device by analyzing the pattern of read data and pre-reading from NAND flash memory the data expected to be requested in the future. The effectiveness of the read-ahead policy is demonstrated through simulation.

Detection of Red Pepper Powders Origin based on Machine Learning (머신러닝 기반 고춧가루 원산지 판별기법)

  • Ryu, Sungmin;Park, Minseo
    • The Journal of the Convergence on Culture Technology / v.8 no.4 / pp.355-360 / 2022
  • As the cost of domestic red pepper rises and imports of red pepper increase, cases of damage such as false labeling of the origin of red pepper powder are occurring. Accordingly, the origin of red pepper powder needs to be determined quickly and accurately. The method currently used to determine the origin is limited in that it requires considerable cost and time, because it experimentally compares and analyzes the components of red pepper powder. To resolve these issues, this study proposes a machine learning algorithm to classify domestic and imported red pepper powder. We built and validated a machine learning model using 53 components contained in red pepper powder. The proposed model made it possible to identify which components are important in determining the origin. In the near future, the cost of determining origin is expected to be further reduced by extending the approach to various foods beyond red pepper powder.

Detection of Depression Trends in Literary Cyber Writers Using Sentiment Analysis and Machine Learning

  • Faiza Nasir;Haseeb Ahmad;CM Nadeem Faisal;Qaisar Abbas;Mubarak Albathan;Ayyaz Hussain
    • International Journal of Computer Science & Network Security / v.23 no.3 / pp.67-80 / 2023
  • Nowadays, psychologists consider social media an important tool to examine mental disorders. Among these disorders, depression is one of the most common yet least cured diseases. Since an abundance of writers with extensive followings express their feelings on social media and depression is significantly increasing, exploring the literary text shared on social media may provide multidimensional features of depressive behaviors: (1) Background: Several studies observed that depressive data contains certain language styles and self-expressing pronouns, but the current study provides evidence that posts with self-expressing pronouns and depressive language styles contain high emotional temperatures. Therefore, the main objective of this study is to examine literary cyber writers' posts to discover symptomatic signs of depression. For this purpose, our research emphasizes extracting data from writers' public social media pages, blogs, and communities; (3) Results: To examine the emotional temperatures and sentence usage of the depressive and non-depressive groups, we employed the SentiStrength algorithm as a psycholinguistic method, TF-IDF and N-Gram for ranked phrase extraction, and Latent Dirichlet Allocation for topic modelling of the extracted phrases. The results reveal a strong connection between depression and negative emotional temperatures in writers' posts. Moreover, we used Naïve Bayes, Support Vector Machines, Random Forest, and Decision Tree algorithms to validate the classification of depressive and non-depressive content in terms of sentences, phrases, and topics. The results reveal that, compared with the others, the Support Vector Machines algorithm validates the classification while attaining the highest F-score of 79%; (4) Conclusions: Experimental results show that the proposed system performs well in detecting depression trends in literary cyber writers using sentiment analysis.
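
For readers unfamiliar with the F-score reported above, the sketch below shows how it is conventionally derived from true positives, false positives, and false negatives. The counts in the example are hypothetical and are not taken from the paper, and the abstract does not state which variant (for example, macro or micro averaging) was used.

```python
def f_score(true_pos, false_pos, false_neg, beta=1.0):
    """F-beta score from confusion counts; beta=1 gives the usual F1 score."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    # Weighted harmonic mean of precision and recall.
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for a "depressive" class: precision = recall = 0.8.
example = f_score(true_pos=8, false_pos=2, false_neg=2)
```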

Fire Fragility Analysis of Steel Moment Frame using Machine Learning Algorithms (머신러닝 기법을 활용한 철골 모멘트 골조의 화재 취약도 분석)

  • Xingyue Piao;Robin Eunju Kim
    • Journal of the Computational Structural Engineering Institute of Korea / v.37 no.1 / pp.57-65 / 2024
  • In a fire-resistant structure, uncertainties arise in factors such as ventilation, material elastic modulus, yield strength, coefficient of thermal expansion, external forces, and fire location. The ventilation factor contributes to uncertainties in fire temperature, which in turn affects the structural temperature. These temperatures, combined with material properties, give rise to uncertain structural responses. Given the nonlinear behavior of structures under fire conditions, calculating fire fragility traditionally involves time-consuming Monte Carlo simulations. To address this, recent studies have explored leveraging machine learning algorithms to predict fire fragility, aiming to enhance efficiency while maintaining accuracy. This study focuses on predicting the fire fragility of a steel moment frame building, accounting for uncertainties in fire size, location, and structural material properties. The fragility curve, derived from nonlinear structural behavior under fire, follows a log-normal distribution. The results demonstrate that the proposed method accurately and efficiently predicts fire fragility, showcasing its effectiveness in streamlining the analysis process.
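
A log-normal fragility curve is commonly written as P(failure | x) = Φ((ln x − ln θ) / β), where θ is the median capacity and β the log-standard deviation. The sketch below evaluates that form. The intensity measure and the parameter values (`median=600.0`, `beta=0.3`) are hypothetical placeholders, not values from the paper.

```python
import math


def lognormal_fragility(intensity, median, beta):
    """Probability of exceeding a limit state at a given intensity,
    assuming a log-normal fragility model with the given median and dispersion."""
    if intensity <= 0:
        return 0.0
    z = (math.log(intensity) - math.log(median)) / beta
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical parameters: median capacity 600 (arbitrary units), dispersion 0.3.
curve = [(x, lognormal_fragility(x, median=600.0, beta=0.3)) for x in (300, 600, 900)]
```

In practice θ and β would be fitted to Monte Carlo results or to the predictions of a trained model. The sketch only evaluates the curve once those parameters are known.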

Methodology for Variable Optimization in Injection Molding Process (사출 성형 공정에서의 변수 최적화 방법론)

  • Jung, Young Jin;Kang, Tae Ho;Park, Jeong In;Cho, Joong Yeon;Hong, Ji Soo;Kang, Sung Woo
    • Journal of Korean Society for Quality Management / v.52 no.1 / pp.43-56 / 2024
  • Purpose: The injection molding process, crucial for plastic shaping, encounters difficulties in sustaining product quality when injection machines are replaced. Variations in machine types and outputs between different production lines or factories increase the risk of quality deterioration. In response, the study aims to develop a system that optimally adjusts conditions during the replacement of injection machines linked to molds. Methods: Utilizing a dataset of 12 injection process variables and 52 corresponding sensor variables, a predictive model is built using Decision Tree, Random Forest, and XGBoost. Model evaluation is conducted using an 80%/20% split between training and test data. The dependent variable, classified into five characteristics based on temperature and pressure, guides the prediction model. Bayesian optimization, integrated into the selected model, determines optimal values for process variables during the replacement of injection machines. The iterative convergence of sensor prediction values to the optimum range is visually confirmed, aligning them with the target range. Experimental results validate the proposed approach. Results: Post-experiment analysis indicates the superiority of the XGBoost model across all five characteristics, achieving a combined high performance of 0.81 and a Mean Absolute Error (MAE) of 0.77. The study introduces a method for optimizing initial conditions in the injection process during machine replacement, utilizing Bayesian optimization. This streamlined approach reduces both time and costs, thereby enhancing process efficiency. Conclusion: This research contributes practical insights to the optimization literature, offering valuable guidance for industries seeking streamlined and cost-effective methods for machine replacement in injection molding.
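
The evaluation protocol described in the Methods (an 80%/20% train/test split scored with MAE) can be sketched in plain Python as follows. The helper names `split_train_test` and `mean_absolute_error` and the fixed random seed are illustrative assumptions, and the data are placeholders. The paper's models and dataset are not reproduced here.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of `rows` and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Placeholder data: ten sample indices split 8/2, and a small MAE example.
train, test = split_train_test(range(10))
mae = mean_absolute_error([1.0, 2.0, 3.0], [2.0, 2.0, 5.0])
```

Fixing the seed keeps the split reproducible, which matters when several models are compared on the same held-out data.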