• Title/Summary/Keyword: tree-based models

A Comparative Study on Game-Score Prediction Models Using Computational Thinking Education Game Data (컴퓨팅 사고 교육 게임 데이터를 사용한 게임 점수 예측 모델 성능 비교 연구)

  • Yang, Yeongwook
    • KIPS Transactions on Software and Data Engineering / v.10 no.11 / pp.529-534 / 2021
  • Computational thinking is regarded as one of the important skills required in the 21st century, and many countries have introduced and implemented computational thinking training courses. Among computational thinking education methods, educational game-based methods increase student participation and motivation, and increase access to computational thinking. Autothinking is an educational game developed to provide computational thinking education to learners. It is an adaptive system that dynamically provides feedback to learners and automatically adjusts the difficulty according to the learner's computational thinking ability. However, because the game was designed based on rules, it cannot intelligently consider the computational thinking of learners or give feedback. In this study, game data collected through Autothinking is introduced, and game score prediction that reflects computational thinking is performed on this data in order to increase the adaptability of the game. To this end, a comparative study was conducted on linear regression, decision tree, random forest, and support vector machine algorithms, which are among the most commonly used algorithms for regression problems. As a result of the study, the linear regression method showed the best performance in predicting game scores.

Sintering process optimization of ZnO varistor materials by machine learning based metamodel (기계학습 기반의 메타모델을 활용한 ZnO 바리스터 소결 공정 최적화 연구)

  • Kim, Boyeol;Seo, Ga Won;Ha, Manjin;Hong, Youn-Woo;Chung, Chan-Yeup
    • Journal of the Korean Crystal Growth and Crystal Technology / v.31 no.6 / pp.258-263 / 2021
  • A ZnO varistor is a semiconductor device that can protect a circuit from surge voltage because of its non-linear I-V characteristics, which are obtained by controlling the microstructure of grains and grain boundaries. In order to obtain the desired electrical properties, it is important to control microstructure evolution during the sintering process. In this research, we defined a dataset composed of sintering process conditions and the relative permittivity of the sintered body, and collected an experimental dataset using design of experiments (DOE). Meta-models that can predict permittivity were developed by training various machine learning algorithms on the collected experimental dataset. By utilizing the meta-model, we can derive optimized sintering conditions that show the maximum permittivity using the numerical HMA (Hybrid Metaheuristic Algorithm) optimization algorithm. If meta-model-based optimization is applied to ceramic processing, the optimal process conditions can be found with a minimum number of experiments.

Determination of the stage and grade of periodontitis according to the current classification of periodontal and peri-implant diseases and conditions (2018) using machine learning algorithms

  • Kubra Ertas;Ihsan Pence;Melike Siseci Cesmeli;Zuhal Yetkin Ay
    • Journal of Periodontal and Implant Science / v.53 no.1 / pp.38-53 / 2023
  • Purpose: The current Classification of Periodontal and Peri-Implant Diseases and Conditions, published and disseminated in 2018, involves some difficulties and causes diagnostic conflicts due to its criteria, especially for inexperienced clinicians. The aim of this study was to design a decision system based on machine learning algorithms by using clinical measurements and radiographic images in order to determine and facilitate the staging and grading of periodontitis. Methods: In the first part of this study, machine learning models were created using the Python programming language based on clinical data from 144 individuals who presented to the Department of Periodontology, Faculty of Dentistry, Süleyman Demirel University. In the second part, panoramic radiographic images were processed and classification was carried out with deep learning algorithms. Results: Using clinical data, the accuracy of staging with the tree algorithm reached 97.2%, while the random forest and k-nearest neighbor algorithms reached 98.6% accuracy. The best staging accuracy for processing panoramic radiographic images was provided by a hybrid network model algorithm combining the proposed ResNet50 architecture and the support vector machine algorithm. For this, the images were preprocessed, and high success was obtained, with a classification accuracy of 88.2% for staging. However, in general, it was observed that the radiographic images provided a low level of success, in terms of accuracy, for modeling the grading of periodontitis. Conclusions: The machine learning-based decision system presented herein can facilitate periodontal diagnoses despite its current limitations. Further studies are planned to optimize the algorithm and improve the results.

Nakdong River Estuary Salinity Prediction Using Machine Learning Methods (머신러닝 기법을 활용한 낙동강 하구 염분농도 예측)

  • Lee, Hojun;Jo, Mingyu;Chun, Sejin;Han, Jungkyu
    • Smart Media Journal / v.11 no.2 / pp.31-38 / 2022
  • Promptly predicting changes in river salinity is an important task for anticipating the damage to agriculture and ecosystems caused by salinity intrusion and for establishing disaster prevention measures. Because machine learning (ML) methods have a much lower computation cost than physics-based hydraulic models, they can predict river salinity in a relatively short time. Owing to their shorter training time, ML methods have been studied as a complementary technique to physics-based hydraulic models. Many studies on ML-based salinity prediction have been conducted around the world, but there are few in South Korea. Using a large number of publicly available datasets, we evaluated the performance of various machine learning techniques for predicting the salinity of the Nakdong River Estuary Basin. As a result, the LightGBM algorithm achieved an average RMSE of 0.37 and trained 2-20 times faster than the other algorithms. This indicates that machine learning techniques can be applied to predict the salinity of rivers in Korea.
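The RMSE reported in this abstract is the root mean squared error between observed and predicted values. A minimal sketch of the metric follows; the `rmse` helper name and the sample numbers are illustrative assumptions, not data or code from the study.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical salinity readings and model predictions (illustrative only).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
score = rmse(observed, predicted)
```

Lower values mean predictions sit closer to the observations, and the score is in the same unit as the predicted quantity.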

Protecting Accounting Information Systems using Machine Learning Based Intrusion Detection

  • Biswajit Panja
    • International Journal of Computer Science & Network Security / v.24 no.5 / pp.111-118 / 2024
  • In general, a network-based intrusion detection system is designed to detect malicious behavior directed at a network or its resources. The key goal of this paper is to examine network data and identify whether it is normal or anomalous traffic, specifically for accounting information systems. There are a variety of approaches for detecting various forms of network-based intrusion. In this paper, we use supervised machine learning techniques. Classification models are trained on a training dataset, and the trained system is then used to detect intrusions in the testing dataset. The proposed method detects whether network data is normal or an anomaly, which makes it possible to avoid unauthorized activity on the network and on the systems under that network. The Decision Tree and K-Nearest Neighbor algorithms are applied in the proposed model to distinguish abnormal from normal behaviors in network traffic data. In addition, Logistic Regression and Support Vector Classification algorithms are used in our model to support the proposed concepts. Furthermore, a feature selection method is used to extract valuable information from the dataset and enhance the efficiency of the proposed approach. The Random Forest algorithm is used for this purpose, helping the system identify crucial features and focus on them rather than on all the features. The experimental findings revealed that the suggested method for network intrusion detection has a negligible false alarm rate, with the accuracy of the result expected to be between 95% and 100%. As a result of the high precision rate, this concept can be used to detect network data intrusion and prevent vulnerabilities on the network.

Weibull Diameter Distribution Yield Prediction System for Loblolly Pine Plantations (테다소나무 조림지(造林地)에 대한 Weibull 직경분포(直經分布) 수확예측(收穫豫測) 시스템에 관(關)한 연구(硏究))

  • Lee, Young-Jin;Hong, Sung-Cheon
    • Journal of Korean Society of Forest Science / v.90 no.2 / pp.176-183 / 2001
  • Loblolly pine (Pinus taeda L.) is the most economically important timber-producing species in the southern United States. Much attention has been given to predicting diameter distributions for multiple-product yield estimates. Three-parameter Weibull diameter distribution yield prediction systems were developed for loblolly pine plantations. A parameter recovery procedure for the Weibull distribution function based on four percentile equations was applied to develop diameter distribution yield prediction models. Four percentiles (0th, 25th, 50th, 95th) of the cumulative diameter distribution were predicted as a function of quadratic mean diameter. Individual tree height prediction equations were developed for the calculation of yields by diameter class. By using individual tree content prediction equations, the expected yield by diameter class can be computed. To reduce rounding-off errors, the Weibull cumulative upper bound limit difference procedure applied in this study shows slightly better results than the upper and lower bound procedure applied in past studies. To evaluate this system, the predicted diameter distributions were tested against the observed diameter distributions using the Kolmogorov-Smirnov two-sample test at the $\alpha = 0.05$ level to check whether any significant differences existed. Statistically, no significant differences were detected based on the data from 516 evaluation data sets. This diameter distribution yield prediction system will be useful in loblolly pine stand structure modeling, in updating forest inventories, and in evaluating investment opportunities.
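How a three-parameter Weibull distribution turns into per-class proportions can be sketched as follows. The cumulative distribution function is standard; the parameter values, class bounds, and helper names are hypothetical, and the study's own parameter recovery and bound-difference procedures are not reproduced here.

```python
import math

def weibull_cdf(x, a, b, c):
    """Three-parameter Weibull CDF with location a, scale b, and shape c."""
    if x <= a:
        return 0.0
    return 1.0 - math.exp(-(((x - a) / b) ** c))

def class_proportions(bounds, a, b, c):
    """Proportion of trees falling between each pair of consecutive bounds."""
    cdf = [weibull_cdf(x, a, b, c) for x in bounds]
    return [upper - lower for lower, upper in zip(cdf, cdf[1:])]

# Hypothetical parameters and 2-unit diameter classes (illustrative only).
a, b, c = 4.0, 10.0, 2.5
bounds = [4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
props = class_proportions(bounds, a, b, c)
```

Multiplying each proportion by the number of trees per unit area would give the expected stem count per diameter class, which is the quantity a per-tree content equation would then be applied to.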

Study on Detection for Cochlodinium polykrikoides Red Tide using the GOCI image and Machine Learning Technique (GOCI 영상과 기계학습 기법을 이용한 Cochlodinium polykrikoides 적조 탐지 기법 연구)

  • Unuzaya, Enkhjargal;Bak, Su-Ho;Hwang, Do-Hyun;Jeong, Min-Ji;Kim, Na-Kyeong;Yoon, Hong-Joo
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.15 no.6 / pp.1089-1098 / 2020
  • In this study, we propose a method to detect Cochlodinium polykrikoides red tide using machine learning and geostationary ocean satellite images. To train the machine learning models, GOCI Level 2 data and the red tide location data of the National Fisheries Research and Development Institute were used. Logistic regression, decision tree, and random forest models were used. The performance evaluation confirmed that, compared with the traditional GOCI image-based red tide detection algorithm without machine learning (Son et al., 2012) (75%), accuracy improved by about 13~22%p (88~98%). In addition, a comparison of detection performance between the machine learning models showed that the random forest model (98%) had the highest detection accuracy. It is believed that this machine learning-based red tide detection algorithm can be used to detect red tide early and to track and monitor its movement and spread.

Designing of the Statistical Models for Imprinting Patterns of Quantitative Traits Loci (QTL) in Swine (돼지에 있어서 양적 형질 유전자좌(QTL) 발현 특성 분석을 위한 통계적 검정 모형 설정)

  • Yoon D. H.;Kong H. S.;Cho Y. M.;Lee J. W.;Choi I. S.;Lee H. K.;Jeon G. J.;Oh S. J.;Cheong I. C.
    • Journal of Embryo Transfer / v.19 no.3 / pp.291-299 / 2004
  • Characterization of quantitative trait loci (QTL) was investigated in an experimental cross population between the Berkshire and Yorkshire breeds. A total of 512 $F_2$ offspring from 65 matings of $F_1$ parents were phenotyped for traits including average daily gain (ADG), average backfat thickness (ABF), tenth rib backfat thickness (TRF), loin eye area (LEA), and last rib backfat thickness (LRF). All animals were genotyped for 125 markers across the genome. Marker linkage maps were derived and used in QTL analysis based on line-cross least squares regression interval mapping. A decision tree to identify QTL with imprinting effects was developed based on tests against the Mendelian mode of QTL expression. To establish evidence of QTL presence, empirical significance thresholds were derived at chromosome-wise and genome-wise levels using specialized permutation strategies. Significance thresholds derived by the permutation test were validated based on simulation of a pedigree and data structure similar to the Berkshire-Yorkshire population. The genome scan revealed significant evidence for 13 imprinted QTLs affecting growth and body composition, of which nine were identified as QTL with a paternally expressed inheritance mode. Four maternally expressed QTLs, affecting loin eye area (LEA) and tenth rib backfat thickness (TRF), were found on chromosomes 10 and 12. These results support the usefulness of the statistical models for analysing imprinting of QTLs related to carcass traits.
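The idea of an empirical significance threshold derived by permutation can be sketched as below. This is a simplified illustration under stated assumptions: the test statistic is a squared correlation between a marker code and the phenotype rather than the interval-mapping regression used in the study, the data are simulated, and the study's specialized permutation strategies are not reproduced. Taking the maximum over all markers corresponds to a genome-wise threshold; restricting `markers` to one chromosome would give a chromosome-wise one.

```python
import random

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no association signal
    return sxy * sxy / (sxx * syy)

def permutation_threshold(markers, phenotype, n_perm=200, alpha=0.05, seed=0):
    """Approximate (1 - alpha) quantile of the max statistic under shuffling."""
    rng = random.Random(seed)
    shuffled = list(phenotype)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any marker-phenotype association
        maxima.append(max(r_squared(m, shuffled) for m in markers))
    maxima.sort()
    index = min(n_perm - 1, int((1 - alpha) * n_perm))
    return maxima[index]

# Simulated genotype codes (0, 1, 2) and a phenotype driven by marker 0.
rng = random.Random(1)
n_animals, n_markers = 100, 20
markers = [[rng.choice([0, 1, 2]) for _ in range(n_animals)]
           for _ in range(n_markers)]
phenotype = [2.0 * g + rng.gauss(0, 1) for g in markers[0]]

threshold = permutation_threshold(markers, phenotype)
observed = r_squared(markers[0], phenotype)
```

A marker whose observed statistic exceeds `threshold` would be declared significant at the chosen level, with the maximum over markers accounting for multiple testing.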

An Adaptive Learning System based on Learner's Behavior Preferences (학습자 행위 선호도에 기반한 적응적 학습 시스템)

  • Kim, Yong-Se;Cha, Hyun-Jin;Park, Seon-Hee;Cho, Yun-Jung;Yoon, Tae-Bok;Jung, Young-Mo;Lee, Jee-Hyong
    • 한국HCI학회:학술대회논문집 / 2006.02a / pp.519-525 / 2006
  • Advances in information and telecommunication technology increasingly reveal the potential of computer-supported education. However, until recently most computer-supported learning systems paid little attention to the different characteristics of individual learners. Intelligent learning environments that adapt to learners' preferences and tasks are desired. Each learner has different preferences and needs, so it is crucial to provide learners of different styles with learning environments that they prefer and that are more efficient for them. This paper reports a study of an intelligent learning environment in which the learner's preferences are diagnosed using learner models, and user interfaces are then customized adaptively to accommodate those preferences. In this research, the learning user interfaces were designed based on the learning-style model by Felder & Silverman, so that different learner preferences are revealed through user interactions with the system. Learning styles are then modeled from learner behavior patterns using Decision Tree and Neural Network approaches. In this way, an intelligent learning system adaptive to learning styles can be built. Further research efforts are being made to accommodate other kinds of learner characteristics, such as emotion and motivation as well as learning mastery, in providing adaptive learning support.

A Development of PM10 Forecasting System (미세먼지 예보시스템 개발)

  • Koo, Youn-Seo;Yun, Hui-Young;Kwon, Hee-Yong;Yu, Suk-Hyun
    • Journal of Korean Society for Atmospheric Environment / v.26 no.6 / pp.666-682 / 2010
  • A forecasting system for Today's and Tomorrow's PM10 was developed based on a statistical model. Forecasting was performed at 9 AM to predict Today's 24-hour average PM10 concentration and at 5 PM to predict Tomorrow's 24-hour average PM10. Today's forecasting model was operated on measured air quality and meteorological data, while Tomorrow's model was run on monitored data as well as meteorological data calculated by a weather forecasting model such as MM5 (Mesoscale Meteorological Model version 5). The observed air quality data at ambient air quality monitoring stations, together with measured and forecasted meteorological data, were reviewed by regression analysis to find their relationship with target PM10 concentrations. The PM concentration, wind speed, precipitation rate, mixing height, and dew-point deficit temperature were the major variables determining the level of PM10, and the wind direction at the 500 hPa height was also a good indicator of the influence of long-range transport from other countries. The neural network, regression model, and decision tree method were used as forecasting models to predict the class of a comprehensive air quality index, and the final forecasting index was determined as the most frequent index among the three models' predicted indexes. The accuracy, false alarm rate, and probability of detection of Tomorrow's model were 72.4%, 0.0%, and 42.9%, while those of Today's model were 80.8%, 12.5%, and 77.8%, respectively. The statistical model was limited in predicting rapidly changing PM10 concentrations caused by long-range transport from outside Korea, and in this case a chemical transport model would be an alternative method.
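The combination rule and the three verification scores described in this abstract can be sketched as follows. The class labels, sample forecasts, and function names are hypothetical. The abstract does not define its false alarm rate, so this sketch assumes the false alarm ratio commonly used in forecast verification, false alarms divided by all forecast events.

```python
from collections import Counter

def majority_vote(predictions):
    """Most frequent label; on a tie, the label seen first wins."""
    return Counter(predictions).most_common(1)[0][0]

def _ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0

def verification_scores(observed, forecast, event):
    """Accuracy, false alarm ratio (assumed definition), and POD for one class."""
    pairs = [(o == event, f == event) for o, f in zip(observed, forecast)]
    hits = sum(o and f for o, f in pairs)
    misses = sum(o and not f for o, f in pairs)
    false_alarms = sum(f and not o for o, f in pairs)
    correct_negatives = sum(not o and not f for o, f in pairs)
    return {
        "accuracy": _ratio(hits + correct_negatives, len(pairs)),
        "false_alarm": _ratio(false_alarms, hits + false_alarms),
        "pod": _ratio(hits, hits + misses),
    }

# Hypothetical daily index classes from three models (illustrative only).
neural_net = ["good", "moderate", "bad", "bad"]
regression = ["good", "bad", "bad", "moderate"]
tree = ["moderate", "moderate", "bad", "moderate"]
final = [majority_vote(day) for day in zip(neural_net, regression, tree)]

observed = ["good", "moderate", "bad", "bad"]
scores = verification_scores(observed, final, event="bad")
```

With three models, a three-way disagreement leaves no majority; this sketch then falls back to the first model's label, which is a design choice of the example rather than something stated in the abstract.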