• Title/Abstract/Keywords: Decision Tree Based Machine Learning

229 search results (processing time: 0.028 seconds)

Forecasting Sow's Productivity using the Machine Learning Models

  • 이민수;최영찬
    • 농촌지도와개발
    • /
    • Vol. 16, No. 4
    • /
    • pp.939-965
    • /
    • 2009
  • Machine learning has been identified as a promising approach to knowledge-based system development. This study examines how well machine learning techniques support farmers' decision making and develops a reference model for using pig farm data. We compared five machine learning techniques: logistic regression, decision tree, artificial neural network, k-nearest neighbor, and ensemble. All models predicted sow productivity well in all parities, with predictability above 87.6%. Model predictability for total litter size was highest, at 91.3%, in the third parity and decreased as parity increased. The ensemble predicted sow productivity well. The neural network and logistic regression were excellent classifiers for all parities, whereas the decision tree and k-nearest neighbor were not good classifiers for all parities. Performance varied across the models used, with up to a 104% difference in lift values. The artificial neural network and ensemble models produced the highest lift values, implying the best performance among the models.
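The abstract compares models by lift but does not say how lift was computed. As a rough illustration only, one common definition is the positive rate among the top-scored cases divided by the overall positive rate. The function name `lift_at`, the `fraction` parameter, and the sample data below are hypothetical and are not taken from the paper.

```python
def lift_at(scores, labels, fraction=0.1):
    """Lift of the top `fraction` of cases when ranked by predicted score.

    scores: predicted probabilities (higher means more likely positive)
    labels: true outcomes encoded as 0 or 1
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equal in length")
    base_rate = sum(labels) / len(labels)
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no positive cases")

    # Rank cases from highest to lowest score and keep the top slice.
    k = max(1, round(len(labels) * fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(label for _, label in ranked[:k]) / k
    return top_rate / base_rate


# Hypothetical scores for ten cases, three of which are truly positive.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
print(lift_at(scores, labels, fraction=0.2))
```

A lift above 1 means the model concentrates positive cases in its top-ranked slice better than random selection would.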


Machine Learning Based Keyphrase Extraction: Comparing Decision Trees, Naïve Bayes, and Artificial Neural Networks

  • Sarkar, Kamal;Nasipuri, Mita;Ghose, Suranjan
    • Journal of Information Processing Systems
    • /
    • Vol. 8, No. 4
    • /
    • pp.693-712
    • /
    • 2012
  • The paper presents three machine learning based keyphrase extraction methods that respectively use Decision Trees, Naïve Bayes, and Artificial Neural Networks for keyphrase extraction. We consider keyphrases as being phrases that consist of one or more words and as representing the important concepts in a text document. The three machine learning based keyphrase extraction methods that we use for experimentation have been compared with a publicly available keyphrase extraction system called KEA. The experimental results show that the Neural Network based keyphrase extraction method outperforms two other keyphrase extraction methods that use the Decision Tree and Naïve Bayes. The results also show that the Neural Network based method performs better than KEA.

Development of Medical Cost Prediction Model Based on the Machine Learning Algorithm

  • Han Bi KIM;Dong Hoon HAN
    • Journal of Korea Artificial Intelligence Association
    • /
    • Vol. 1, No. 1
    • /
    • pp.11-16
    • /
    • 2023
  • Accurate hospital case modeling and prediction are crucial for efficient healthcare. In this study, we demonstrate the implementation of regression analysis methods in machine learning systems utilizing mathematical statistics and machine learning techniques. The developed machine learning model includes Bayesian linear, artificial neural network, decision tree, decision forest, and linear regression analysis models. Through the application of these algorithms, corresponding regression models were constructed and analyzed. The results suggest the potential of leveraging machine learning systems for medical research. The experiment aimed to create an Azure Machine Learning Studio tool for the speedy evaluation of multiple regression models. The tool facilitates the comparison of 5 types of regression models in a unified experiment and presents assessment results with performance metrics. Evaluation of the regression machine learning models highlighted the advantages of boosted decision tree regression and decision forest regression in hospital case prediction. These findings could lay the groundwork for the deliberate development of new directions in medical data processing and decision making. Furthermore, potential avenues for future research may include exploring methods such as clustering, classification, and anomaly detection in healthcare systems.

A Study on the Improvement of Injection Molding Process Using CAE and Decision-tree

  • 황순환;한성렬;이후진
    • 한국산학기술학회논문지
    • /
    • Vol. 22, No. 4
    • /
    • pp.580-586
    • /
    • 2021
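  • Numerical analysis using CAE (Computer Aided Engineering) is currently the dominant Computer Aided Testing (CAT) methodology in the injection molding field. Recently, however, methodologies that apply artificial intelligence techniques in addition to simulation have been studied. In our previous study, we used various Machine Learning techniques to compare deformation results across injection molding processes, ultimately built an MLP (Multi-Layer Perceptron) prediction model, and obtained optimization results using an HMA (Hybrid Metaheuristic Algorithm). However, although the MLP has excellent predictive performance, it behaves like a black box and offers little explanation of its decision process. In this study, for a Radiator Tank part, we generated data by numerical analysis using the injection molding analysis software Autodesk Moldflow 2018, built several Machine Learning models using the Machine Learning software RapidMiner Studio version 9.5, and compared their root mean square errors. The Decision-tree showed good predictive performance, with a better Root Mean Square Error (RMSE) value than the other Machine Learning techniques. Depending on the Maximal Depth, which determines the size of the Decision-tree, the classification criteria could be raised, but complexity increased as well. Using the Decision-tree to select intermediate values that satisfy the constraints and then running the simulation yielded a 7.7% improvement over running the simulation alone.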
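The study above ranks models by Root Mean Square Error (RMSE). A minimal sketch of that comparison follows, assuming the standard RMSE definition. The helper names `rmse` and `best_model` and all numeric values are hypothetical; they are not RapidMiner output or data from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and equal in length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def best_model(actual, predictions_by_model):
    """Return the (name, rmse) pair with the lowest RMSE."""
    results = {name: rmse(actual, pred) for name, pred in predictions_by_model.items()}
    name = min(results, key=results.get)
    return name, results[name]


# Hypothetical deformation values and model predictions, for illustration only.
actual = [1.20, 1.35, 1.10, 1.50]
predictions_by_model = {
    "decision_tree": [1.22, 1.33, 1.12, 1.48],
    "linear_regression": [1.30, 1.25, 1.20, 1.40],
}
print(best_model(actual, predictions_by_model))
```

Lower RMSE indicates predictions closer to the simulated values, which is the sense in which the abstract calls the Decision-tree result better.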

A Study on the Prediction Models of Used Car Prices for Domestic Brands Using Machine Learning

  • 임승준;이정호;류춘호
    • 서비스연구
    • /
    • Vol. 13, No. 3
    • /
    • pp.105-126
    • /
    • 2023
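  • The domestic used car market is growing steadily, and online used car platform services are likewise expanding their market share every year. These platforms provide service users with vehicle specifications, inspection history, accident records, and detailed options. Most previous studies predicted used car prices from vehicle specifications and a subset of vehicle options, and confirmed that the relationship between used car prices and some specification variables is nonlinear. To address this nonlinearity, researchers proposed applying Machine Learning models. As a result, regression-based machine learning models had the advantage of revealing the actual influence and direction of each variable, but had the disadvantage of poorer cost function values than decision tree-based machine learning models. This study targets domestic brands and uses all of roughly 70 variables covering vehicle specifications and options, running regression-based and tree-based machine learning models in sequence in order to combine the strengths of both model types. After confirming the actual influence and direction of the variables for each brand, we selected the best-performing tree-based machine learning model for each brand. The implications of this study are as follows. It can support buyers and sellers who use online used car platform services in predicting overall used car prices. We therefore expect that it can also help resolve problems caused by information asymmetry among users of these services.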

Adopting and Implementation of Decision Tree Classification Method for Image Interpolation

  • 김동형
    • 디지털산업정보학회논문지
    • /
    • Vol. 16, No. 1
    • /
    • pp.55-65
    • /
    • 2020
  • With the development of display hardware, image interpolation techniques have been used in various fields such as image zooming and medical imaging. Traditional image interpolation methods, such as bi-linear interpolation, bi-cubic interpolation, and edge direction-based interpolation, perform interpolation in the spatial domain. Recently, interpolation techniques in the discrete cosine transform or wavelet domain have also been proposed. Using these existing interpolation methods and machine learning, we propose decision tree classification-based image interpolation methods. In other words, this paper is about a method of adaptively applying various existing interpolation methods, not about an interpolation method itself. To obtain the decision model, we used Weka's J48 library with the C4.5 decision tree algorithm. The proposed method first constructs an attribute set and selects the classes, which correspond to interpolation methods, for the classification model. After training, interpolation is performed using different interpolation methods according to the attribute characteristics. Simulation results show that the proposed method yields reasonable performance.

Identifying the Effects of Repeated Tasks in an Apartment Construction Project Using Machine Learning Algorithm

  • 김현주
    • 한국BIM학회 논문집
    • /
    • Vol. 6, No. 4
    • /
    • pp.35-41
    • /
    • 2016
  • The learning effect is the observation that the more times a task is performed, the less time is required to produce the same amount of output. The construction industry relies heavily on repeated tasks, for which the learning effect is an important measure. However, most construction durations are calculated and applied in real projects without considering the learning effect in each of the repeated activities. This paper applied the learning effect to the repeated activities in a small apartment construction project. The result showed a difference of about 10 percent in duration (one approach of the total duration with learning effects in 41 days while the other without learning effect in 36.5 days). To compare the two approaches, a large number of BIM-based computer simulations were generated, and useful patterns were recognized using a machine learning algorithm named Decision Tree (See5). Machine learning is a data-driven approach to pattern recognition based on observational evidence.
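The abstract defines the learning effect but does not state which learning-curve model the paper used. As an assumption for illustration only, the sketch below uses Wright's classic model, in which the time for the n-th repetition is T_n = T_1 * n^(log2 r) for a learning rate r (so each doubling of repetitions multiplies unit time by r). The function names and all numbers are hypothetical.

```python
import math


def unit_time(first_unit_time, n, learning_rate):
    """Time for the n-th repetition under Wright's learning-curve model.

    learning_rate = 0.9 means each doubling of repetitions takes 90% of
    the previous unit time; 1.0 means no learning effect.
    """
    if n < 1 or not 0 < learning_rate <= 1:
        raise ValueError("n must be >= 1 and learning_rate in (0, 1]")
    exponent = math.log2(learning_rate)  # zero or negative
    return first_unit_time * n ** exponent


def total_duration(first_unit_time, repetitions, learning_rate):
    """Sum of unit times over all repetitions of an activity."""
    return sum(
        unit_time(first_unit_time, n, learning_rate)
        for n in range(1, repetitions + 1)
    )


# Hypothetical activity: 10 days for the first floor, repeated on 4 floors.
print(total_duration(10.0, 4, learning_rate=1.0))  # no learning effect
print(total_duration(10.0, 4, learning_rate=0.9))  # with learning effect
```

Comparing the two totals shows how a schedule estimated with a learning effect can differ from one that treats every repetition as taking the same time.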

Prediction of the number of public bicycle rental in Seoul using Boosted Decision Tree Regression Algorithm

  • KIM, Hyun-Jun;KIM, Hyun-Ki
    • 한국인공지능학회지
    • /
    • Vol. 10, No. 1
    • /
    • pp.9-14
    • /
    • 2022
  • The demand for public bicycles operated by the Seoul Metropolitan Government is increasing every year. The Seoul public bicycle project, which started with about 5,600 units, had grown to 37,500 units as of September 2021, and the number of members is also increasing every year. However, as the project grows, excessive budget spending and deficit problems are emerging, and the costs of new bicycles, rental offices, and bicycle maintenance are blamed for the deficit. In this paper, the Azure Machine Learning Studio program and the Boosted Decision Tree Regression technique are used to predict the number of public bicycle rentals from environmental factors and time. The predictions confirmed that demand for public bicycles was high in every season except winter and was highest at 6 p.m. In addition, this paper compares four other regression algorithms with the Boosted Decision Tree Regression algorithm to measure algorithm performance. The results showed the highest accuracy for Boosted Decision Tree Regression (0.878802), followed by Decision Forest Regression (0.838232), Poisson Regression (0.62699), and Linear Regression (0.618773). Based on these predictions, it is expected that more public bicycles will be placed at rental stations near public transportation to meet the growing demand during commuting hours, that more bicycles will be placed at rental stations in summer than in winter, and that the life of bicycles can be extended in winter.

Fuzzy Classification Rule Learning by Decision Tree Induction

  • Lee, Keon-Myung;Kim, Hak-Joon
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • Vol. 3, No. 1
    • /
    • pp.44-51
    • /
    • 2003
  • Knowledge acquisition is a bottleneck in knowledge-based system implementation. Decision tree induction is a useful machine learning approach for extracting classification knowledge from a set of training examples. Much real-world data contains fuzziness due to observation error, uncertainty, subjective judgement, and so on. To cope with this problem, there has been some work on fuzzy classification rule learning. This paper surveys the kinds of fuzzy classification rules. In addition, it presents a fuzzy classification rule learning method based on decision tree induction and shows some experimental results for the method.
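The abstract does not describe the paper's fuzzy induction procedure. As background only, the sketch below shows the crisp (non-fuzzy) information-gain criterion that ID3-style decision tree induction uses to choose a split attribute. It is not the paper's method, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the examples on `attribute`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(
        len(group) / len(labels) * entropy(group) for group in groups.values()
    )
    return entropy(labels) - remainder


# Toy training set: attribute "a" separates the classes, "b" does not.
rows = [
    {"a": 0, "b": 0},
    {"a": 0, "b": 1},
    {"a": 1, "b": 0},
    {"a": 1, "b": 1},
]
labels = ["no", "no", "yes", "yes"]
best = max(["a", "b"], key=lambda attr: information_gain(rows, labels, attr))
print(best)
```

Each root-to-leaf path of the resulting tree can be read as one classification rule, which is the link between tree induction and rule learning that the abstract relies on.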

Selecting Machine Learning Model Based on Natural Language Processing for Shanghanlun Diagnostic System Classification

  • 김영남
    • 대한상한금궤의학회지
    • /
    • Vol. 14, No. 1
    • /
    • pp.41-50
    • /
    • 2022
  • Objective: The purpose of this study is to explore the most suitable machine learning algorithm for Shanghanlun diagnostic system classification using natural language processing (NLP). Methods: A total of 201 data items were collected from 『Shanghanlun』 and 『Clinical Shanghanlun』; 'Taeyangbyeong-gyeolhyung' and 'Eumyangyeokchahunobokbyeong' were excluded to prevent oversampling or undersampling. Data were preprocessed using a Twitter Korean tokenizer and used to train logistic regression, ridge regression, lasso regression, naive Bayes classifier, decision tree, and random forest algorithms. The accuracy of the models was compared. Results: Ridge regression and the naive Bayes classifier showed an accuracy of 0.843, logistic regression and random forest showed an accuracy of 0.804, decision tree showed an accuracy of 0.745, and lasso regression showed an accuracy of 0.608. Conclusions: Ridge regression and the naive Bayes classifier are suitable NLP machine learning models for Shanghanlun diagnostic system classification.
