• Title/Summary/Keyword: tree-based models

Selection and Management Strategies for Restoration and Conservation Target Sites of Mankyua chejuense using Species Distribution Models (종 분포 모형을 활용한 제주고사리삼의 복원 및 보전 대상지 선정과 관리방안)

  • Lee, Sang-Wook;Jang, Rae-Ik;Oh, Hong-Shik;Jeon, Seong-Woo
    • Journal of the Korean Society of Environmental Restoration Technology / v.26 no.3 / pp.29-42 / 2023
  • As habitat destruction from recent development continues, interest in endangered species is increasing. Mankyua chejuense is a vulnerable species that is sensitive to changes in population and habitat, and it was recently upgraded from Endangered Species II to Endangered Species I, so it requires significant management effort. In this study, we analyzed the potential habitats of Mankyua chejuense using MaxEnt (Maximum Entropy) modeling. We developed three models: one that considered only environmental characteristics, one that considered artificial factors, and one that reflected the habitat of dominant tree species in the overstory. Based on previous studies, we incorporated environmental and human influence factors for the habitats of Mankyua chejuense into spatial information, and we also used the habitat distribution models of dominant tree species, including Ulmus parvifolia, Maclura tricuspidata, and Ligustrum obtusifolium, that have previously been identified as major overstory species of Mankyua chejuense. Our analysis revealed that rock exposure, elevation, slope, forest type, building density, and soil type were the main factors determining the potential habitat of Mankyua chejuense. The three models differed at the edges of the habitats because of human influence factors, and results varied with the similarity between the habitats of Mankyua chejuense and those of the dominant overstory tree species. The potential habitats presented in this study include areas the species could inhabit in addition to its existing habitats. These results can therefore be used for conservation and management planning for Mankyua chejuense.

A Study on a car Insurance purchase Prediction Using Two-Class Logistic Regression and Two-Class Boosted Decision Tree

  • AN, Su Hyun;YEO, Seong Hee;KANG, Minsoo
    • Korean Journal of Artificial Intelligence / v.9 no.1 / pp.9-14 / 2021
  • This paper builds a model that predicts whether a customer will buy car insurance, based on primary health insurance customer data. Automobiles are now widely used for land transportation and daily life, and their scope of use and equipment is expanding. This rapid increase in automobiles has made automobile insurance an essential line of business for insurance companies. If car insurance sales can be predicted and targeted using information on existing health insurance customers, the insurer can generate continuous profits. This paper therefore analyzes existing customer characteristics and implements a predictive model to direct advertisements to customers interested in auto insurance. The goal is to maximize insurers' profits by devising communication strategies that optimize business models and profits. The study was conducted in Microsoft Azure, and the automobile insurance purchase prediction model was implemented using the Health Insurance Cross-sell Prediction data. Two-Class Logistic Regression and Two-Class Boosted Decision Tree were both applied so that the two models' results could be compared. According to the results, at a threshold of 0.3 the AUC is 0.837 and the accuracy is 0.833. The study therefore concludes that customers with health insurance can be induced to respond positively to auto insurance purchases.
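
The two metrics quoted in this abstract can be illustrated with a minimal sketch. It assumes a binary classifier that outputs a score per customer; the function names and the toy labels and scores are hypothetical and are not the paper's data or its Azure pipeline.

```python
def accuracy_at_threshold(y_true, scores, threshold=0.3):
    """Fraction of correct predictions when score >= threshold means 'will buy'."""
    preds = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for p, y in zip(preds, y_true) if p == y)
    return correct / len(y_true)


def auc(y_true, scores):
    """AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = bought car insurance, 0 = did not.
y_true = [0, 0, 1, 1, 0, 1]
scores = [0.10, 0.35, 0.40, 0.80, 0.20, 0.25]

print(accuracy_at_threshold(y_true, scores, threshold=0.3))
print(auc(y_true, scores))
```

Accuracy depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why an abstract can report both for the same model.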

Application of machine learning methods for predicting the mechanical properties of rubbercrete

  • Miladirad, Kaveh;Golafshani, Emadaldin Mohammadi;Safehian, Majid;Sarkar, Alireza
    • Advances in concrete construction / v.14 no.1 / pp.15-34 / 2022
  • The use of waste rubber in concrete can reduce natural aggregate consumption and improve some technical properties of concrete. Although there are several equations for estimating the mechanical properties of concrete containing waste rubber, only a limited number of machine learning-based models have been proposed to predict the mechanical properties of rubbercrete. In this study, an extensive database of the mechanical properties of rubbercrete was gathered from a comprehensive survey of the literature. To model these properties, two machine learning techniques, the M5P tree and linear gene expression programming (LGEP), were employed to obtain reliable mathematical equations. Two procedures of input variable selection were considered. In the first procedure, the crucial component ratios of rubbercrete and the concrete age were taken as the input variables. In the second procedure, the volumes of the coarse and fine waste rubber and the compressive strength of concrete without waste rubber were taken as the input variables. The results show that the models obtained by LGEP are more accurate than those achieved by the M5P model tree and existing traditional equations. In addition, the input variables of the second procedure are better predictors of the mechanical properties of rubbercrete than those of the first procedure.

Core Keywords Extraction for Evaluating Online Consumer Reviews Using a Decision Tree: Focusing on Star Ratings and Helpfulness Votes (의사결정나무를 활용한 온라인 소비자 리뷰 평가에 영향을 주는 핵심 키워드 도출 연구: 별점과 좋아요를 중심으로)

  • Min, Kyeong Su;Yoo, Dong Hee
    • The Journal of Information Systems / v.32 no.3 / pp.133-150 / 2023
  • Purpose: This study aims to develop classification models using a decision tree algorithm to identify core keywords and rules influencing online consumer review evaluations for the robot vacuum cleaner on Amazon.com. The difference from previous studies is that we analyze core keywords that affect the evaluation results by dividing the subjects that evaluate online consumer reviews into self-evaluation (star ratings) and peer evaluation (helpfulness votes). We investigate whether the core keywords influencing star ratings and helpfulness votes vary across different products and whether there is a similarity in the core keywords related to star ratings or helpfulness votes across all products. Design/methodology/approach: We used random under-sampling to balance the dataset. We progressively removed independent variables based on decreasing importance through backwards elimination to evaluate the classification model's performance. As a result, we identified classification models that best predict star ratings and helpfulness votes for each product's online consumer reviews. Findings: We have identified that the core keywords influencing self-evaluation and peer evaluation vary across different products, and even for the same model or features, the core keywords are not consistent. Therefore, companies' producers and marketing managers need to analyze the core keywords of each product to highlight the advantages and prepare customized strategies that compensate for the shortcomings.
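
Random under-sampling, the balancing step named in this abstract, can be sketched as follows. This is a minimal illustration under assumptions: the function name, the fixed seed, and the toy review labels are hypothetical, and the paper's actual tooling is not stated in the abstract.

```python
import random
from collections import defaultdict


def random_under_sample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until every class has the
    same number of rows as the smallest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    n_min = min(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample without replacement so no row is duplicated.
        for row in rng.sample(group, n_min):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data (hypothetical): 8 high-rated reviews, 2 low-rated reviews.
rows = [f"review {i}" for i in range(10)]
labels = ["high"] * 8 + ["low"] * 2

bal_rows, bal_labels = random_under_sample(rows, labels)
print(bal_labels.count("high"), bal_labels.count("low"))
```

Under-sampling discards majority-class rows rather than synthesizing minority-class ones, so it trades data volume for balance.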

A Study on the Prediction Models of Used Car Prices for Domestic Brands Using Machine Learning (머신러닝을 활용한 브랜드별 국내 중고차 가격 예측 모델에 관한 연구)

  • Seungjun Yim;Joungho Lee;Choonho Ryu
    • Journal of Service Research and Studies / v.13 no.3 / pp.105-126 / 2023
  • The domestic used car market continues to grow along with used car online platform services. These platforms disclose vehicle specifications, accident history, inspection history, and detailed options to consumers. Most preceding studies predicted used car prices using vehicle specifications and some vehicle options. Those studies confirmed a nonlinear relationship between used car prices and some specification variables, and researchers accordingly tried to address the nonlinearity with Machine Learning models. In general, regression-based Machine Learning models had the advantage of revealing the actual influence and direction of variables, but the disadvantage of poorer cost function figures than decision tree-based Machine Learning models. This study attempted to predict used car prices for six domestic brands using both vehicle specifications and vehicle options, and to combine the advantages of the two types of Machine Learning model. To this end, we sequentially ran a regression-based Machine Learning model and a decision tree-based Machine Learning model. The analysis identified the practical influence and direction of the variables for each brand and selected the best tree-based Machine Learning model. The results should help buyers and sellers who use used car online platform services to estimate approximate used car prices, and may help reduce the problems caused by information inequality among platform users.

Customer Churning Forecasting and Strategic Implication in Online Auto Insurance using Decision Tree Algorithms (의사결정나무를 이용한 온라인 자동차 보험 고객 이탈 예측과 전략적 시사점)

  • Lim, Se-Hun;Hur, Yeon
    • Information Systems Review / v.8 no.3 / pp.125-134 / 2006
  • This article adopts a decision tree algorithm (C5.0) to predict customer churn in the online auto insurance environment. Using a sample of online auto insurance customer contracts sold between 2003 and 2004, we test how well a decision tree-based model (C5.0) predicts customer churn. We compare the results of C5.0 with those of a logistic regression model (LRM) and a multivariate discriminant analysis (MDA) model. The results show that C5.0 outperforms the other models in predictive performance. Based on these results, this study suggests ways to set marketing strategy and to develop the online auto insurance business.

Design and Performance Measurement of a Genetic Algorithm-based Group Classification Method : The Case of Bond Rating (유전 알고리듬 기반 집단분류기법의 개발과 성과평가 : 채권등급 평가를 중심으로)

  • Min, Jae-H.;Jeong, Chul-Woo
    • Journal of the Korean Operations Research and Management Science Society / v.32 no.1 / pp.61-75 / 2007
  • The purpose of this paper is to develop a new group classification method based on a genetic algorithm and to compare its prediction performance with that of existing methods in the area of bond rating. To this end, we conduct various experiments with pilot and general models. Specifically, we first conduct experiments with two pilot models: one that searches for the cluster center of each group, and another that searches for both the cluster center and the attribute weights in order to maximize classification accuracy. The pilot experiments show that the latter achieves a higher classification accuracy ratio than the former, which provides the rationale for searching for both the cluster center of each group and the attribute weights to improve classification accuracy. With this lesson in mind, we design two generalized models employing a genetic algorithm: one maximizes the classification accuracy and the other minimizes the total misclassification cost. We compare the performance of these two models with that of existing statistical and artificial intelligence models such as MDA, ANN, and Decision Tree, and conclude that the genetic algorithm-based group classification method proposed in this paper significantly outperforms the other methods in terms of both classification accuracy ratio and misclassification cost.

Estimation of ultimate bearing capacity of shallow foundations resting on cohesionless soils using a new hybrid M5'-GP model

  • Khorrami, Rouhollah;Derakhshani, Ali
    • Geomechanics and Engineering / v.19 no.2 / pp.127-139 / 2019
  • Available methods to determine the ultimate bearing capacity of shallow foundations may not be accurate enough owing to the complicated failure mechanism and the diversity of the underlying soils. Accordingly, applying new artificial intelligence methods can improve the prediction of the ultimate bearing capacity. The M5' model tree and genetic programming are two robust artificial intelligence methods used for prediction. The model tree can categorize the data and present linear models, while genetic programming can give nonlinear models. In this study, a combination of these methods, called the M5'-GP approach, is employed to predict the ultimate bearing capacity of shallow foundations so that the advantages of both methods are exploited simultaneously. The factors governing the bearing capacity of shallow foundations that are considered for modeling are the width of the foundation ($B$), the embedment depth of the foundation ($D$), the length of the foundation ($L$), the effective unit weight of the soil ($\gamma$), and the internal friction angle of the soil ($\varphi$). To develop the new model, experimental data from large- and small-scale tests were collected from the literature. Evaluation of the new model by statistical indices reveals better performance than both traditional and recent approaches. Moreover, sensitivity analysis of the proposed model indicates the significance of the various predictors. Additionally, based on a comprehensive ranking system, the new model compares favorably with the models presented by various researchers.
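
The statement that a model tree "categorizes the data and presents linear models" can be made concrete with a small sketch of how such a tree predicts: internal nodes route a sample by a threshold on one input, and each leaf holds its own linear equation. Everything below is hypothetical (the `Node` class, the split on `B`, and all coefficients are invented for illustration); it shows only the model-tree half of the idea and is not the paper's M5'-GP model.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    # Internal node: route on x[feature] <= threshold.
    feature: Optional[str] = None
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    # Leaf node: linear model intercept + sum(coef * input).
    intercept: float = 0.0
    coefs: Optional[Dict[str, float]] = None


def predict(node: Node, x: Dict[str, float]) -> float:
    """Walk down to a leaf, then evaluate that leaf's linear model."""
    while node.feature is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.intercept + sum(c * x[name] for name, c in node.coefs.items())


# Hypothetical tree: split on foundation width B, one linear model per leaf.
tree = Node(
    feature="B",
    threshold=1.0,
    left=Node(intercept=10.0, coefs={"B": 50.0, "D": 20.0}),
    right=Node(intercept=40.0, coefs={"B": 30.0, "D": 25.0}),
)

q_narrow = predict(tree, {"B": 0.5, "D": 0.2})
q_wide = predict(tree, {"B": 2.0, "D": 0.5})
print(q_narrow, q_wide)
```

The overall prediction is piecewise linear: linear within each leaf's region, with a different equation on each side of a split.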

Development of Diameter Growth and Mortality Prediction Models of Pinus Koraiensis Based on Periodic Annual Increment (정기평균생장을 이용한 잣나무 임분의 흉고직경 생장예측모델 및 고사예측모델의 개발)

  • Kim, Seonyoung;Seol, Ara;Chung, Joosang
    • Journal of Korean Society of Forest Science / v.100 no.1 / pp.1-7 / 2011
  • The objective of this study was to improve the performance of the existing individual-tree/distance-independent stand growth model in predicting the growth of Pinus koraiensis forest stands. The parameters of the diameter growth and mortality prediction models were estimated using the periodic annual increment (PAI) of permanent plots, and the performance of the models was compared with that of the existing ones based on mean annual increment (MAI). The diameter growth model includes crown ratio, potential diameter growth, and a modifier that accounts for competition among the trees of a stand. In deriving the mortality prediction model, the parameters were estimated based on PAI, which was itself estimated as a function of MAI owing to the lack of permanent plot data. The results showed that the newly estimated functions based on PAI give more realistic patterns of diameter growth for individual trees. The new PAI-based approach in the mortality model appears to overcome the over-estimation problem of the MAI-based model in estimating the mortality of stand trees.
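
The difference between the two increment measures in this abstract can be shown with the standard definitions: MAI divides the cumulative diameter by stand age, while PAI divides the growth over a measurement interval by the interval length. The sketch below uses hypothetical function names and invented diameter values, not data from the paper.

```python
def mean_annual_increment(dbh_cm: float, age_years: float) -> float:
    """MAI: cumulative diameter divided by total age."""
    return dbh_cm / age_years


def periodic_annual_increment(
    dbh_start_cm: float, dbh_end_cm: float, period_years: float
) -> float:
    """PAI: diameter growth over a measurement period divided by its length."""
    return (dbh_end_cm - dbh_start_cm) / period_years


# Hypothetical tree: 22.0 cm at age 35, re-measured at 24.0 cm at age 40.
mai = mean_annual_increment(24.0, 40)
pai = periodic_annual_increment(22.0, 24.0, 5)
print(mai, pai)
```

In this toy case the lifetime average (MAI) is higher than the recent growth rate (PAI), which illustrates how an MAI-based estimate can differ from what a tree is currently doing.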

Feature-Based Multi-Resolution Modeling of Solids Using History-Based Boolean Operations - Part II : Implementation Using a Non-Manifold Modeling System -

  • Lee Sang Hun;Lee Kyu-Yeul;Woo Yoonwhan;Lee Kang-Soo
    • Journal of Mechanical Science and Technology / v.19 no.2 / pp.558-566 / 2005
  • We propose a feature-based multi-resolution representation of B-rep solid models using history-based Boolean operations based on the merge-and-select algorithm. Because union and subtraction are commutative in the history-based Boolean operations, the integrity of the models at various levels of detail (LOD) is guaranteed for the reordered features regardless of whether the features are subtractive or additive. The multi-resolution solid representation proposed in this paper includes a non-manifold topological merged-set model of all feature primitives as well as a feature-modeling tree reordered consistently with a given LOD criterion. As a result, a B-rep solid model for a given LOD can be provided quickly, because the boundary of the model is evaluated without any geometric calculation and extracted from the merged set by selecting the entities contributing to the LOD model shape.