• Title/Summary/Keyword: Regression Tree

Search Result 689

An application to Zero-Inflated Poisson Regression Model

  • Kim, Kyung-Moo
    • Journal of the Korean Data and Information Science Society
    • /
    • v.14 no.1
    • /
    • pp.45-53
    • /
    • 2003
  • Zero-inflated Poisson regression is a model for count data with excess zeros. When the response variable has excess zeros, the ordinary Poisson regression model is difficult to apply. In this paper, we study and simulate the zero-inflated Poisson regression model and apply it to a real example. Regression parameters are estimated by maximum likelihood. We also compare the fit of the zero-inflated Poisson model with that of the Poisson regression and decision tree models.

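As an illustration of the zero-inflated Poisson model named in the abstract above, the following is a minimal sketch of its probability mass function and log-likelihood. It assumes a constant rate `lam` and a constant zero-inflation probability `pi` (in a regression setting both would depend on covariates); the function names and the sample counts are hypothetical and are not taken from the paper.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson with rate lam and extra-zero probability pi."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero can come from the inflation component or from the Poisson component.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_log_likelihood(counts, lam, pi):
    """Log-likelihood of observed counts; maximum likelihood estimation maximises this."""
    return sum(math.log(zip_pmf(k, lam, pi)) for k in counts)


# Hypothetical count data with many zeros (illustration only).
counts = [0, 0, 0, 1, 0, 2, 0, 3, 0, 1]
print(zip_log_likelihood(counts, lam=1.5, pi=0.4))
```

Setting `pi = 0` recovers the ordinary Poisson model, which is one way to compare the fit of the two models on the same data.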

Estimation of fruit number of apple tree based on YOLOv5 and regression model (YOLOv5 및 다항 회귀 모델을 활용한 사과나무의 착과량 예측 방법)

  • Hee-Jin Gwak;Yunju Jeong;Ik-Jo Chun;Cheol-Hee Lee
    • Journal of IKEEE
    • /
    • v.28 no.2
    • /
    • pp.150-157
    • /
    • 2024
  • In this paper, we propose a novel algorithm for predicting the number of apples on an apple tree using a deep learning-based object detection model and a polynomial regression model. Measuring the number of apples on an apple tree can be used to predict apple yield and to assess losses for determining agricultural disaster insurance payouts. To measure apple fruit load, we photographed the front and back sides of apple trees. We manually labeled the apples in the captured images to construct a dataset, which was then used to train a one-stage object detection CNN model. However, when apples on an apple tree are obscured by leaves, branches, or other parts of the tree, they may not be captured in images. Consequently, it becomes difficult for image recognition-based deep learning models to detect or infer the presence of these apples. To address this issue, we propose a two-stage inference process. In the first stage, we utilize an image-based deep learning model to count the number of apples in photos taken from both sides of the apple tree. In the second stage, we conduct a polynomial regression analysis, using the total apple count from the deep learning model as the independent variable, and the actual number of apples manually counted during an on-site visit to the orchard as the dependent variable. The performance evaluation of the two-stage inference system proposed in this paper showed an average accuracy of 90.98% in counting the number of apples on each apple tree. Therefore, the proposed method can significantly reduce the time and cost associated with manually counting apples. Furthermore, this approach has the potential to be widely adopted as a new foundational technology for fruit load estimation in related fields using deep learning.
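The second stage described in the abstract above (regressing the manually counted number of apples on the count produced by the detector) can be sketched as a least-squares polynomial fit. This is a minimal sketch under stated assumptions: the abstract does not give the polynomial degree, so degree 2 is assumed here, and the function names and the sample counts are hypothetical.

```python
def fit_polynomial(xs, ys, degree=2):
    """Least-squares fit; returns coefficients c with y ~ c[0] + c[1]*x + c[2]*x^2 + ..."""
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum(x^(i+j)) and b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


# Hypothetical data: detector count (both sides summed) vs. on-site manual count.
detected = [40.0, 55.0, 63.0, 80.0, 95.0]
actual = [52.0, 70.0, 79.0, 104.0, 121.0]
coeffs = fit_polynomial(detected, actual, degree=2)
print(predict(coeffs, 70.0))
```

The regression step corrects for apples hidden by leaves and branches, which the image-based count systematically misses.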

Relation between the Shade Hours and the Landscape Tree Growth in the Apartment Housing Areas (공동주택단지내 조경수목의 생장과 피음시간과의 관계)

  • 윤근영;안건용
    • Korean Journal of Environment and Ecology
    • /
    • v.10 no.1
    • /
    • pp.49-57
    • /
    • 1996
  • To determine the relation between shade hours and landscape tree growth in apartment housing areas, the present sizes and planting positions of 4 tree species in apartment housing areas of Gwacheon-si were surveyed. Shade hours were then calculated, and the data were analyzed by simple linear regression. Overall, $R^{2}$ was too low to generalize the regression equation. It was therefore presumed that shade hours weighed less on landscape tree growth at this sample site than other environmental factors. However, the characteristics of shade-intolerant and shade-tolerant trees appeared to emerge: Pinus strobus showed a low negative correlation with shade hours, while Acer palmatum and Magnolia denudata generally showed a low positive correlation with shade hours. The statistically significant cases were the diameter at root collar and tree width of Acer palmatum and the tree width of Magnolia denudata, each with a low correlation coefficient of less than 0.4 with shade hours.


Forecasting Sow's Productivity using the Machine Learning Models (머신러닝을 활용한 모돈의 생산성 예측모델)

  • Lee, Min-Soo;Choe, Young-Chan
    • Journal of Agricultural Extension & Community Development
    • /
    • v.16 no.4
    • /
    • pp.939-965
    • /
    • 2009
  • Machine learning has been identified as a promising approach to knowledge-based system development. This study examines the ability of machine learning techniques to support farmers' decision making and develops a reference model for using pig farm data. We compared five machine learning techniques: logistic regression, decision tree, artificial neural network, k-nearest neighbor, and ensemble. All models performed well in predicting sow productivity at every parity, with predictability above 87.6%. Predictability for total litter size was highest, at 91.3%, in the third parity and decreased as parity increased. The ensemble performed well in predicting sow productivity. The neural network and logistic regression were excellent classifiers at every parity, whereas the decision tree and k-nearest neighbor were not good classifiers across parities. Performance varied across the models, with up to a 104% difference in lift values. The artificial neural network and ensemble models produced the highest lift values, implying the best performance among the models.

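The abstract above compares models by lift values without defining them. A minimal sketch of one common definition follows: the response rate among the top-scored fraction of cases divided by the overall response rate. The function name, the 20% cut-off, and the scores and labels are hypothetical; the paper may compute lift differently.

```python
def lift_at(scores, labels, fraction=0.1):
    """Lift in the top `fraction` of cases ranked by score (labels are 0 or 1)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:top_n]) / top_n
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate


# Hypothetical classifier scores and true outcomes (illustration only).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
print(lift_at(scores, labels, fraction=0.2))
```

A lift of 1 means the model ranks cases no better than random selection; larger values mean the top-ranked cases are enriched in positives.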

"Pool-the-Maximum-Violators" Algorithm

  • Kikuo Yanagi;Akio Kudo;Park, Yong-Beom
    • Journal of the Korean Statistical Society
    • /
    • v.21 no.2
    • /
    • pp.201-207
    • /
    • 1992
  • The algorithm for obtaining the isotonic regression in simple tree order, the most basic and simplest model next to the simple order, is considered. We propose to call it the "Pool-the-Maximum-Violators" algorithm (PMVA), by analogy with the "Pool-Adjacent-Violators" algorithm (PAVA) in the simple order. The dual problem of obtaining the isotonic regression in simple tree order is our main concern. An intuitively appealing relation between the primal and the dual problems is demonstrated. The interesting difference is that in simple order the required number of poolings is at least the number of initial violating pairs and any path leads to the solution, whereas in simple tree order it is at most the number of initial violators and there is only one advisable path, although there may be others leading to the same solution.

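To make the two orderings in the abstract above concrete, here is a minimal sketch of unweighted isotonic regression under the simple order (the standard pool-adjacent-violators scheme) and under the simple tree order (a root constrained to be no larger than each leaf). The tree-order routine pools the root with the most violating leaf first, which follows the idea suggested by the algorithm's name; it is an assumption that this matches the paper's PMVA in detail, and the function names are hypothetical.

```python
def pava(y):
    """Nondecreasing least-squares fit to y (simple order), equal weights."""
    blocks = []  # each block is [sum, count]
    for value in y:
        blocks.append([value, 1])
        # Pool adjacent blocks while their means violate the ordering.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


def simple_tree_isotonic(root, leaves):
    """Least-squares fit with fitted root <= every fitted leaf, equal weights."""
    total, count = root, 1
    pooled = set()
    # Visit leaves from smallest to largest: the smallest leaf below the
    # current pooled mean is the largest violator.
    for i in sorted(range(len(leaves)), key=lambda i: leaves[i]):
        if leaves[i] < total / count:
            total += leaves[i]
            count += 1
            pooled.add(i)
        else:
            break
    mean = total / count
    return mean, [mean if i in pooled else leaves[i] for i in range(len(leaves))]


print(pava([1, 3, 2, 4]))
print(simple_tree_isotonic(5, [1, 6, 4]))
```

In the simple order any sequence of adjacent poolings reaches the same fit, while in the tree order the sketch follows a single path by always pooling the largest violator next.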

IMPERVIOUS SURFACE ESTIMATION USING REMOTE SENSING IMAGES AND TREE REGRESSION

  • Kim, Soo-Young;Kim, Jong-Hong;Heo, Joon;Heo, Jun-Haeng
    • Proceedings of the KSRS Conference
    • /
    • v.1
    • /
    • pp.239-242
    • /
    • 2006
  • Impervious surface is an important index for estimating urbanization and environmental change. In addition, impervious surface influences the parameters of rainfall-runoff models during the rainy season. An increase in impervious surface raises peak discharge and shortens the time of concentration in urban areas. Accordingly, impervious surface estimation is an important factor in developing and calibrating urban rainfall-runoff models. In this study, impervious surface is estimated using remote sensing images, namely Landsat-7 ETM+ and high-resolution satellite imagery, together with a regression tree algorithm, for a case study area: the Jungnang-cheon basin in Korea.

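The regression tree algorithm mentioned in the abstract above builds its tree from repeated binary splits. A minimal sketch of the core step, choosing the single threshold on one predictor that minimises the summed squared error of the two resulting groups, is shown below. The function names, the single hypothetical predictor, and the sample values are assumptions for illustration; the paper's actual predictors and software are not stated in the abstract.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, total_sse) of the best single split on x, or (None, sse(ys))."""
    pairs = sorted(zip(xs, ys))
    best = (None, sse(ys))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (threshold, total)
    return best


# Hypothetical pixels: one spectral predictor vs. percent impervious surface.
band = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
impervious_pct = [10.0, 12.0, 11.0, 80.0, 85.0, 82.0]
print(best_split(band, impervious_pct))
```

A full regression tree applies this search over every predictor and recurses on each side of the chosen split until a stopping rule is met; each leaf then predicts the mean of its training values.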

Development of Allometry and Individual Basal Area Growth Model for Major Species in Korea (우리나라 주요수종의 Allometry와 개체목 흉고단면적 생장모델 개발)

  • Choi, Jung-Kee
    • Journal of Forest and Environmental Science
    • /
    • v.27 no.1
    • /
    • pp.47-54
    • /
    • 2011
  • Allometry and basal area equations were developed with various tree measurement variables for major species in Korea: Quercus variabilis, Quercus mongolica, Pinus koraiensis, and Larix leptolepis. For the allometry models, the relationships between total height and DBH, crown width and DBH, height to the widest portion of the crown and total height, and height to base of crown and total height were investigated. Multiple regression methods were used to relate annual basal area growth to tree variables of initial size (DBH, total height, and crown width), relative size (relative diameter and relative height), and competition measures (competition index, crown class, and live crown ratio).

Tree-Structure-Aware Genetic Operators in Genetic Programming

  • Seo, Kisung;Pang, Chulhyuk
    • Journal of Electrical Engineering and Technology
    • /
    • v.9 no.2
    • /
    • pp.749-754
    • /
    • 2014
  • In this paper, we suggest tree-structure-aware GP (Genetic Programming) operators that heed tree distributions in structure space and their possible structural difficulties. The main idea of the proposed GP operators is to place the offspring generated by crossover and/or mutation in a specified region of tree structure space, insofar as possible, by biasing the tree structures of the altered subtrees, based on the observation that most solutions are found in that region. To demonstrate the effectiveness of the proposed approach, experiments on the binomial-3 regression, multiplexer, and even-parity problems are performed. The results show that the proposed tree-structure-aware operators outperform standard GP on all three test problems in both success rate and number of evaluations.

Comparison of Data Mining Classification Algorithms for Categorical Feature Variables (범주형 자료에 대한 데이터 마이닝 분류기법 성능 비교)

  • Sohn, So-Young;Shin, Hyung-Won
    • IE interfaces
    • /
    • v.12 no.4
    • /
    • pp.551-556
    • /
    • 1999
  • In this paper, we compare the performance of three data mining classification algorithms (neural network, decision tree, logistic regression) in consideration of various characteristics of categorical input and output data. A $2^{4-1} \cdot 3$ fractional factorial design is used to simulate the comparison, with the following factors: (1) the categorical ratio of input variables, (2) the complexity of the functional relationship between the output and input variables, (3) the size of randomness in the relationship, (4) the categorical ratio of the output variable, and (5) the classification algorithm. The experimental results indicate the following: the decision tree performs better than the others when the relationship between output and input variables is simple, whereas logistic regression is better when the relationship is complex; and the neural network appears to be a better choice than the others when the randomness in the relationship is relatively large. We also use a Taguchi design to improve the practicality of our results by treating the relationship between the output and input variables as a noise factor. As a result, the classification accuracy of the neural network and decision tree turns out to be higher than that of logistic regression when the categorical proportion of the output variable is even.


CORRELATION ANALYSIS BETWEEN FOREST VOLUME, ETM+ BANDS, AND HEIGHT ESTIMATED FROM C-BAND SRTM PRODUCT

  • Kim, Jin-Woo;Kim, Jong-Hong;Lee, Jung-Bin;Heo, Joon
    • Proceedings of the KSRS Conference
    • /
    • v.1
    • /
    • pp.512-515
    • /
    • 2006
  • Forest stand height and volume are important indicators for management purposes as well as for environmental analysis. The Shuttle Radar Topography Mission (SRTM) C-band signal is backscattered by the forest canopy, so a digital surface model (DSM) can be derived from this scattering characteristic, while the National Elevation Dataset (NED) provides bare-earth elevation data. The difference between SRTM and NED is taken as an estimate of tree height, and it is correlated with forest parameters, including average DBH, trees per acre, net BF per acre, and total net MBF. Among them, net board foot (BF) per acre is the index that best represents forest volume. The project site was a Douglas-fir-dominated plantation area in western Washington and northern Oregon in the U.S. This study shows a high correlation between the forest parameters and the products from SRTM, NED, and ETM+. Multiple regression analysis and a regression tree algorithm are applied to obtain an improved relationship among the parameters.

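The height estimate described in the abstract above (surface elevation minus bare-earth elevation) and its correlation with a forest parameter can be sketched as follows. The elevation and volume values are hypothetical and the function name is an assumption; the sketch computes a plain Pearson correlation, which may differ from the paper's exact analysis.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical stand-level values (illustration only).
srtm_elevation = [312.0, 298.5, 405.2, 350.8, 371.1]  # radar surface elevation
ned_elevation = [290.0, 280.0, 378.0, 331.0, 342.0]   # bare-earth elevation
net_bf_per_acre = [18000.0, 15500.0, 24000.0, 17000.0, 26500.0]

# Estimated canopy height is the surface model minus the bare-earth model.
height = [s - n for s, n in zip(srtm_elevation, ned_elevation)]
print(pearson(height, net_bf_per_acre))
```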