• Title/Summary/Keyword: negative binomial regression model


Bayesian Analysis for the Zero-inflated Regression Models (영과잉 회귀모형에 대한 베이지안 분석)

  • Jang, Hak-Jin;Kang, Yun-Hee;Lee, S.;Kim, Seong-W.
    • The Korean Journal of Applied Statistics / v.21 no.4 / pp.603-613 / 2008
  • Discrete count data often contain a large proportion of zeros. In such cases, it is not appropriate to analyze the data with standard regression models such as the Poisson or negative binomial regression models. In this article, we consider Bayesian analysis for two commonly used alternatives: the zero-inflated Poisson and zero-inflated negative binomial regression models. We use the Bayes factor as a model selection tool, and computation is carried out via Markov chain Monte Carlo methods. Crash count data are analyzed to support the theoretical results.

Traffic Accident Models of 3-Legged Signalized Intersections in the Case of Cheongju (3지 신호교차로의 교통사고 발생모형 - 청주시를 사례로 -)

  • Park, Byung-Ho;Han, Sang-Uk;Kim, Tae-Young
    • Journal of the Korean Society of Safety / v.24 no.2 / pp.94-99 / 2009
  • This study deals with traffic accidents at 3-legged signalized intersections in Cheongju. The goals are to analyze the geometric, traffic, and operational conditions of the intersections and to develop various functional forms that predict accidents. The models are developed through correlation analysis and multiple linear, multiple nonlinear, Poisson, and negative binomial regression analysis. In this study, two multiple linear, two multiple nonlinear, and two negative binomial regression models were calibrated. All of these models were found to be statistically significant. All the models include 2 common variables (traffic volume and lane width) along with model-specific variables. These variables are therefore evaluated to be critical to accident reduction in Cheongju.

A Zero-Inflated Model for Insurance Data (제로팽창 모형을 이용한 보험데이터 분석)

  • Choi, Jong-Hoo;Ko, In-Mi;Cheon, Soo-Young
    • The Korean Journal of Applied Statistics / v.24 no.3 / pp.485-494 / 2011
  • Data whose observations take only non-negative integer values, such as the numbers of car accidents, earthquakes, or insurance coverage, are called count data. In general, the Poisson regression model has been used to model count data; however, this model has the weakness that it is restricted by the equality of the mean and the variance. In practice, count data are often too dispersed for the Poisson model because the variance of the data is significantly larger than the mean due to heterogeneity within groups. When overdispersion is not taken into account, the resulting parameter estimates or standard errors are expected to be inefficient. Since coverage is the main issue for insurance, some accidents may not be covered by insurance, and the number covered by insurance may be zero. This paper considers the zero-inflated model for count data that include many zeros. The performance of this model is investigated using real data with overdispersion and many zeros. The results indicate that the zero-inflated negative binomial regression model performs best in model evaluation.
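
  The contrast between equidispersion and excess zeros described in this abstract can be illustrated with a minimal sketch. It assumes the standard textbook zero-inflated Poisson (ZIP) form, in which a structural zero occurs with probability `pi` and a Poisson(`mu`) count otherwise; the function names and the parameter values are hypothetical and are not taken from the paper.

```python
import math

def poisson_pmf(k, mu):
    """Poisson probability of observing count k with mean mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)

def zip_pmf(k, mu, pi):
    """Zero-inflated Poisson: structural zero with probability pi,
    otherwise an ordinary Poisson(mu) draw (assumed textbook form)."""
    base = (1.0 - pi) * poisson_pmf(k, mu)
    return pi + base if k == 0 else base

def zip_mean_var(mu, pi):
    """Closed-form mean and variance of the ZIP distribution."""
    mean = (1.0 - pi) * mu
    var = (1.0 - pi) * mu * (1.0 + pi * mu)
    return mean, var

# Hypothetical values chosen only for illustration.
mu, pi = 2.0, 0.3
mean, var = zip_mean_var(mu, pi)
print("P(Y=0) Poisson:", poisson_pmf(0, mu))
print("P(Y=0) ZIP:    ", zip_pmf(0, mu, pi))
print("ZIP mean, variance:", mean, var)
```

  Under these assumptions the ZIP variance exceeds its mean whenever `pi > 0`, so excess zeros alone already break the Poisson mean-variance equality; a zero-inflated negative binomial model adds a separate dispersion parameter on top of that.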

Analysis of Accident Characteristics and Development of Accident Models in the Signalized Intersections of Cheongju and Cheongwon (지방부 신호교차로 사고특성분석 및 모형개발 (청주.청원을 중심으로))

  • Park, Byung-Ho;Yoo, Doo-Seon;Yang, Jeong-Mo;Lee, Young-Min
    • Journal of Korean Society of Transportation / v.26 no.2 / pp.35-46 / 2008
  • The purposes of this study are to analyze the characteristics of traffic accidents and to develop accident models. To this end, the study focuses on developing models (multiple linear, Poisson, and negative binomial regression) using data from signalized intersections in Cheongju and Cheongwon. The main results are as follows. First, the accident characteristics of the rural area were defined by factor. Second, 4 accident models, all statistically significant, were developed. Finally, variables such as $X_2$ and $X_{11}$ were evaluated to be specific variables that reflect the characteristics of the rural area.

Modeling clustered count data with discrete weibull regression model

  • Yoo, Hanna
    • Communications for Statistical Applications and Methods / v.29 no.4 / pp.413-420 / 2022
  • In this study, we adapt the discrete Weibull regression model to clustered count data. The discrete Weibull regression model has the attractive feature that it can handle both underdispersed and overdispersed data. We analyzed the eighth Korean National Health and Nutrition Examination Survey (KNHANES VIII) from 2019 to assess the factors influencing 1-month outpatient stays in 17 different regions. We compared the results of the clustered discrete Weibull regression model with those of the Poisson, negative binomial, generalized Poisson, and Conway-Maxwell Poisson regression models, which are widely used in count data analyses. The results show that the clustered discrete Weibull regression model with a random intercept gives the best fit. A simulation study is also conducted to investigate the performance of the clustered discrete Weibull model under various dispersion settings and zero-inflation probabilities. This paper shows that using a random effect with discrete Weibull regression can flexibly model count data with various levels of dispersion without the risk of making wrong assumptions about the data dispersion.
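
  The claim that a discrete Weibull distribution covers both under- and overdispersion can be checked with a small sketch. It assumes the type I discrete Weibull form, P(Y = y) = q^(y^beta) - q^((y+1)^beta) for y = 0, 1, 2, ..., which the abstract does not state explicitly; the function names, the truncation point, and the parameter values are hypothetical.

```python
def dw_pmf(y, q, beta):
    """Type I discrete Weibull pmf (assumed form): 0 < q < 1, beta > 0."""
    return q ** (y ** beta) - q ** ((y + 1) ** beta)

def dw_dispersion(q, beta, upper=2000):
    """Variance-to-mean ratio, computed by summing the pmf up to `upper`."""
    mean = sum(y * dw_pmf(y, q, beta) for y in range(upper))
    second = sum(y * y * dw_pmf(y, q, beta) for y in range(upper))
    return (second - mean ** 2) / mean

# Hypothetical parameter values chosen only for illustration.
print("beta = 1.0:", dw_dispersion(0.5, 1.0))  # geometric case, ratio above 1
print("beta = 3.0:", dw_dispersion(0.9, 3.0))  # ratio below 1
```

  With `beta = 1` the distribution reduces to the geometric distribution, which is overdispersed, while a larger `beta` concentrates the mass and pushes the variance-to-mean ratio below 1.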

Effects on Regression Estimates under Misspecified Generalized Linear Mixed Models for Counts Data

  • Jeong, Kwang Mo
    • The Korean Journal of Applied Statistics / v.25 no.6 / pp.1037-1047 / 2012
  • The generalized linear mixed model (GLMM) is widely used in fitting categorical responses of clustered data. In the numerical approximation of the likelihood function, normality is assumed for the random effects distribution; accordingly, commercial statistical packages also routinely fit GLMMs under this normality assumption. We may also encounter departures from the distributional assumption on the response variable. It would be interesting to investigate the impact of distributional misspecification on the parameter estimates; however, there has been limited research on these topics. We study the sensitivity or robustness of the maximum likelihood estimators (MLEs) of the GLMM for count data when the true underlying random effects distribution is normal, gamma, exponential, or a mixture of two normal distributions. We also consider the effects on the MLEs when we fit a Poisson-normal GLMM while the outcomes are generated from the negative binomial distribution with overdispersion. Through a small-scale Monte Carlo study, we check the empirical coverage probabilities of the parameters and the biases of the MLEs of the GLMM.

Relationship between Interstate Highway Accidents and Heterogeneous Geometrics by Random Parameter Negative Binomial Model - A case of Interstate Highway in Washington State, USA (확률적 모수를 고려한 음이항모형에 의한 교통사고와 기하구조와의 관계 - 미국 워싱턴 주(州) 고속도로를 중심으로)

  • Park, Minho
    • KSCE Journal of Civil and Environmental Engineering Research / v.33 no.6 / pp.2437-2445 / 2013
  • The objective of this study is to find the relationship between interstate highway accident frequencies and geometrics using a random parameter negative binomial model. Even though it is impossible in reality to apply the same design criteria to all segments or corridors of a road, previous research estimated fixed coefficient values without considering each segment's characteristics. The drawback of the traditional negative binomial model is that it cannot explain variation over time or the distinct characteristics of specific segments. This results in underestimation of the standard errors, which inflates the t-values and ultimately affects the model estimation. Therefore, this study tries to find the relationship between accident frequencies and heterogeneous geometrics using 9 years of data from 7 interstate highways in Washington State. 16 types of geometrics are used to derive the model, which is compared with the traditional negative binomial model to determine which model is more suitable. In addition, by calculating marginal effects and elasticities, the effects of the heterogeneous variables on accidents are estimated. Hopefully, this study will help to establish future policy on geometrics.
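
  The marginal effects and elasticities mentioned in this abstract can be sketched as follows. The abstract gives no formulas, so this assumes the usual log-link count model, E[y | x] = exp(intercept + x·beta), for which the marginal effect of variable j is beta_j · E[y | x] and the elasticity is beta_j · x_j; the function names, coefficients, and covariate values are hypothetical.

```python
import math

def expected_count(x, beta, intercept=0.0):
    """Expected accident count under a log link (assumed model form)."""
    return math.exp(intercept + sum(b * xi for b, xi in zip(beta, x)))

def marginal_effect(x, beta, j, intercept=0.0):
    """Change in the expected count per unit change in x[j]."""
    return beta[j] * expected_count(x, beta, intercept)

def elasticity(x, beta, j):
    """Percent change in the expected count per 1% change in x[j]."""
    return beta[j] * x[j]

# Hypothetical coefficients and covariates, for illustration only.
beta = [0.3, -1.2]
x = [2.0, 0.5]
for j in range(len(beta)):
    print(j, marginal_effect(x, beta, j, intercept=0.1), elasticity(x, beta, j))
```

  In a random parameter model the coefficients vary across segments, so these quantities would be evaluated per segment or averaged over the estimated coefficient distribution rather than computed once from fixed values.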

The Effects of Collaborative R&D Activity on Product and Process Innovation: A Negative Binomial Modeling Approach (기업의 공동연구개발활동이 제품혁신 및 공정혁신에 미치는 영향 - 음이항회귀모형을 활용하여 -)

  • Kim, Chanyong;Choi, Ye Seul;Lim, Up
    • Journal of the Korean Regional Science Association / v.31 no.4 / pp.107-128 / 2015
  • Technological innovation is a competitive weapon for sustainable economic growth at the urban and regional level and for the growth of firms. In this study, we empirically investigate the effects of collaborative R&D activity on product innovation outputs and process innovation outputs in manufacturing firms in Korea. We analyze the links between collaborative R&D activity and the two types of innovation outputs using an alternative negative binomial regression model. The major finding is that collaborative R&D activity has significant positive effects on both product and process innovation. The results also identify a positive link between all types of innovation outputs and other R&D activities, including internal R&D activity, patent activity, and external technology and capital goods acquisitions. To induce corporate growth that enhances the productivity of individual firms and produces prolonged economic growth, policy makers should place greater emphasis on creating effective arrangements that promote collaborative R&D strategies for manufacturing firms.

Accident Models of Circular Intersections in Korea (국내 원형교차로 사고모형)

  • Lee, Seung Ju;Park, Min Kyu;Park, Byung Ho
    • Journal of the Korean Society of Safety / v.29 no.1 / pp.54-58 / 2014
  • This study deals with accidents at circular intersections in Korea. The goal is to develop accident models for 94 circular intersections. To this end, the study focuses on collecting data on geometric structure and accidents and on comparatively analyzing models such as Poisson regression, negative binomial (NB) regression, and multiple regression using SPSS 17.0 and LIMDEP 3.0. The main results are as follows. First, the negative binomial model was found to be the most appropriate among the various models. Second, 3 independent variables were adopted in the model, and these variables were found to have a positive relation to the accident rate. Finally, a reduced width of the circulatory roadway, removal of parking lots within the circulatory roadway, and an appropriate number of approach lanes are required to improve the safety of circular intersections.

A Study on Shipments of Swimming Crab Using Negative Binomial Regression Model (음이항회귀모형을 이용한 꽃게 출하량에 관한 연구)

  • Nam, Yeongeun;Seo, Jihyun;Choi, Gayeong;Lee, Kyeongjun
    • Journal of the Korean Data Analysis Society / v.20 no.6 / pp.2941-2951 / 2018
  • The purpose of this paper is to analyze the effect of ocean weather factors on shipments of swimming crab. We use data from the data portal and ocean weather factors (mean wind velocity, mean atmospheric pressure, mean relative humidity, mean air temperature, mean water temperature, mean maximum wave height, mean significant wave height, maximum significant wave height, maximum wave height, mean wave period, and maximum wave period). We performed the statistical analysis using Poisson regression and negative binomial regression. The results show that the important factors influencing shipments of swimming crab are mean wind velocity, mean atmospheric pressure, mean relative humidity, mean water temperature, maximum wave height, mean wave period, and maximum wave period. Shipments of swimming crab increase as mean wind velocity, mean atmospheric pressure, mean relative humidity, mean water temperature, or mean wave period increases. Conversely, shipments increase as maximum wave height or maximum wave period decreases.