• Title/Summary/Keyword: Poisson count data

Bayesian Approaches to Zero Inflated Poisson Model (영 과잉 포아송 모형에 대한 베이지안 방법 연구)

  • Lee, Ji-Ho;Choi, Tae-Ryon;Wo, Yoon-Sung
    • The Korean Journal of Applied Statistics / v.24 no.4 / pp.677-693 / 2011
  • In this paper, we consider Bayesian approaches to the zero-inflated Poisson model, one of the popular models for analyzing zero-inflated count data. To generate posterior samples, we use a Markov chain Monte Carlo method based on a Gibbs sampler and an exact sampling method based on the Inverse Bayes Formula (IBF). The posterior sampling algorithms of the two methods are compared, and convergence checking for the Gibbs sampler is discussed, in particular using posterior samples from IBF sampling. Based on these sampling methods, a real data analysis is performed for the Trajan data (Marin et al., 1993), and our results are compared with existing analyses of the Trajan data. We also discuss model selection for the Trajan data between the Poisson model and the zero-inflated Poisson model using various criteria. In addition, we complement the previous work by Rodrigues (2003) with further data analysis using a hierarchical Bayesian model.
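
As a rough illustration of the Gibbs sampling idea mentioned in the abstract above, the following sketch runs a data-augmentation Gibbs sampler for a zero-inflated Poisson model. It is not the authors' algorithm: the function name `gibbs_zip`, the parameterization (a structural-zero probability `p` and a Poisson mean `lam`), the Beta(a, b) and Gamma(c, rate d) priors, and the toy counts are all assumptions made for this example.

```python
import math
import random


def gibbs_zip(counts, n_iter=3000, burn_in=500, a=1.0, b=1.0, c=1.0, d=1.0, seed=0):
    """Gibbs sampler for a zero-inflated Poisson (ZIP) model (illustrative sketch).

    Assumed model: Y = 0 with probability p (structural zero),
    otherwise Y ~ Poisson(lam).
    Assumed priors: p ~ Beta(a, b), lam ~ Gamma(shape=c, rate=d).
    Returns the list of (p, lam) draws kept after burn-in.
    """
    rng = random.Random(seed)
    n = len(counts)
    n_zero = sum(1 for y in counts if y == 0)
    total = sum(counts)  # zeros add nothing, so this sum never changes
    p, lam = 0.5, max(total / n, 0.1)  # arbitrary starting values
    draws = []
    for it in range(n_iter):
        # 1) Latent step: each observed zero is structural with probability q.
        q = p / (p + (1.0 - p) * math.exp(-lam))
        s = sum(1 for _ in range(n_zero) if rng.random() < q)
        # 2) p | s ~ Beta(a + s, b + n - s)
        p = rng.betavariate(a + s, b + n - s)
        # 3) lam | s, y ~ Gamma(c + sum(y), rate = d + n - s);
        #    random.gammavariate takes a scale, so pass 1 / rate.
        lam = rng.gammavariate(c + total, 1.0 / (d + n - s))
        if it >= burn_in:
            draws.append((p, lam))
    return draws


# Toy data (made up for illustration): many zeros plus Poisson-like counts.
example_counts = [0] * 64 + [1] * 27 + [2] * 27 + [3] * 18 + [4] * 9 + [5] * 4 + [6]
example_draws = gibbs_zip(example_counts)
p_hat = sum(p for p, _ in example_draws) / len(example_draws)
lam_hat = sum(lam for _, lam in example_draws) / len(example_draws)
print("posterior mean of p:", round(p_hat, 3))
print("posterior mean of lam:", round(lam_hat, 3))
```

Because both full conditionals are conjugate under these assumed priors, each sweep needs only Beta and Gamma draws from the standard library.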

Bayesian Analysis for the Zero-inflated Regression Models (영과잉 회귀모형에 대한 베이지안 분석)

  • Jang, Hak-Jin;Kang, Yun-Hee;Lee, S.;Kim, Seong-W.
    • The Korean Journal of Applied Statistics / v.21 no.4 / pp.603-613 / 2008
  • We often encounter situations in which discrete count data contain a large proportion of zeros. In this case, it is not appropriate to analyze the data with standard regression models such as the Poisson or negative binomial regression models. In this article, we consider Bayesian analysis for two commonly used models: the zero-inflated Poisson and zero-inflated negative binomial regression models. We use the Bayes factor as a model selection tool, and computation is carried out via Markov chain Monte Carlo methods. Crash count data are analyzed to support the theoretical results.

A new sample selection model for overdispersed count data (과대산포 가산자료의 새로운 표본선택모형)

  • Jo, Sung Eun;Zhao, Jun;Kim, Hyoung-Moon
    • The Korean Journal of Applied Statistics / v.31 no.6 / pp.733-749 / 2018
  • Sample selection arises as a result of the partial observability of the outcome of interest in a study. Heckman introduced a sample selection model to analyze such data and proposed a full maximum likelihood estimation method under the assumption of normality. Recently, sample selection models for binomial and Poisson response variables have been proposed. Based on the theory of symmetry-modulated distributions, we extend these to a model for overdispersed count data. This type of data, when there is no sample selection, is often modeled using the negative binomial distribution. Hence we propose a sample selection model for overdispersed count data using the negative binomial distribution. A real data application is presented. Simulation studies reveal that our estimation method based on the profile log-likelihood is stable.

Overdispersion in count data - a review (가산자료(count data)의 과산포 검색: 일반화 과정)

  • 김병수;오경주;박철용
    • The Korean Journal of Applied Statistics / v.8 no.2 / pp.147-161 / 1995
  • The primary objective of this paper is to review parametric models and test statistics related to overdispersion of count data. The Poisson or binomial assumption often fails to explain overdispersion. We reviewed real examples of overdispersion in count data that occurred in toxicological or teratological experiments. We also reviewed several models that were suggested for modeling extra-binomial variation or hyper-Poisson variability, and we noted how these models were generalized and further developed. The approaches that have been suggested for overdispersion fall into two broad categories. One is to develop a parametric model for it, and the other is to assume a particular relationship between the variance and the mean of the response variable and to derive a score test statistic for detecting overdispersion. Recently, Dean (1992) derived a general score test statistic for detecting overdispersion in the exponential family.

The Effects of Dispersion Parameters and Test for Equality of Dispersion Parameters in Zero-Truncated Bivariate Generalized Poisson Models (제로절단된 이변량 일반화 포아송 분포에서 산포모수의 효과 및 산포의 동일성에 대한 검정)

  • Lee, Dong-Hee;Jung, Byoung-Cheol
    • The Korean Journal of Applied Statistics / v.23 no.3 / pp.585-594 / 2010
  • This study investigates the effects of dispersion parameters between two response variables in zero-truncated bivariate generalized Poisson distributions. A Monte Carlo study shows that the zero-truncated bivariate Poisson and negative binomial models fit poorly when the zero-truncated bivariate count data have heterogeneous dispersion parameters on the dependent variables. In addition, we derive the score test for the equality of the dispersion parameters and compare its efficiency with that of the likelihood ratio test.

Analysis of counts in the one-way layout (일원배열 가산자료에서의 처리효과 비교)

  • 이선호
    • The Korean Journal of Applied Statistics / v.10 no.1 / pp.105-119 / 1997
  • Barnwal and Paul (1988) derived the likelihood ratio statistic and the $C(\alpha)$ statistic for testing the equality of the means of several groups of count data in the presence of a common dispersion parameter. These tests are generalized so that they apply without the restriction of a common dispersion parameter. The assumed model for the data is also extended from the negative binomial to the double exponential Poisson model. Monte Carlo simulations show the superiority of the $C(\alpha)$ statistic based on the double exponential Poisson family, which has a very simple form and requires estimates of the parameters only under the null hypothesis.

The study on the determinants of the number of job changes (중소기업 청년인턴 이직횟수 결정요인 분석)

  • Park, Sungik;Ryu, Jangsoo;Kim, Jonghan;Cho, Jangsik
    • Journal of the Korean Data and Information Science Society / v.26 no.2 / pp.387-397 / 2015
  • In this paper, the determinants of the number of job changes in the SMEs (small and medium enterprises) youth-intern project are analysed using the SMEs youth-intern DB and the employment insurance DB. Since the number of job changes is count data that takes only non-negative integer values, general linear regression analysis is inappropriate. Therefore, four models (the Poisson regression model, the zero-inflated Poisson regression model, the negative binomial regression model, and the zero-inflated negative binomial regression model) are fitted to the count data. The zero-inflated negative binomial regression model is selected as the best model. The major results are as follows. First, the number of job changes is significantly smaller in the treatment group than in the control group. Second, the number of job changes is significantly smaller in the young-age group than in the old-age group. Third, the number of job changes of men is significantly greater than that of women. Lastly, the number of job changes in bigger firms is significantly smaller than in smaller firms.

A Zero-Inflated Model for Insurance Data (제로팽창 모형을 이용한 보험데이터 분석)

  • Choi, Jong-Hoo;Ko, In-Mi;Cheon, Soo-Young
    • The Korean Journal of Applied Statistics / v.24 no.3 / pp.485-494 / 2011
  • Observations that can take only non-negative integer values, such as the numbers of car accidents, earthquakes, or insurance coverages, are called count data. In general, the Poisson regression model has been used to model such count data; however, this model has a weakness in that it is restricted by the equality of the mean and the variance. In practice, count data often tend to be too dispersed to allow the use of the Poisson model because the variance of the data is significantly larger than the mean due to heterogeneity within groups. When overdispersion is not taken into account, the resulting parameter estimates or standard errors are expected to be inefficient. Since coverage is the main issue for insurance, some accidents may not be covered by insurance, and the number covered by insurance may be zero. This paper considers the zero-inflated model for count data with many zeros. The performance of this model has been investigated using real data with overdispersion and many zeros. The results indicate that the zero-inflated negative binomial regression model performs best in the model evaluation.
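
The mean-variance restriction described in the abstract above can be checked informally before choosing a model. The sketch below compares the sample mean and sample variance of a set of counts; a ratio well above 1 hints at overdispersion relative to the Poisson assumption. The function name `dispersion_index` and the toy data are assumptions for illustration, and the ratio is only a descriptive diagnostic, not a formal test.

```python
import statistics


def dispersion_index(counts):
    """Return (mean, sample variance, variance / mean) for a list of counts.

    Under a Poisson model the mean and variance are equal, so the
    ratio should be close to 1; larger values suggest overdispersion.
    """
    mean = statistics.mean(counts)
    if mean == 0:
        raise ValueError("all counts are zero; the ratio is undefined")
    var = statistics.variance(counts)  # unbiased sample variance (divides by n - 1)
    return mean, var, var / mean


# Toy counts (made up for illustration): many zeros and a few large values.
toy_counts = [0, 0, 0, 0, 1, 2, 8, 9]
mean, var, ratio = dispersion_index(toy_counts)
print("mean:", mean, "variance:", round(var, 3), "ratio:", round(ratio, 3))
```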

Ex-ante and Ex-post Economic Value Analysis on Ecological River Restoration Project (생태하천복원사업 전후 경제적 가치 비교분석)

  • Lee, Yoon;Chang, Hoon;Yoon, Taeyeon;Chung, Young-Keun;Park, Heeyoung
    • Journal of the Korean Regional Science Association / v.31 no.3 / pp.39-54 / 2015
  • To assess the economic value of the Cheonggyecheon river restoration project, in-depth exit survey data were collected and the travel cost method was applied in this study. Because the data are counts, the Poisson, negative binomial, zero-truncated Poisson, and zero-truncated negative binomial models were estimated. Empirical results showed that the regressors were statistically significant and consistent with general consumer theory. Since our survey data showed over-dispersion, the zero-truncated negative binomial model was selected, by a goodness-of-fit test among the aforementioned models, as the optimal model for analyzing travel demand for Cheonggyecheon. To estimate the economic value of the Cheonggyecheon river restoration project, which is known as an ecological river restoration project, we used the annual visits of individual travelers and the optimal model. The annual economic value of the Cheonggyecheon river restoration project was estimated at 193.4 billion won in 2013.

Fit of the number of insurance solicitor's turnovers using zero-inflated negative binomial regression (영과잉 음이항회귀 모형을 이용한 보험설계사들의 이직횟수 적합)

  • Chun, Heuiju
    • Journal of the Korean Data and Information Science Society / v.28 no.5 / pp.1087-1097 / 2017
  • This study aims to find the best model for the number of insurance solicitors' turnovers at life insurance companies using count data regression models such as Poisson regression, negative binomial regression, zero-inflated Poisson regression, and zero-inflated negative binomial regression. Of the four models, the zero-inflated negative binomial model was selected based on the AIC and SBC criteria, which is due to over-dispersion and a high proportion of zero counts. The significant factors affecting insurance solicitors' turnover were found to be the work period in the current company, the total work period as a financial planner, the affiliated corporation, and channel management satisfaction. We also found that as job satisfaction or channel management satisfaction gets lower, the number of turnovers increases. In addition, the total work period as a financial planner has a positive relationship with the number of turnovers, whereas the work period in the current company has a negative relationship with it.
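
Model selection by AIC and SBC (the Schwarz criterion, also known as BIC), as described in the abstract above, can be sketched on a simpler pair of models. The example below compares an intercept-only Poisson model with an intercept-only zero-inflated Poisson model fitted by EM; it does not reproduce the regression models or data of the study. All function names and the toy counts are assumptions for illustration, and the code assumes the data contain at least one positive count.

```python
import math


def poisson_loglik(counts, lam):
    """Log-likelihood of i.i.d. Poisson(lam) counts."""
    return sum(-lam + y * math.log(lam) - math.lgamma(y + 1) for y in counts)


def zip_loglik(counts, p, lam):
    """Log-likelihood of a ZIP model: structural zero w.p. p, else Poisson(lam)."""
    ll = 0.0
    for y in counts:
        if y == 0:
            ll += math.log(p + (1.0 - p) * math.exp(-lam))
        else:
            ll += math.log(1.0 - p) - lam + y * math.log(lam) - math.lgamma(y + 1)
    return ll


def fit_zip_em(counts, n_iter=500):
    """Maximum likelihood for the ZIP model via a basic EM iteration."""
    n = len(counts)
    n_zero = sum(1 for y in counts if y == 0)
    total = sum(counts)
    p, lam = 0.5, total / n  # starting values
    for _ in range(n_iter):
        # E-step: expected number of structural zeros among the observed zeros.
        s = n_zero * p / (p + (1.0 - p) * math.exp(-lam))
        # M-step: update the mixing weight and the Poisson mean.
        p = s / n
        lam = total / (n - s)
    return p, lam


def information_criteria(loglik, k, n):
    """Return (AIC, BIC) for a model with k parameters fitted to n observations."""
    return 2 * k - 2 * loglik, k * math.log(n) - 2 * loglik


# Toy data (made up for illustration) with an excess of zeros.
toy_counts = [0] * 64 + [1] * 27 + [2] * 27 + [3] * 18 + [4] * 9 + [5] * 4 + [6]
n_obs = len(toy_counts)
lam_pois = sum(toy_counts) / n_obs  # the Poisson MLE is the sample mean
p_zip, lam_zip = fit_zip_em(toy_counts)
print("Poisson (AIC, BIC):", information_criteria(poisson_loglik(toy_counts, lam_pois), 1, n_obs))
print("ZIP     (AIC, BIC):", information_criteria(zip_loglik(toy_counts, p_zip, lam_zip), 2, n_obs))
```

The model with the smaller criterion value is preferred; BIC penalizes the extra parameter more heavily than AIC once the sample has more than about seven observations.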