• Title/Abstract/Keywords: overdispersed data

Search results: 14 items (processing time 0.022 s)

Estimating Parameters in Overdispersed Binary Data

  • Lee, Sunho
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 7, No. 1
    • /
    • pp.269-276
    • /
    • 2000
  • There are several methods available for estimating parameters in overdispersed binary response data with the litter effect. Simulations are performed to compare methods for estimating an overall mean and an overdispersion parameter using moments, maximum likelihood under a beta-binomial distribution, maximum quasi-likelihood, and maximum extended quasi-likelihood.

A new sample selection model for overdispersed count data

  • 조성은;조준;김형문
    • 응용통계연구
    • /
    • Vol. 31, No. 6
    • /
    • pp.733-749
    • /
    • 2018
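  • A sample selection problem arises when the observations of interest in a study are only partially observable. To analyze such data, Heckman developed a sample selection model and estimated its parameters by maximum likelihood under the assumption of a bivariate normal distribution. Recently, sample selection models for binomial and Poisson data have been proposed. We extend these, on the basis of distribution adjustment, to a model for overdispersed data. Overdispersed data without sample selection are commonly analyzed with the negative binomial distribution. We therefore propose a new model for overdispersed data that uses the negative binomial distribution and introduces distribution adjustment. We analyze a real data set. Simulation results show that the parameter estimates obtained using the profile likelihood function are stable.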

Modelling Count Responses with Overdispersion

  • Jeong, Kwang Mo
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 19, No. 6
    • /
    • pp.761-770
    • /
    • 2012
  • We frequently encounter count outcomes that exhibit extra variation. This paper considers several alternative models for overdispersed count responses, such as a quasi-Poisson model, a zero-inflated Poisson model and a negative binomial model, with a special focus on a generalized linear mixed model. We also explain various goodness-of-fit criteria, discussing when each is appropriate and cautioning against misuse depending on the pattern of response categories. The overdispersion models for count data are illustrated through two examples with different response patterns.
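
As a rough illustration of the "extra variation" that the abstract above refers to, the following is a minimal sketch of the Pearson dispersion estimate that a quasi-Poisson model commonly uses to scale the Poisson variance. It is not taken from the paper: the function name `pearson_dispersion`, the intercept-only fit, and the counts are all assumptions made for illustration.

```python
import numpy as np

def pearson_dispersion(y, mu, n_params):
    """Pearson chi-square statistic divided by the residual degrees of freedom.

    Under a Poisson model Var(Y) = mu, so values well above 1 suggest
    overdispersion relative to the Poisson assumption.
    """
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    resid_df = y.size - n_params
    return float(np.sum((y - mu) ** 2 / mu) / resid_df)

# Hypothetical counts, invented for illustration only.
counts = np.array([0, 0, 1, 1, 2, 3, 5, 8, 13, 21])

# Intercept-only Poisson fit: the fitted mean is the sample mean.
mu_hat = np.full(counts.shape, counts.mean())

phi_hat = pearson_dispersion(counts, mu_hat, n_params=1)
print(f"estimated dispersion: {phi_hat:.2f}")
```

For these invented counts the estimate is far above 1, which is the kind of pattern that would motivate a quasi-Poisson or negative binomial model rather than a plain Poisson fit.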

A Poisson Regression Analysis of Physician Visits

  • 이영조;한달선;배상수
    • 보건행정학회지
    • /
    • Vol. 3, No. 2
    • /
    • pp.159-176
    • /
    • 1993
  • The utilization of outpatient care services involves two sequential decisions. The first is whether to initiate utilization, and the second is how many more visits to make after initiation. Presumably, the initiation decision is made largely by the patient and his or her family, while the number of additional visits is decided under the strong influence of the physician. The implication is that analyzing outpatient care utilization requires specifying each of the two underlying decisions as a distinct stochastic process. This paper is concerned with the number of physician visits, which is by definition a discrete variable taking only non-negative integer values. Since the initial visit is accounted for in the analysis of whether any physician visit was made, it suffices to focus on the number of visits made in addition to the initial one. The number of additional visits, being count data, could be assumed to follow a Poisson distribution. However, the distribution is likely to be overdispersed, since the number of physician visits tends to cluster around a few values yet still varies widely. A recently reported study of outpatient care utilization based its analysis on the assumption of a negative binomial distribution, which is a type of overdispersed Poisson distribution. However, there are indications that a Poisson distribution with adjustments for overdispersion loses less efficiency in parameter estimation than a specific distribution such as the negative binomial. An analysis of outpatient care utilization data was performed, focusing on assessing the appropriateness of the available techniques. The data were collected by a community survey in Hwachon Gun, Kangwon Do, in 1990. It was observed that a Poisson regression with adjustments for overdispersion is superior to either an ordinary regression or a Poisson regression without adjustments for overdispersion. In conclusion, it seems most appropriate to assume that the number of physician visits made in addition to the initial visit follows an overdispersed Poisson distribution when outpatient care utilization is studied with a model that embodies the two-part character of the underlying decision process.

Score Tests for Overdispersion

  • Kim, Choong-Rak;Jeong, Mee-Seon;Yang, Mee-Yeong
    • Journal of the Korean Statistical Society
    • /
    • Vol. 23, No. 1
    • /
    • pp.207-216
    • /
    • 1994
  • Count data are often overdispersed, and an appropriate test for the presence of overdispersion is necessary. In this paper we derive a score test based on the extended quasi-likelihood and the pseudolikelihood after adjusting for the Bartlett factor. We also compare it with the F-type test of Levene (1960) suggested by Ganio and Schafer (1992).

Sample size calculations for clustered count data based on zero-inflated discrete Weibull regression models

  • Hanna Yoo
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 31, No. 1
    • /
    • pp.55-64
    • /
    • 2024
  • In this study, we consider the sample size determination problem for clustered count data with many zeros. In general, zero-inflated Poisson and binomial models are commonly used for zero-inflated data; however, in real data the assumptions required by each model might be violated. We calculate the required sample size based on a discrete Weibull regression model that can handle both underdispersed and overdispersed data. We use Monte Carlo simulation to compute the required sample size. With our proposed method, a unified model with a low failure risk can be used to cope with either type of dispersion and to handle data with many zeros, which appear in groups or clusters sharing a common source of variation. A simulation study shows that our proposed method provides accurate results, revealing that the sample size is affected by the skewness of the distribution, the covariance structure of the covariates, and the number of zeros. We apply our method to pancreas disorder length-of-stay data collected in Western Australia.

Effects of Overdispersion on Testing for Serial Dependence in the Time Series of Counts Data

  • Kim, Hee-Young;Park, You-Sung
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 17, No. 6
    • /
    • pp.829-843
    • /
    • 2010
  • To test for serial dependence in time series of count data, Jung and Tremayne (2003) evaluated the size and power of several tests under the class of INARMA models based on binomial thinning operations for Poisson marginal distributions. The overdispersion phenomenon (i.e., a variance greater than the expectation) is common in the real world. Overdispersed count data can be modeled by using alternative thinning operations such as random coefficient thinning, iterated thinning, and quasi-binomial thinning. Such thinning operations can lead to time series models of counts with negative binomial or generalized Poisson marginal distributions. This paper examines whether the test statistics used by Jung and Tremayne (2003) for serial dependence in time series of count data are affected by overdispersion.
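
To make the binomial thinning operation mentioned in the abstract above concrete, here is a minimal sketch of a first-order integer-valued autoregressive, INAR(1), process with Poisson innovations. It is an illustration only, not code from the paper: the function name `simulate_inar1` and the parameter values are assumptions. With binomial thinning and Poisson innovations the stationary marginal is Poisson, so the variance-to-mean ratio should sit near 1 (no overdispersion).

```python
import numpy as np

def simulate_inar1(n, alpha, lam, seed=0):
    """Simulate X_t = alpha o X_{t-1} + e_t with e_t ~ Poisson(lam).

    'o' denotes binomial thinning: each of the X_{t-1} units survives
    independently with probability alpha.
    """
    rng = np.random.default_rng(seed)
    x = np.empty(n, dtype=np.int64)
    # Start from the stationary Poisson marginal with mean lam / (1 - alpha).
    x[0] = rng.poisson(lam / (1.0 - alpha))
    for t in range(1, n):
        survivors = rng.binomial(x[t - 1], alpha)  # binomial thinning step
        x[t] = survivors + rng.poisson(lam)        # add new arrivals
    return x

# Hypothetical parameter values chosen for illustration.
series = simulate_inar1(n=5000, alpha=0.5, lam=2.0)
dispersion_index = series.var(ddof=1) / series.mean()
print(f"mean: {series.mean():.2f}, variance/mean: {dispersion_index:.2f}")
```

Replacing the binomial thinning step with one of the alternative thinning operations named in the abstract would be one way to obtain overdispersed marginals; those variants are not sketched here.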

Negative Binomial Varying Coefficient Partially Linear Models

  • Kim, Young-Ju
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 19, No. 6
    • /
    • pp.809-817
    • /
    • 2012
  • We propose a semiparametric inference for a generalized varying coefficient partially linear model (VCPLM) for negative binomial data. The VCPLM is useful for modelling real data in that varying coefficients are a special type of interaction between explanatory variables, and partially linear models fit both parametric and nonparametric terms. The negative binomial distribution often arises in modelling count data, which are usually overdispersed. The varying coefficient function estimators and regression parameters in the generalized VCPLM are obtained by formulating a penalized likelihood through smoothing splines for negative binomial data when the shape parameter is known. The performance of the proposed method is then evaluated by simulations.

A Study for Recent Development of Generalized Linear Mixed Model

  • 이준영
    • 응용통계연구
    • /
    • Vol. 13, No. 2
    • /
    • pp.541-562
    • /
    • 2000
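  • The generalized linear mixed model (GLMM) is used to specify models for categorical data that arise as counts, for clustered or overdispersed non-normal data, and for data that follow nonlinear models. In this study, along with an overview of the model, we survey the statistical techniques proposed for fitting it, covering estimation methods based on quasi-likelihood (QL) and estimation methods based on Monte Carlo techniques. We also discuss current research directions for the GLMM and possible topics for future research.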
