• Title/Summary/Keyword: Overdispersion

Using the corrected Akaike's information criterion for model selection (모형 선택에서의 수정된 AIC 사용에 대하여)

  • Song, Eunjung;Won, Sungho;Lee, Woojoo
    • The Korean Journal of Applied Statistics
    • /
    • v.30 no.1
    • /
    • pp.119-133
    • /
    • 2017
  • Corrected Akaike's information criterion (AICc) is known to have better finite sample properties. However, Akaike's information criterion (AIC) is still widely used to select an optimal prediction model among several candidate models due to a lack of research on the benefits of using AICc. In this paper, we compare the performance of AIC and AICc through numerical simulations and confirm the advantage of using AICc. In addition, we consider the performance of the quasi Akaike's information criterion (QAIC) and the corrected quasi Akaike's information criterion (QAICc) for binomial and Poisson data under overdispersion.
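
The four criteria named in this abstract can be sketched as follows. This is a minimal illustration using the textbook definitions, not code from the paper: the function names are hypothetical, `k` is the number of estimated parameters, `n` is the sample size, and `c_hat` is an externally supplied overdispersion estimate. Some authors add one to `k` to account for estimating `c_hat`; that convention is not applied here.

```python
def aic(log_lik: float, k: int) -> float:
    """AIC = -2 log L + 2k."""
    return -2.0 * log_lik + 2.0 * k


def small_sample_penalty(k: int, n: int) -> float:
    """Extra term 2k(k + 1) / (n - k - 1) shared by AICc and QAICc."""
    if n - k - 1 <= 0:
        raise ValueError("the correction requires n > k + 1")
    return 2.0 * k * (k + 1) / (n - k - 1)


def aicc(log_lik: float, k: int, n: int) -> float:
    """AICc = AIC + 2k(k + 1) / (n - k - 1)."""
    return aic(log_lik, k) + small_sample_penalty(k, n)


def qaic(log_lik: float, k: int, c_hat: float) -> float:
    """QAIC = -2 log L / c_hat + 2k, with c_hat the overdispersion estimate."""
    if c_hat <= 0:
        raise ValueError("c_hat must be positive")
    return -2.0 * log_lik / c_hat + 2.0 * k


def qaicc(log_lik: float, k: int, n: int, c_hat: float) -> float:
    """QAICc = QAIC + 2k(k + 1) / (n - k - 1)."""
    return qaic(log_lik, k, c_hat) + small_sample_penalty(k, n)
```

The correction term vanishes as `n` grows with `k` fixed, so AICc and AIC agree in large samples; the difference matters mainly when `n` is small relative to `k`.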

Effects on Regression Estimates under Misspecified Generalized Linear Mixed Models for Counts Data

  • Jeong, Kwang Mo
    • The Korean Journal of Applied Statistics
    • /
    • v.25 no.6
    • /
    • pp.1037-1047
    • /
    • 2012
  • The generalized linear mixed model (GLMM) is widely used in fitting categorical responses of clustered data. In the numerical approximation of the likelihood function, normality is assumed for the random effects distribution; consequently, commercial statistical packages also routinely fit GLMMs under this normality assumption. We may also encounter departures from the distributional assumption on the response variable. It would be interesting to investigate the impact of distributional misspecification on the parameter estimates; however, there has been limited research on these topics. We study the sensitivity or robustness of the maximum likelihood estimators (MLEs) of the GLMM for count data when the true underlying distribution is normal, gamma, exponential, or a mixture of two normal distributions. We also consider the effects on the MLEs when we fit a Poisson-normal GLMM while the outcomes are generated from the negative binomial distribution with overdispersion. Through a small-scale Monte Carlo study, we check the empirical coverage probabilities of the parameters and the biases of the MLEs of the GLMM.

Negative binomial loglinear mixed models with general random effects covariance matrix

  • Sung, Youkyung;Lee, Keunbaik
    • Communications for Statistical Applications and Methods
    • /
    • v.25 no.1
    • /
    • pp.61-70
    • /
    • 2018
  • Modeling of the random effects covariance matrix in generalized linear mixed models (GLMMs) is an issue in analysis of longitudinal categorical data because the covariance matrix can be high-dimensional and its estimate must satisfy positive-definiteness. To satisfy these constraints, we consider the autoregressive and moving average Cholesky decomposition (ARMACD) to model the covariance matrix. The ARMACD creates a more flexible decomposition of the covariance matrix that provides generalized autoregressive parameters, generalized moving average parameters, and innovation variances. In this paper, we analyze longitudinal count data with overdispersion using GLMMs. We propose negative binomial loglinear mixed models to analyze longitudinal count data and we also present modeling of the random effects covariance matrix using the ARMACD. Epilepsy data are analyzed using our proposed model.

Estimating Parameters in Overdispersed Binary Data

  • Lee, Sunho
    • Communications for Statistical Applications and Methods
    • /
    • v.7 no.1
    • /
    • pp.269-276
    • /
    • 2000
  • There are several methods available for estimating parameters in overdispersed binary response data with the litter effect. Simulations are performed to compare methods for estimating an overall mean and an overdispersion parameter using moments, maximum likelihood under a beta-binomial distribution, maximum quasi-likelihood, and maximum extended quasi-likelihood.
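
As a rough illustration of the moment approach mentioned in this abstract, the sketch below estimates an overall mean and an intra-litter correlation from litter data. It assumes equal litter sizes and uses the beta-binomial variance identity Var(X) = n p (1 - p) [1 + (n - 1) rho]; the function name and data are hypothetical, and the estimators studied in the paper may differ (for example, by allowing unequal litter sizes).

```python
import statistics


def moment_estimates(successes: list[int], litter_size: int) -> tuple[float, float]:
    """Return (p_hat, rho_hat) for litters that all have the same size.

    p_hat is the pooled proportion of successes. rho_hat solves
    S^2 = n * p * (1 - p) * (1 + (n - 1) * rho) for rho, where S^2 is
    the sample variance of the per-litter success counts.
    """
    n = litter_size
    if n < 2 or len(successes) < 2:
        raise ValueError("need at least two litters of size two or more")
    p_hat = sum(successes) / (n * len(successes))
    binomial_var = n * p_hat * (1.0 - p_hat)
    if binomial_var == 0:
        raise ValueError("p_hat is 0 or 1; rho is not identifiable")
    s2 = statistics.variance(successes)  # denominator: number of litters - 1
    rho_hat = (s2 / binomial_var - 1.0) / (n - 1)
    return p_hat, rho_hat


# Four hypothetical litters of size 10.
p_hat, rho_hat = moment_estimates([2, 5, 8, 5], litter_size=10)
```

A value of `rho_hat` near zero is consistent with plain binomial variation, while a clearly positive value points to a litter effect. In small samples the estimate can fall outside its theoretical range, which is one reason likelihood-based alternatives are compared against it.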

Development of the U-turn Accident Model at 4-Legged Signalized Intersections in Urban Areas (도시부 4지 신호교차로 유턴 사고모형 개발)

  • Kang, JongHo;Kim, KyungWhan;Ha, ManBok;Kim, SeongMun
    • International Journal of Highway Engineering
    • /
    • v.16 no.2
    • /
    • pp.119-129
    • /
    • 2014
  • PURPOSES : The purpose of this study is to develop a U-turn accident model for 4-legged signalized intersections in urban areas. METHODS : In order to analyze the characteristics of accidents associated with U-turn operation at 4-legged signalized intersections in urban areas and to develop a U-turn accident model by regression analysis, tests of overdispersion and zero-inflation are conducted on the dependent variables, the number of accidents and EPDO (Equivalent Property Damage Only). RESULTS : The Poisson model fits best for the number of accidents and the ZIP (Zero Inflated Poisson) model fits best for EPDO. The variables conflict traffic, width of opposing road, and traffic passing speed are adopted as independent variables in both models. The number of bus berths and the rate of U-turn signal time at which the U-turn is permitted are adopted as independent variables only for EPDO. CONCLUSIONS : The results suggest that U-turns could be permitted at intersections where the opposing road is wider than 11.9 meters, the passing vehicle speed is not high, and U-turn operation is not hindered by buses stopping at bus stops.

Improvement of Multivariable, Nonlinear, and Overdispersion Modeling with Deep Learning: A Case Study on Prediction of Vehicle Fuel Consumption Rate (딥러닝을 이용한 다변량, 비선형, 과분산 모델링의 개선: 자동차 연료소모량 예측)

  • HAN, Daeseok;YOO, Inkyoon;LEE, Suhyung
    • International Journal of Highway Engineering
    • /
    • v.19 no.4
    • /
    • pp.1-7
    • /
    • 2017
  • PURPOSES : This study aims to use an artificial neural network to improve the modeling of multivariable, nonlinear, and overdispersed data, which has been a problem in the civil and transport sectors. METHODS : Deep learning, which is a technique employing artificial neural networks, was applied to develop a fuel consumption model for large buses as a case study. Estimation characteristics and accuracy were compared with the results of conventional multiple regression modeling. RESULTS : The deep learning model remarkably improved estimation accuracy over regression modeling, raising R-squared from 18.76% to 72.22%. In addition, it was very flexible in reflecting large variance and complex relationships between dependent and independent variables. CONCLUSIONS : Deep learning could be a new alternative that solves general problems inherent in conventional statistical methods, and it is highly promising for planning and optimization issues in the civil and transport sectors. Extended applications to other fields, such as pavement management, structure safety, operation of intelligent transport systems, and traffic noise estimation, are highly recommended.

Multi-Site Stochastic Weather Generator for Daily Rainfall in Korea (시공간구조를 가지는 확률적 강우 모형)

  • Kwak, Minjung;Kim, Yongku
    • The Korean Journal of Applied Statistics
    • /
    • v.27 no.3
    • /
    • pp.475-485
    • /
    • 2014
  • A stochastic weather generator based on a generalized linear model (GLM) approach is a commonly used tool to simulate a time series of daily weather. In this paper, we propose a multi-site weather generator with applications to historical data in South Korea. The proposed method extends the approach of Kim et al. (2012) by considering spatial dependence in the model. To reduce this phenomenon, we also incorporate a time series of seasonal mean precipitation in South Korea into the GLM weather generator as a covariate. Spatial dependence was incorporated into the model through a latent Gaussian process. We apply the proposed model to precipitation data provided by 62 stations in Korea from 1973 to 2011.

Traffic Accident Models of Cheongju Four-Legged Signalized Intersections by Accident Type (사고유형에 따른 청주시 4지 신호교차로 교통사고모형)

  • Park, Byung-Ho;Han, Sang-Wook;Kim, Tae-Young;Kim, Won-Ho
    • Journal of Korean Society of Transportation
    • /
    • v.26 no.5
    • /
    • pp.153-162
    • /
    • 2008
  • This study deals with traffic accidents at the 4-legged signalized intersections in Cheongju. The purpose is to comparatively analyze the characteristics and models by accident type using data from 143 intersections. To this end, the study gives particular emphasis to modeling such accidents as head-on collisions, rear-end collisions, sideswipes, right-angle side collisions, and others. The main results are as follows. First, the overdispersion tests show that negative binomial regression models are appropriate for the traffic accident data in the above contexts. Second, five accident models are developed, all of which are found to be statistically significant. Finally, the models are comparatively evaluated using the common variable (ADT) and type-specific variables.
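
To make the idea of an overdispersion check concrete, here is a minimal sketch of the classical index-of-dispersion statistic for counts with no covariates. It is not the test used in the paper, which works with regression models; the function name and the counts are hypothetical. Under a Poisson assumption the statistic is approximately chi-square with n - 1 degrees of freedom, so a ratio well above 1 suggests overdispersion and motivates a negative binomial model.

```python
def dispersion_index(counts: list[int]) -> tuple[float, float]:
    """Return (statistic, ratio) for the Poisson index-of-dispersion check.

    statistic = sum((y - mean)^2) / mean, approximately chi-square with
    n - 1 degrees of freedom if the counts are Poisson.
    ratio = statistic / (n - 1), i.e. sample variance divided by mean.
    """
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(counts) / n
    if mean == 0:
        raise ValueError("all counts are zero; the index is undefined")
    statistic = sum((y - mean) ** 2 for y in counts) / mean
    return statistic, statistic / (n - 1)


# Hypothetical accident counts at eight intersections.
statistic, ratio = dispersion_index([0, 1, 1, 2, 3, 5, 8, 12])
```

For regression models the analogous quantity is the Pearson chi-square statistic divided by its residual degrees of freedom, with fitted means in place of the overall mean.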

On the Extension of Test Statistics for Detecting Negative Binomial Departures from the Poisson Assumption (포아송으로부터 부의 이항분포로의 이탈에 대한 검정통계량의 확장)

  • 이선호
    • Journal of the Korean Statistical Society
    • /
    • v.22 no.2
    • /
    • pp.171-190
    • /
    • 1993
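  • Various statistics have been proposed, depending on the form of the data, for detecting negative binomial departures from the Poisson distribution. However, it is known that the variance-mean structure changes with the parameterization of the negative binomial alternative, and that the locally optimal test statistic changes accordingly. In this paper, we extend the alternative hypothesis to general Poisson mixture distributions and introduce a new statistic L that can also be used for testing under a general variance-mean structure. We also show that the L statistic is a generalized form of several existing statistics for negative binomial departures from the Poisson distribution. A comparison of the L statistic with the existing statistics through asymptotic relative efficiency and simulation shows that the L statistic performs well regardless of the variance-mean structure.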

Developing Rear-End Collision Models of Roundabouts in Korea (국내 회전교차로의 추돌사고 모형 개발)

  • Park, Byung Ho;Beak, Tae Hun
    • Journal of the Korean Society of Safety
    • /
    • v.29 no.6
    • /
    • pp.151-157
    • /
    • 2014
  • This study deals with rear-end collisions at roundabouts. The purpose of this study is to develop accident models of rear-end collisions in Korea. To this end, the study gives particular attention to developing appropriate models using Poisson, negative binomial, ZAM, multiple linear, and nonlinear regression models, together with statistical analysis tools. The main results are as follows. First, the Vuong statistics and overdispersion parameters indicate that ZIP is the most appropriate model among the count data models. Second, RMSE, MPB, MAD, and correlation coefficient tests show that the multiple nonlinear model is the most suitable for the rear-end collision data. Finally, such independent variables as traffic volume, ratio of heavy vehicles, number of circulatory roadway lanes, number of crosswalks, and stop line are adopted in the optimal model.
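
The Vuong statistic mentioned in this abstract compares two non-nested models through their per-observation log-likelihoods. The sketch below is illustrative only: the function names, the counts, and the parameter values are hypothetical, and in practice each model's parameters would be maximum likelihood estimates rather than fixed numbers. It uses the uncorrected form of the statistic with the population standard deviation.

```python
import math
import statistics


def poisson_logpmf(y: int, lam: float) -> float:
    """Log of the Poisson probability mass at y."""
    return -lam + y * math.log(lam) - math.lgamma(y + 1)


def zip_logpmf(y: int, lam: float, pi_zero: float) -> float:
    """Log mass of a zero-inflated Poisson with extra-zero probability pi_zero."""
    if y == 0:
        return math.log(pi_zero + (1.0 - pi_zero) * math.exp(-lam))
    return math.log(1.0 - pi_zero) + poisson_logpmf(y, lam)


def vuong_statistic(loglik_1: list[float], loglik_2: list[float]) -> float:
    """Return sqrt(n) * mean(m) / sd(m), where m_i = loglik_1[i] - loglik_2[i].

    Large positive values favor model 1 and large negative values favor
    model 2; the statistic is compared with standard normal quantiles.
    """
    if len(loglik_1) != len(loglik_2):
        raise ValueError("both models need one log-likelihood per observation")
    m = [a - b for a, b in zip(loglik_1, loglik_2)]
    sd = statistics.pstdev(m)
    if sd == 0:
        raise ValueError("log-likelihood differences have zero variance")
    return math.sqrt(len(m)) * statistics.mean(m) / sd


# Hypothetical counts with many zeros and assumed parameter values.
counts = [0, 0, 0, 0, 0, 0, 1, 2, 3, 4]
ll_zip = [zip_logpmf(y, lam=2.0, pi_zero=0.5) for y in counts]
ll_poisson = [poisson_logpmf(y, lam=1.0) for y in counts]
v = vuong_statistic(ll_zip, ll_poisson)
```

A positive `v` here means the zero-inflated model assigns higher likelihood to these counts on average; whether the difference is significant depends on comparing `v` with a normal critical value such as 1.96.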