• Title/Summary/Keyword: 음이항 (negative binomial)

Fitting Distribution of Accident Frequency of Freeway Horizontal Curve Sections & Development of Negative Binomial Regression Models (고속도로 평면선형상 사고빈도분포 추정을 통한 음이항회귀모형 개발 (기하구조요인을 중심으로))

  • 강민욱;도철웅;손봉수
    • Journal of Korean Society of Transportation / v.20 no.7 / pp.197-204 / 2002
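  • To predict and prevent traffic accidents, it is reasonable to identify how accidents relate to the road geometric elements that can actually be controlled during road design. That is, before construction, the road designer should establish the relationship between geometric elements and accidents accurately from field data and reflect it in the design. To that end, identifying the frequency distribution of traffic accidents is the most basic task and should precede the development of accident prediction models. Traffic accident counts are generally known to exhibit overdispersion, with a variance larger than the mean, and hence to follow a negative binomial distribution. Therefore, before developing an accident model, this paper uses goodness-of-fit tests to examine the distribution of traffic accidents on horizontal curve sections, focusing on the Honam Expressway, for which the road design elements at accident sites and other potential accident-related factors are relatively well documented. The accident data were compiled from five years (1996-2000) of Korea Highway Corporation (한국도로공사) records for the Honam Expressway, and accidents on reverse-curve and single-curve sections were analyzed using the section segmentation method for horizontal alignment proposed by Kang and Son (2002). As expected, the goodness-of-fit analysis identified the negative binomial distribution as the most suitable probability distribution for describing accident counts, and on that basis a negative binomial regression model was developed by maximum likelihood estimation. The negative binomial regression model built with the segmentation method had a higher coefficient of determination than existing probabilistic regression models, and the geometric elements used in the model were vehicle exposure, curve radius, and superelevation change per unit distance.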
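The overdispersion argument in this abstract can be illustrated with a minimal sketch. It assumes the common NB2 parameterization, Var(Y) = mu + mu^2 / k, and uses method-of-moments estimates rather than the goodness-of-fit tests and maximum likelihood estimation reported by the authors. The function name and the counts are hypothetical and are not data from the paper.

```python
from statistics import mean, variance


def nb_moment_estimates(counts):
    """Method-of-moments estimates (mu, k) for a negative binomial
    with Var(Y) = mu + mu**2 / k (assumed NB2 parameterization)."""
    m = mean(counts)
    v = variance(counts)  # unbiased sample variance
    if v <= m:
        # No overdispersion in the sample: the Poisson limit (k -> infinity).
        return m, float("inf")
    k = m * m / (v - m)
    return m, k


# Hypothetical accident counts per road section (illustration only).
counts = [0, 0, 1, 0, 3, 0, 7, 1, 0, 2]
mu_hat, k_hat = nb_moment_estimates(counts)
overdispersed = variance(counts) > mean(counts)
print(f"mean={mu_hat:.3f}, k={k_hat:.3f}, overdispersed={overdispersed}")
```

A small estimated k signals strong overdispersion, which is the situation in which a negative binomial model is usually preferred over a Poisson model.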

A study on MERS-CoV outbreak in Korea using Bayesian negative binomial branching processes (베이지안 음이항 분기과정을 이용한 한국 메르스 발생 연구)

  • Park, Yuha;Choi, Ilsu
    • Journal of the Korean Data and Information Science Society / v.28 no.1 / pp.153-161 / 2017
  • Branching processes, which are used as stochastic models of epidemic spread, have the advantage that their parameters can be estimated from real data. To use the negative binomial distribution as the offspring distribution of a branching process, both the mean and the dispersion parameter must be estimated. Existing studies in biology and epidemiology estimate them by maximum likelihood. For most epidemic data, however, it is hard to obtain a precise maximum-likelihood estimate. We suggest Bayesian inference, which has good statistical properties for small samples. After estimating the dispersion parameter, we modelled the posterior distribution for the 2015 Korea MERS cases. We found that the estimated dispersion parameter is relatively stable regardless of the assumed prior distribution. We also computed extinction probabilities of the branching processes using the estimated dispersion parameters.
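The extinction probability mentioned in this abstract can be sketched as follows. The sketch assumes a negative binomial offspring distribution with mean R and dispersion k, whose probability generating function is G(s) = (1 + R(1 - s)/k)^(-k). The extinction probability is then the smallest fixed point of G on [0, 1], found here by iterating from 0. The function names and the values of R and k are hypothetical and are not the MERS estimates from the paper.

```python
def nb_pgf(s, R, k):
    """PGF of a negative binomial offspring distribution
    with mean R and dispersion parameter k."""
    return (1.0 + R * (1.0 - s) / k) ** (-k)


def extinction_probability(R, k, tol=1e-12, max_iter=100_000):
    """Smallest root of q = G(q) on [0, 1], by fixed-point iteration from q = 0."""
    q = 0.0
    for _ in range(max_iter):
        q_next = nb_pgf(q, R, k)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q  # may not have converged (e.g. R very close to 1)


# Hypothetical parameter values, for illustration only.
for R, k in [(2.0, 1.0), (2.0, 0.2), (0.5, 1.0)]:
    print(R, k, round(extinction_probability(R, k), 6))
```

With k = 1 the offspring distribution is geometric and the extinction probability for R > 1 is 1/R, which gives a quick sanity check. For fixed R > 1, a smaller k (more heterogeneous transmission) raises the extinction probability.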

Zero-Inflated INGARCH Using Conditional Poisson and Negative Binomial: Data Application (조건부 포아송 및 음이항 분포를 이용한 영-과잉 INGARCH 자료 분석)

  • Yoon, J.E.;Hwang, S.Y.
    • The Korean Journal of Applied Statistics / v.28 no.3 / pp.583-592 / 2015
  • Zero-inflation has recently attracted much attention in integer-valued time series. This article deals with conditional variance (volatility) modeling for zero-inflated count time series. We incorporate the zero-inflation property into integer-valued GARCH (INGARCH) models via conditional Poisson and negative binomial distributions. A cholera frequency time series is analyzed as a data application. Estimation is carried out using the EM algorithm, as suggested by Zhu (2012).
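To make the model class concrete, here is a minimal simulation sketch of a zero-inflated Poisson INGARCH(1,1) process. It assumes the commonly used form in which X_t is zero with probability omega and otherwise Poisson with intensity lambda_t = alpha0 + alpha1 * X_{t-1} + beta1 * lambda_{t-1}. The function names, the starting value, and the parameter values are illustrative assumptions, and the sketch simulates only; it does not implement the EM estimation used in the paper.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's method (fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_zip_ingarch(n, omega, alpha0, alpha1, beta1, seed=0):
    """Simulate n observations; requires alpha1 + beta1 < 1 for the start value."""
    rng = random.Random(seed)
    lam = alpha0 / (1.0 - alpha1 - beta1)  # assumed starting intensity
    x_prev = 0
    xs = []
    for _ in range(n):
        lam = alpha0 + alpha1 * x_prev + beta1 * lam
        # Structural zero with probability omega, otherwise a Poisson count.
        x = 0 if rng.random() < omega else poisson_draw(lam, rng)
        xs.append(x)
        x_prev = x
    return xs


series = simulate_zip_ingarch(200, omega=0.3, alpha0=1.0, alpha1=0.3, beta1=0.4)
print(sum(1 for x in series if x == 0), "zeros out of", len(series))
```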

Bivariate Zero-Inflated Negative Binomial Regression Model with Heterogeneous Dispersions (서로 다른 산포를 허용하는 이변량 영과잉 음이항 회귀모형)

  • Kim, Dong-Seok;Jeong, Seul-Gi;Lee, Dong-Hee
    • Communications for Statistical Applications and Methods / v.18 no.5 / pp.571-579 / 2011
  • We propose a new bivariate zero-inflated negative binomial regression model that allows heterogeneous dispersions. To demonstrate its performance, we use the health care data of Deb and Trivedi (1997) to compare it with the bivariate zero-inflated negative binomial model of Wang (2003), which assumes a common dispersion for the two response variables. The empirical study shows that the proposed model gives better results in terms of log-likelihood and AIC.

The Data-based Prediction of Police Calls Using Machine Learning (기계학습을 활용한 데이터 기반 경찰신고건수 예측)

  • Choi, Jaehun
    • The Journal of Bigdata / v.3 no.2 / pp.101-112 / 2018
  • The purpose of this study is to predict the number of police calls using a neural network, a machine learning method, and negative binomial regression, based on data on 112 police emergency calls received by the Chungnam Provincial Police Agency from June 2016 to May 2017. The following variables, which may affect the number of police calls, were selected for developing the prediction model: time, holiday, the day before a holiday, season, temperature, precipitation, wind speed, jurisdictional area, population, the number of foreigners, single house rate, and other house rate. Some variables are positively correlated with the number of calls and others negatively. The comparison of the methods can be summarized as follows. The neural network has a correlation coefficient of 0.7702 between predicted and actual values, with an RMSE of 2.557. Negative binomial regression, on the other hand, has a correlation coefficient of 0.7158, with an RMSE of 2.831. The neural network is less interpretable but predicts better than the negative binomial regression. Based on the prediction model, the police agency can allocate manpower optimally for given values of the selected variables.
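The two comparison metrics reported in this abstract, the correlation coefficient between predicted and actual values and the RMSE, can be computed as in the minimal sketch below. The function names and the sample values are hypothetical and are not data from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical hourly call counts and model predictions (illustration only).
actual = [3, 5, 2, 8, 6, 4]
predicted = [2.5, 5.5, 3.0, 7.0, 6.5, 3.5]
print(f"r={pearson_r(actual, predicted):.4f}, RMSE={rmse(actual, predicted):.4f}")
```

A higher correlation and a lower RMSE both indicate better agreement, which is the basis on which the abstract ranks the neural network above the negative binomial regression.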

A Development of Traffic Accident Model by Random Parameter : Focus on Capital Area and Busan 4-legs Signalized Intersections (확률모수를 이용한 교통사고예측모형 개발 -수도권 및 부산광역시 4지 교차로를 대상으로-)

  • Lee, Geun-Hee;Rho, Jeong-Hyun
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.14 no.6 / pp.91-99 / 2015
  • This study builds a traffic accident prediction model that considers road geometry, traffic, and environmental characteristics, and identifies the factors related to accidents at 4-leg signalized intersections in the Seoul and Busan metropolitan areas. The RPNB (random parameter negative binomial) model improves on the fixed-parameter NB (negative binomial) model. Out of 53 variables, 10 variables (main road number of lanes, main road vehicle traffic volume (left), minor road vehicle traffic volume (right), main road drive restriction, minor road sight distance, minor road median strip, minor road speed limit, minor road speed restriction) were found to significantly affect traffic accident occurrence at 4-leg signalized intersections. Among the 10 significant variables, 2 (minor road sight distance, minor road speed restriction) were found to be random parameters.

A Bayesian zero-inflated negative binomial regression model based on Pólya-Gamma latent variables with an application to pharmaceutical data (폴랴-감마 잠재변수에 기반한 베이지안 영과잉 음이항 회귀모형: 약학 자료에의 응용)

  • Seo, Gi Tae;Hwang, Beom Seuk
    • The Korean Journal of Applied Statistics / v.35 no.2 / pp.311-325 / 2022
  • Excess zeros often occur in count responses across various research fields, and the zero-inflated model is a common choice for modeling such count data. Bayesian inference for the zero-inflated model has long been recognized as a hard problem because the conditional posterior distribution does not have a closed form. Recently, however, Pillow and Scott (2012) and Polson et al. (2013) proposed a Pólya-Gamma data-augmentation strategy for logistic and negative binomial models, which facilitates Bayesian inference for the zero-inflated model. We apply a Bayesian zero-inflated negative binomial regression model to longitudinal pharmaceutical data previously analyzed by Min and Agresti (2005). To facilitate posterior sampling for the longitudinal zero-inflated model, we use the Pólya-Gamma data-augmentation strategy.
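As a minimal sketch of what "zero-inflated negative binomial" means, the block below evaluates the standard ZINB probability mass function: a mixture of a point mass at zero (weight pi) and a negative binomial with mean mu and size r. It illustrates the likelihood only; it does not implement the Pólya-Gamma sampler described in the abstract. The function names and parameter values are hypothetical.

```python
import math


def nb_pmf(y, mu, r):
    """Negative binomial pmf with mean mu > 0 and size (dispersion) r > 0."""
    log_p = (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, pi, mu, r):
    """Zero-inflated NB: structural zero with probability pi, else NB(mu, r)."""
    base = (1.0 - pi) * nb_pmf(y, mu, r)
    return pi + base if y == 0 else base


# Hypothetical parameters: 30% structural zeros, mean 2, size 1.
for y in range(4):
    print(y, round(zinb_pmf(y, pi=0.3, mu=2.0, r=1.0), 4))
```

In a regression setting, pi and mu would each depend on covariates through link functions (typically logit and log); that step is omitted here.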

Testing for Overdispersion in a Bivariate Negative Binomial Distribution Using Bootstrap Method (이변량 음이항 모형에서 붓스트랩 방법을 이용한 과대산포에 대한 검정)

  • Jhun, Myoung-Shic;Jung, Byoung-Cheol
    • The Korean Journal of Applied Statistics / v.21 no.2 / pp.341-353 / 2008
  • The bootstrap method for the score test statistic is proposed in a bivariate negative binomial distribution. The Monte Carlo study shows that the score test for testing overdispersion underestimates the nominal significance level, while the score test for "intrinsic correlation" overestimates the nominal one. To overcome this problem, we propose a bootstrap method for the score test. We find that bootstrap methods keep the significance level close to the nominal significance level for testing the hypothesis. An empirical example is provided to illustrate the results.

Accident Models of Rotary by Vehicle Type (차량유형별 로터리 사고모형)

  • Han, Su-San;Park, Byeong-Ho
    • Journal of Korean Society of Transportation / v.29 no.6 / pp.67-74 / 2011
  • This study uses traffic accident data from Korean rotaries (circular intersections) to examine how accident characteristics differ by vehicle type. The data were categorized into three groups by vehicle type (automobile, truck and van, and others), and an accident model was developed for each group. The statistical analysis yielded two ZIP models and one negative binomial model for the three vehicle types. The differences among the models were then compared statistically.

A new sample selection model for overdispersed count data (과대산포 가산자료의 새로운 표본선택모형)

  • Jo, Sung Eun;Zhao, Jun;Kim, Hyoung-Moon
    • The Korean Journal of Applied Statistics / v.31 no.6 / pp.733-749 / 2018
  • Sample selection arises as a result of the partial observability of the outcome of interest in a study. Heckman introduced a sample selection model to analyze such data and proposed a full maximum likelihood estimation method under the assumption of normality. Recently, sample selection models for binomial and Poisson response variables have been proposed. Based on the theory of symmetry-modulated distributions, we extend these to a model for overdispersed count data. Without sample selection, this type of data is often modeled using the negative binomial distribution. Hence, we propose a sample selection model for overdispersed count data using the negative binomial distribution. A real data application is presented. Simulation studies show that our estimation method based on the profile log-likelihood is stable.