• Title/Summary/Keyword: Count 모형 (count model)

Developing the Pedestrian Accident Models Using Tobit Model (토빗모형을 이용한 가로구간 보행자 사고모형 개발)

  • Lee, Seung Ju;Kim, Yun Hwan;Park, Byung Ho
    • International Journal of Highway Engineering / v.16 no.3 / pp.101-107 / 2014
  • PURPOSES: This study examines pedestrian accidents in Cheongju, with the goal of developing a pedestrian accident model. METHODS: Count data models, truncated count data models, and Tobit regression models are used to analyze the accidents. The dependent variable is the number of accidents. The independent variables are traffic volume, intersection geometric structure, and transportation facilities. RESULTS: The main results are as follows. First, the Tobit model was judged to be more appropriate than the other models, and the fitted models were statistically significant. Second, the main accident-related variables adopted in the model were traffic volume, pedestrian volume, number of entries/exits, number of crosswalks, and bus stops. CONCLUSIONS: The Tobit model is evaluated to be the optimal model for pedestrian accidents.

A Comparative Study on Estimation Models for the Value of Access to a Natural Recreation Site: Focusing on the Estuary Area of Yeongsan River (자연휴양지 방문편익 추정모형의 비교 연구 - 영산강 하구를 대상으로)

  • Shin, Youngchul
    • Environmental and Resource Economics Review / v.21 no.4 / pp.981-998 / 2012
  • In this paper, several count data models of travel cost recreation demand with Poisson and negative binomial specifications are applied to estimate the value of access to the estuary area of the Yeongsan River from visitor survey data. The results show that the negative binomial model that accounts for truncation and overdispersion provides a better goodness-of-fit; under it, the value per visit (i.e., consumer surplus) is 89,350 won for residents of Jeolla province and 432,526 won for residents of other provinces. If overdispersion is not corrected, as when relying on Poisson estimates, the consumer surplus is underestimated, whereas if truncation is not corrected, as when using estimates from untruncated models, the consumer surplus is overestimated. As a result, the truncated negative binomial model should be applied to estimate travel demand and consumer surplus per visit when using survey data from single-site visitors.
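
The truncation correction mentioned in this abstract can be illustrated with a minimal sketch. On-site surveys only record people who made at least one visit, so the count distribution has to be renormalized over y >= 1. The sketch below uses the simpler zero-truncated Poisson rather than the truncated negative binomial the paper recommends, and the function names and the rate `lam` are illustrative assumptions, not taken from the paper.

```python
import math


def poisson_pmf(y, lam):
    """Ordinary Poisson probability P(Y = y) with rate lam."""
    return math.exp(-lam) * lam**y / math.factorial(y)


def zero_truncated_poisson_pmf(y, lam):
    """P(Y = y | Y >= 1): Poisson mass renormalized after dropping y = 0."""
    if y < 1:
        return 0.0
    return poisson_pmf(y, lam) / (1.0 - math.exp(-lam))


def zero_truncated_poisson_mean(lam):
    """Mean of the zero-truncated Poisson; always larger than lam."""
    return lam / (1.0 - math.exp(-lam))
```

Because the truncated mean exceeds `lam`, fitting an untruncated model to on-site data attributes too many visits to each respondent, which is consistent with the overestimation of consumer surplus that the abstract reports.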

A Bayesian zero-inflated negative binomial regression model based on Pólya-Gamma latent variables with an application to pharmaceutical data (폴랴-감마 잠재변수에 기반한 베이지안 영과잉 음이항 회귀모형: 약학 자료에의 응용)

  • Seo, Gi Tae;Hwang, Beom Seuk
    • The Korean Journal of Applied Statistics / v.35 no.2 / pp.311-325 / 2022
  • For count responses, excess zeros often occur in various research fields. The zero-inflated model is a common choice for modeling such count data. Bayesian inference for the zero-inflated model has long been recognized as a hard problem because the conditional posterior distribution is not available in closed form. Recently, however, Pillow and Scott (2012) and Polson et al. (2013) proposed a Pólya-Gamma data-augmentation strategy for logistic and negative binomial models, facilitating Bayesian inference for the zero-inflated model. We apply a Bayesian zero-inflated negative binomial regression model to longitudinal pharmaceutical data previously analyzed by Min and Agresti (2005). To facilitate posterior sampling for the longitudinal zero-inflated model, we use the Pólya-Gamma data-augmentation strategy.

A Development of Traffic Accident Prediction Model at Rural Unsignalized Intersections Using Random Parameter (Random Parameter를 이용한 지방부 무신호교차로 교통사고 예측모형개발)

  • Lee, Kyu-Hoon;Oh, Ju-Taek;Park, Jeong-Soon
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.16 no.4 / pp.64-75 / 2017
  • Previous count models that use fixed parameters cannot account for unobserved heterogeneity: standard errors are underestimated and excessive t-values are derived, which reduces the reliability of the model. In addition, studies of unsignalized intersections are scarce compared with those of signalized intersections, because data are difficult to collect and there are statistical limits on accurate analysis. The purpose of this study is to analyze the factors affecting traffic accidents at rural unsignalized intersections by constructing a count model with random parameters, which distinguishes it from existing studies. As a result of the analysis, 7 variables were found to be significant, and 2 variables (presence of crosswalk, speed limit) were treated as random parameters.

Bayesian Analysis of a Zero-inflated Poisson Regression Model: An Application to Korean Oral Hygienic Data (영과잉 포아송 회귀모형에 대한 베이지안 추론: 구강위생 자료에의 적용)

  • Lim, Ah-Kyoung;Oh, Man-Suk
    • The Korean Journal of Applied Statistics / v.19 no.3 / pp.505-519 / 2006
  • We consider zero-inflated count data, which are discrete count data with too many zeros compared to the Poisson distribution. Zero-inflated data can be found in various areas. Despite their increasing importance in practice, appropriate statistical inference for zero-inflated data is limited. Classical inference based on large-sample theory does not fit unless the sample size is very large, and the regular Poisson model shows lack of fit due to the many zeros. To handle these difficulties, a mixture of distributions is considered for the zero-inflated data. Specifically, a mixture of a point mass at zero and a Poisson distribution is employed. In addition, when there are meaningful covariates related to the response variable, a loglinear link is used between the mean of the response and the covariates in the Poisson part. We propose Bayesian inference for the zero-inflated Poisson regression model using a Markov chain Monte Carlo method. We applied the proposed method to Korean oral hygiene data and compared the inference results with those of other models. We found that the proposed method is superior in that it gives smaller parameter estimation error and more accurate predictions.
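
The mixture described in this abstract (a point mass at zero plus a Poisson component, with a loglinear link for the Poisson mean) can be written out as a short sketch. The names `pi` (mixing weight of the point mass), `lam` (Poisson mean), and the helper functions are illustrative assumptions; the paper's Bayesian MCMC estimation is not reproduced here.

```python
import math


def zip_pmf(y, pi, lam):
    """Zero-inflated Poisson probability.

    With probability pi the observation is a structural zero; otherwise
    it is drawn from a Poisson distribution with mean lam.
    """
    poisson = math.exp(-lam) * lam**y / math.factorial(y)
    if y == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def loglinear_mean(x, beta):
    """Loglinear link: lam = exp(x . beta) for covariates x and coefficients beta."""
    return math.exp(sum(xi * bi for xi, bi in zip(x, beta)))
```

For any `pi` greater than zero, `zip_pmf(0, pi, lam)` exceeds the plain Poisson probability of zero, which is how the mixture absorbs the excess zeros that a regular Poisson model cannot fit.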

A Study on Phon Call Big Data Analytics (전화통화 빅데이터 분석에 관한 연구)

  • Kim, Jeongrae;Jeong, Chanki
    • Journal of Information Technology and Architecture / v.10 no.3 / pp.387-397 / 2013
  • This paper proposes an approach to big data analytics for phone call data. The analytical model for phone call data is composed of the PVPF (Parallel Variable-length Phrase Finding) algorithm, which identifies verbal phrases of natural language, and a word count algorithm, which measures the usage frequency of keywords. In the proposed model, we identify words using the PVPF algorithm and measure the usage frequency of the identified words using the word count algorithm in MapReduce. The results can be interpreted from various viewpoints. We design and implement the model on HDFS (Hadoop Distributed File System) and verify the proposed approach through a case study of phone call data, extracting useful results through analysis of keyword correlation and usage frequency.
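
The word count step can be sketched in the MapReduce style as three phases: map, shuffle, and reduce. This is a single-process illustration, not a Hadoop job, and plain whitespace tokenization stands in for the PVPF algorithm, whose details the abstract does not give. All function names are hypothetical.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for every token in one call transcript.

    Whitespace splitting is a stand-in for PVPF phrase identification.
    """
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group the emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the grouped counts to get the usage frequency of each word."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(records):
    """Run map, shuffle, and reduce over a list of transcripts."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return reduce_phase(shuffle(pairs))
```

In a real cluster the map and reduce functions would run in parallel over HDFS blocks; the sketch only shows the data flow between the phases.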

Bayesian Analysis for the Zero-inflated Regression Models (영과잉 회귀모형에 대한 베이지안 분석)

  • Jang, Hak-Jin;Kang, Yun-Hee;Lee, S.;Kim, Seong-W.
    • The Korean Journal of Applied Statistics / v.21 no.4 / pp.603-613 / 2008
  • We often encounter situations in which discrete count data contain a large proportion of zeros. In such cases, it is not appropriate to analyze the data with standard regression models such as Poisson or negative binomial regression. In this article, we consider Bayesian analysis for two commonly used models: the zero-inflated Poisson and zero-inflated negative binomial regression models. We use the Bayes factor as a model selection tool, and computation is carried out via Markov chain Monte Carlo methods. Crash count data are analyzed to support the theoretical results.

A new sample selection model for overdispersed count data (과대산포 가산자료의 새로운 표본선택모형)

  • Jo, Sung Eun;Zhao, Jun;Kim, Hyoung-Moon
    • The Korean Journal of Applied Statistics / v.31 no.6 / pp.733-749 / 2018
  • Sample selection arises as a result of the partial observability of the outcome of interest in a study. Heckman introduced a sample selection model to analyze such data and proposed a full maximum likelihood estimation method under the assumption of normality. Recently, sample selection models for binomial and Poisson response variables have been proposed. Based on the theory of symmetry-modulated distributions, we extend these to a model for overdispersed count data. Data of this type without sample selection are often modeled using the negative binomial distribution. Hence we propose a sample selection model for overdispersed count data using the negative binomial distribution. A real data application is presented. Simulation studies reveal that our estimation method based on the profile log-likelihood is stable.

Zero-Inflated INGARCH Using Conditional Poisson and Negative Binomial: Data Application (조건부 포아송 및 음이항 분포를 이용한 영-과잉 INGARCH 자료 분석)

  • Yoon, J.E.;Hwang, S.Y.
    • The Korean Journal of Applied Statistics / v.28 no.3 / pp.583-592 / 2015
  • Zero-inflation has recently attracted much attention in integer-valued time series. This article deals with conditional variance (volatility) modeling for zero-inflated count time series. We incorporate the zero-inflation property into integer-valued GARCH (INGARCH) via conditional Poisson and negative binomial marginals. A cholera frequency time series is analyzed as a data application. Estimation is carried out using the EM algorithm as suggested by Zhu (2012).
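
To make the model structure concrete, the following is a minimal simulation sketch of a zero-inflated Poisson INGARCH(1,1) process. It assumes the common first-order recursion lam_t = omega + alpha * y_{t-1} + beta * lam_{t-1}, with each observation set to zero with probability `pi` and otherwise drawn from Poisson(lam_t). The parameter names, the starting values, and the simple Poisson sampler are illustrative assumptions; the negative binomial variant and the EM estimation from the paper are not shown.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate by Knuth's multiplication method.

    Adequate for the small intensities used in this sketch.
    """
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_zi_ingarch(n, omega, alpha, beta, pi, seed=0):
    """Simulate n steps; returns the counts and the conditional intensities."""
    rng = random.Random(seed)
    ys, lams = [], []
    y_prev, lam_prev = 0, omega  # arbitrary starting values (assumption)
    for _ in range(n):
        # Conditional intensity depends on the previous count and intensity.
        lam = omega + alpha * y_prev + beta * lam_prev
        # Structural zero with probability pi, otherwise a Poisson draw.
        y = 0 if rng.random() < pi else poisson_draw(rng, lam)
        ys.append(y)
        lams.append(lam)
        y_prev, lam_prev = y, lam
    return ys, lams
```

A call such as `simulate_zi_ingarch(200, omega=1.0, alpha=0.3, beta=0.4, pi=0.2)` produces a series whose intensity rises after large counts and decays after runs of zeros, which is the volatility behavior the abstract refers to.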