• Title/Summary/Keyword: Poisson and negative binomial Regression model

Analysis of Food Poisoning via Zero Inflation Models

  • Jung, Hwan-Sik;Kim, Byung-Jip;Cho, Sin-Sup;Yeo, In-Kwon
    • The Korean Journal of Applied Statistics / v.25 no.5 / pp.859-864 / 2012
  • Poisson regression and negative binomial regression are commonly used to analyze count data; however, these models are unsuitable for fitting zero-inflated data, which contain more zero-valued observations than the models expect. In this paper, we review zero-inflated regression, in which a Bernoulli process and a counting process are hierarchically mixed. Zero-inflated regression is known to model the over-dispersion problem efficiently. The Vuong statistic is employed to compare the performance of the zero-inflated models with that of other standard models.
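The hierarchical mixture described in this abstract can be illustrated with a minimal sketch of the zero-inflated Poisson probability mass function. The function names and the parameter values `pi` (probability of a structural zero) and `lam` (Poisson mean) are assumptions made for illustration; they are not taken from the paper, and no covariates are modeled.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Ordinary Poisson probability P(Y = k) with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k: int, pi: float, lam: float) -> float:
    """Zero-inflated Poisson: with probability pi the outcome is a
    structural zero, otherwise it is drawn from Poisson(lam)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def zip_mean_var(pi: float, lam: float) -> tuple[float, float]:
    """Mean and variance of the zero-inflated Poisson distribution."""
    mean = (1.0 - pi) * lam
    var = mean * (1.0 + pi * lam)  # exceeds the mean whenever pi > 0
    return mean, var


# Illustrative parameter values only (hypothetical).
print(zip_pmf(0, 0.3, 2.0), poisson_pmf(0, 2.0))
print(zip_mean_var(0.3, 2.0))
```

Because the variance exceeds the mean whenever `pi` is positive, the extra zeros by themselves produce over-dispersion relative to a plain Poisson model.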

Development of Roundabout Accident Models by Region (지역별 회전교차로 사고모형 개발 및 논의)

  • Son, Seul Ki;Park, Byung Ho
    • International Journal of Highway Engineering / v.20 no.2 / pp.67-74 / 2018
  • PURPOSES: The goal of this study is to develop roundabout accident models for urban and non-urban areas. METHODS: This study performed a comparative analysis of the regional factors affecting accidents. Traffic accident data were collected for the period 2010-2014 from the TAAS data set of the Road Traffic Authority. Poisson and negative binomial regression models were used to develop the roundabout accident models. A total of 25 explanatory variables, such as geometry and traffic volume, were used. RESULTS: The key findings are as follows. First, the null hypotheses that the number of accidents is the same were rejected. Second, three statistically significant Poisson regression accident models ($\rho^2$ of 0.154 and 0.385) were developed. Third, although the variable common to the three models (models I-III) is the number of entry lanes, the model-specific variables are entry lane width, roundabout sign, number of circulatory roadways, splitter island, number of exit lanes, exit lane width, number of approach roads, and truck apron. CONCLUSIONS: The results of this study can suggest countermeasures for reducing the number of roundabout accidents.

The study on the determinants of the number of job changes (중소기업 청년인턴 이직횟수 결정요인 분석)

  • Park, Sungik;Ryu, Jangsoo;Kim, Jonghan;Cho, Jangsik
    • Journal of the Korean Data and Information Science Society / v.26 no.2 / pp.387-397 / 2015
  • In this paper, the determinants of the number of job changes in the SMEs (small and medium enterprises) youth-intern project are analysed using the SMEs youth-intern DB and the employment insurance DB. Since the number of job changes is count data that takes non-negative integer values, general linear regression analysis is inappropriate. Therefore, four models are fitted to the count data: the Poisson regression model, the zero-inflated Poisson regression model, the negative binomial regression model, and the zero-inflated negative binomial regression model. The zero-inflated negative binomial regression model is selected as the best model. The major results are as follows. First, the number of job changes is significantly smaller in the treatment group than in the control group. Second, the number of job changes is significantly smaller in the young-age group than in the old-age group. Third, the number of job changes is significantly greater for men than for women. Lastly, the number of job changes is significantly smaller in bigger firms than in smaller firms.

A Study on the Socio-economic Characteristics of the Angler Population and the Estimation of A Fishing Frequency Function (유어낚시인구의 사회경제학적 특성과 출조빈도함수의 추정에 관한 연구)

  • Park Cheol-Hyung
    • The Journal of Fisheries Business Administration / v.36 no.1 s.67 / pp.81-101 / 2005
  • This article estimates the fishing frequency function in the Korean recreational fishery with respect to the socio-economic characteristics of anglers. First, the study describes the characteristics of the entire angler population in terms of 9 socio-economic variables. The study then divides the total angler population into three groups (in-land, sea, and mixed angler populations) in order to investigate the differences in their characteristics. By testing heterogeneity in the frequency table, the study confirms differences in regions, size of regions, and educational levels between the in-land and the sea angler populations. The fishing frequency function is estimated using a Poisson regression model in order to accommodate the count data (non-negative discrete random variable) nature of the fishing frequency. However, a model specification error is found due to overdispersion of the data, and the model exhibits a lack of goodness of fit. The negative binomial regression model is therefore adopted as an alternative estimation methodology to address the overdispersion. With this model, the study confirms that overdispersion no longer exists and that the goodness of fit improves significantly to a reasonable level. The estimation results for fishing frequency under the negative binomial regression models are as follows. The three variables of region, sex, and education affect the decision-making process for fishing frequency in the case of the in-land recreational fishery. The three variables of sex, age, and marital status play the same role in the case of the sea angler population. Among the remaining variables, both income and use of the Internet affect the process in the mixed angler population. Finally, the results for the whole angler population show that all of the previous variables are statistically significant, because the data combine all three sub-groups of the angler population.
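The overdispersion diagnosis mentioned in this abstract can be sketched as a comparison of the sample variance with the sample mean, which are equal under a Poisson model. The helper names, the NB2 variance form `mu + alpha * mu**2`, and the counts below are assumptions for illustration; the paper's own data and test are not reproduced here.

```python
from statistics import mean, variance


def dispersion_index(counts: list[int]) -> float:
    """Sample variance divided by sample mean; about 1 under a Poisson
    model, clearly above 1 when the data are overdispersed."""
    return variance(counts) / mean(counts)


def nb2_variance(mu: float, alpha: float) -> float:
    """Variance under the NB2 negative binomial parameterisation.
    alpha = 0 recovers the Poisson case (variance equals the mean)."""
    return mu + alpha * mu ** 2


# Made-up trip counts, for illustration only (not from the paper).
trips = [0, 0, 1, 1, 2, 3, 5, 8, 12, 20]
print(dispersion_index(trips))
```

A dispersion index well above 1 is the kind of evidence that motivates replacing the Poisson model with a negative binomial one, whose extra parameter `alpha` lets the variance grow faster than the mean.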

Rear-end Accident Models of Rural Area Signalized Intersections in the Cases of Cheongju and Cheongwon (청주.청원 지방부 신호교차로의 후미추돌 사고모형)

  • Park, Byoung-Ho;In, Byung-Chul
    • International Journal of Highway Engineering / v.11 no.2 / pp.151-158 / 2009
  • This study deals with rear-end collisions in rural areas. The objectives of this study are 1) to analyze the characteristics of rear-end accidents at signalized intersections, and 2) to develop accident models for Cheongju-Cheongwon. In pursuing the above, this study pays particular attention to comparing the characteristics of urban and rural areas. The dependent variables are the number of accidents and the value of EPDO (equivalent property damage only), and the independent variables are the traffic volumes and geometric elements. The main results are as follows. First, the statistical analyses show that the Poisson accident model using the number of accidents as the dependent variable and the negative binomial accident model using the value of EPDO are both statistically significant. Second, the independent variables of the Poisson model are the ratio of high-occupancy vehicles, total traffic volume, and the sum of exit/entry, and those of the negative binomial regression are the main road width, total traffic volume, and the ratio of high-occupancy vehicles. Finally, the independent variables specific to the rural area are the main road width, the ratio of high-occupancy vehicles, and the sum of exit/entry.

Modeling clustered count data with discrete weibull regression model

  • Yoo, Hanna
    • Communications for Statistical Applications and Methods / v.29 no.4 / pp.413-420 / 2022
  • In this study we adapt the discrete Weibull regression model for clustered count data. The discrete Weibull regression model has the attractive feature that it can handle both under- and over-dispersed data. We analyzed the eighth Korean National Health and Nutrition Examination Survey (KNHANES VIII) from 2019 to assess the factors influencing the 1-month outpatient stay in 17 different regions. We compared the results of the clustered discrete Weibull regression model with those of the Poisson, negative binomial, generalized Poisson, and Conway-Maxwell Poisson regression models, which are widely used in count data analyses. The results show that the clustered discrete Weibull regression model with a random intercept gives the best fit. A simulation study is also conducted to investigate the performance of the clustered discrete Weibull model under various dispersion settings and zero-inflation probabilities. The paper shows that using a random effect with discrete Weibull regression can flexibly model count data with various degrees of dispersion without the risk of making wrong assumptions about the data dispersion.
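The claim that a discrete Weibull model covers both under- and over-dispersion can be checked numerically with a small sketch. It assumes the common type I parameterisation, P(Y = y) = q^(y^beta) - q^((y+1)^beta) with 0 < q < 1 and beta > 0, which may differ from the one used in the paper; the function names, the truncation point `y_max`, and the parameter values are hypothetical, and no regression or random effect is included.

```python
def dw_pmf(y: int, q: float, beta: float) -> float:
    """Type I discrete Weibull probability P(Y = y) for y = 0, 1, 2, ..."""
    return q ** (y ** beta) - q ** ((y + 1) ** beta)


def dw_mean_var(q: float, beta: float, y_max: int = 5000) -> tuple[float, float]:
    """Mean and variance computed by summing the pmf up to y_max.
    The truncation is an approximation that is adequate for the
    parameter values used below."""
    probs = [dw_pmf(y, q, beta) for y in range(y_max)]
    mean = sum(y * p for y, p in enumerate(probs))
    second_moment = sum(y * y * p for y, p in enumerate(probs))
    return mean, second_moment - mean ** 2


# Illustrative parameter values only (hypothetical).
under_mean, under_var = dw_mean_var(q=0.9, beta=3.0)
over_mean, over_var = dw_mean_var(q=0.5, beta=0.5)
print(under_var < under_mean)  # variance below the mean
print(over_var > over_mean)    # variance above the mean
```

In this sketch the shape parameter `beta` moves the distribution from one dispersion regime to the other, which is the flexibility the abstract refers to.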

Fit of the number of insurance solicitor's turnovers using zero-inflated negative binomial regression (영과잉 음이항회귀 모형을 이용한 보험설계사들의 이직횟수 적합)

  • Chun, Heuiju
    • Journal of the Korean Data and Information Science Society / v.28 no.5 / pp.1087-1097 / 2017
  • This study aims to find the best model for fitting the number of turnovers of insurance solicitors at life insurance companies, using count data regression models: Poisson regression, negative binomial regression, zero-inflated Poisson regression, and zero-inflated negative binomial regression. Of the four models, the zero-inflated negative binomial model was selected based on the AIC and SBC criteria, owing to over-dispersion and a high proportion of zero counts. The significant factors affecting an insurance solicitor's turnover were found to be the work period in the current company, the total work period as a financial planner, the affiliated corporation, and channel management satisfaction. We also found that the number of turnovers increases as job satisfaction or channel management satisfaction gets lower. In addition, the total work period as a financial planner has a positive relationship with the number of turnovers, whereas the work period in the current company has a negative relationship with it.
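The AIC and SBC comparison mentioned in this abstract follows the standard definitions, AIC = 2k - 2 ln L and SBC = k ln n - 2 ln L, where k is the number of parameters, n the number of observations, and ln L the maximized log-likelihood. The sketch below assumes those definitions; the helper `pick_best` and the placeholder fits are hypothetical and do not reproduce any result from the paper.

```python
import math


def aic(log_lik: float, n_params: int) -> float:
    """Akaike information criterion; smaller is better."""
    return 2 * n_params - 2 * log_lik


def sbc(log_lik: float, n_params: int, n_obs: int) -> float:
    """Schwarz Bayesian criterion (also called BIC); smaller is better."""
    return n_params * math.log(n_obs) - 2 * log_lik


def pick_best(fits: dict[str, tuple[float, int]], n_obs: int) -> tuple[str, str]:
    """Return the names of the models with the smallest AIC and SBC.
    fits maps a model name to (log-likelihood, parameter count)."""
    by_aic = min(fits, key=lambda name: aic(*fits[name]))
    by_sbc = min(fits, key=lambda name: sbc(*fits[name], n_obs))
    return by_aic, by_sbc


# Placeholder fits with generic names; the values are hypothetical.
fits = {"model_a": (-520.0, 5), "model_b": (-455.0, 11)}
print(pick_best(fits, n_obs=400))
```

SBC penalizes each extra parameter by ln n rather than 2, so the two criteria can disagree when a richer model improves the likelihood only slightly.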

Developing the Pedestrian Accident Models of Intersections using Tobit Model (토빗모형을 이용한 교차로 보행자 사고모형 개발)

  • Lee, Seung Ju;Lim, Jin Kang;Park, Byung Ho
    • Journal of the Korean Society of Safety / v.29 no.5 / pp.154-159 / 2014
  • This study deals with pedestrian accidents at intersections in Cheongju. The objective is to develop pedestrian accident models using the Tobit regression model. To this end, pedestrian accident data from 2007 to 2011 were collected from the TAAS data set of the Road Traffic Authority. Poisson, negative binomial, and Tobit regression models were used to analyze the accidents. The dependent variable was the number of accidents per intersection. The independent variables were traffic volume, intersection geometric structure, and transportation facilities. The main results are as follows. First, the Tobit model was judged to be more appropriate than the other models, and all of the models were found to be statistically significant. Second, the main accident-related variables adopted in the Tobit model were traffic volume, pedestrian volume, number of traffic islands, crossing length, and pedestrian countdown signal systems.

A Study on Impact of Factors Influencing Maritime Freight Rates Using Poisson and Negative Binomial Regression Analysis on Blank Sailings of Shipping Companies (포아송 및 음이항 회귀분석을 이용한 해상운임 결정요인이 해운선사의 블랭크 세일링에 미치는 영향 분석 연구)

  • Won-Hyeong Ryu;Hyung-Sik Nam
    • Journal of Navigation and Port Research / v.48 no.1 / pp.62-77 / 2024
  • In the maritime shipping industry, the imbalance between supply and demand has persistently increased, leading major shipping companies worldwide to use blank sailings as a key means of flexibly adjusting vessel capacity in response to shipping market conditions. Traditionally, blank sailings have been frequently implemented around the Chinese New Year period. However, due to unique circumstances such as the global pandemic starting in 2020 and trade tensions between the United States and China, shipping companies have recently conducted blank sailings on a larger scale than in the past. As blank sailings directly cause freight transport delays, they can have negative repercussions for both businesses and consumers. Therefore, this study employed Poisson regression models and negative binomial regression models to analyze the influence of maritime freight rate determinants on shipping companies' decisions regarding blank sailings, aiming to proactively address potential consequences. The results indicated that, in the Poisson regression analysis for 2M, the significant variables included global container shipping volume, container vessel capacity, container ship scrapping volume, container ship newbuilding index, and OECD inflation. In the negative binomial regression analysis, Ocean Alliance showed significance for global container shipping volume and container ship order volume; THE Alliance for container ship capacity and interest rates; non-alliance carriers for international oil prices, the global supply chain pressure index, container ship capacity, and OECD inflation; and the total across alliances for container ship capacity and interest rates.

Bayesian Analysis for the Zero-inflated Regression Models (영과잉 회귀모형에 대한 베이지안 분석)

  • Jang, Hak-Jin;Kang, Yun-Hee;Lee, S.;Kim, Seong-W.
    • The Korean Journal of Applied Statistics / v.21 no.4 / pp.603-613 / 2008
  • We often encounter situations in which discrete count data have a large proportion of zeros. In this case, it is not appropriate to analyze the data with standard regression models such as the Poisson or negative binomial regression models. In this article, we consider Bayesian analysis for two commonly used models: the zero-inflated Poisson and zero-inflated negative binomial regression models. We use the Bayes factor as a model selection tool, and computation is carried out via Markov chain Monte Carlo methods. Crash count data are analyzed to support the theoretical results.