• Title/Summary/Keyword: Negative binomial regression

Bivariate Zero-Inflated Negative Binomial Regression Model with Heterogeneous Dispersions (서로 다른 산포를 허용하는 이변량 영과잉 음이항 회귀모형)

  • Kim, Dong-Seok;Jeong, Seul-Gi;Lee, Dong-Hee
    • Communications for Statistical Applications and Methods / v.18 no.5 / pp.571-579 / 2011
  • We propose a new bivariate zero-inflated negative binomial regression model that allows heterogeneous dispersions. To assess its performance, we use the health care data of Deb and Trivedi (1997) to compare it with the bivariate zero-inflated negative binomial model proposed by Wang (2003), which assumes a common dispersion for the two response variables. In this empirical study, the proposed model gives better results in terms of log-likelihood and AIC.

The Data-based Prediction of Police Calls Using Machine Learning (기계학습을 활용한 데이터 기반 경찰신고건수 예측)

  • Choi, Jaehun
    • The Journal of Bigdata / v.3 no.2 / pp.101-112 / 2018
  • The purpose of this study is to predict the number of police calls using a neural network, a machine learning method, and negative binomial regression, based on data on 112 police calls received by the Chungnam Provincial Police Agency from June 2016 to May 2017. The following variables that may affect police calls were selected to develop the prediction model: time, holiday, the day before a holiday, season, temperature, precipitation, wind speed, jurisdictional area, population, the number of foreigners, single house rate and other house rate. Some variables show a positive correlation and others a negative one. The comparison of the methods can be summarized as follows. The neural network achieves a correlation coefficient of 0.7702 between predicted and actual values, with an RMSE of 2.557. Negative binomial regression, on the other hand, achieves a correlation coefficient of 0.7158 with an RMSE of 2.831. The neural network has low interpretability but better predictive performance than negative binomial regression. Based on the prediction model, the police agency can allocate manpower optimally for given values of the selected variables.
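
The two comparison metrics quoted in this abstract (the correlation coefficient between predicted and actual values, and RMSE) can be computed as in the minimal sketch below. The function names and the sample numbers are made up for illustration and are not taken from the paper.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical hourly call counts and model predictions (illustrative only).
actual = [3, 5, 2, 8, 6, 4]
predicted = [2.5, 5.5, 3.0, 7.0, 6.5, 3.5]
print(rmse(actual, predicted), pearson_r(actual, predicted))
```

A lower RMSE and a higher correlation both point the same way in the abstract, which is why the neural network is described as the better predictor.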

Accident Models of Circular Intersections by Weather Condition in Korea (기상상태에 따른 국내 원형교차로 사고모형)

  • Park, Byung Ho;Han, Su San
    • Journal of the Korean Society of Safety / v.27 no.6 / pp.178-184 / 2012
  • This study deals with traffic accidents by weather condition. The objectives are to comparatively analyze their characteristics and to develop accident models by weather condition. To this end, the paper gives particular attention to testing the differences between two groups and to developing models (Poisson and negative binomial regression) using data on domestic circular intersections. The main results are as follows. First, three Poisson models and one negative binomial model, all statistically significant, were developed using the number of accidents and EPDO under clear weather and other weather as the dependent variables. Second, the differences between the two models were comparatively analyzed using the chosen variables. The results may offer implications for traffic safety policy-making to reduce and prevent traffic accidents at circular intersections.

Risk Factors Influencing Probability and Severity of Elder Abuse in Community-dwelling Older Adults: Applying Zero-inflated Negative Binomial Modeling of Abuse Count Data (영과잉 가산자료(Zero-inflated Count Data) 분석 방법을 이용한 지역사회 거주 노인의 노인학대 발생과 심각성에 미치는 위험요인 분석)

  • Jang, Mi Heui;Park, Chang Gi
    • Journal of Korean Academy of Nursing / v.42 no.6 / pp.819-832 / 2012
  • Purpose: This study was conducted to identify risk factors that influence the probability and severity of elder abuse in community-dwelling older adults. Methods: This study was a cross-sectional descriptive study. Self-report questionnaires were used to collect data from community-dwelling Koreans aged 65 and older (N=416). Logistic regression, negative binomial regression and zero-inflated negative binomial regression models for abuse count data were used to determine risk factors for elder abuse. Results: The proportion of older adults who experienced any one category of abuse was 32.5%. In the zero-inflated negative binomial regression analysis, the experience of verbal-psychological abuse was associated with marital status and family support, while the experience of physical abuse was associated with self-esteem, perceived economic stress and family support. Family support was found to be a salient risk factor for the probability of abuse in both verbal-psychological and physical abuse. Self-esteem was found to be a salient risk factor for the probability and severity of abuse in physical abuse alone. Conclusion: The findings suggest that tailored prevention and intervention that consider both the type of elder abuse and the target population might improve the efficiency of elder abuse prevention.

Analysis of Food Poisoning via Zero Inflation Models

  • Jung, Hwan-Sik;Kim, Byung-Jip;Cho, Sin-Sup;Yeo, In-Kwon
    • The Korean Journal of Applied Statistics / v.25 no.5 / pp.859-864 / 2012
  • Poisson regression and negative binomial regression are commonly used to analyze count data; however, these models are unsuitable for fitting zero-inflated data, which contain an excess of zero-valued observations. In this paper, we review zero-inflated regression, in which a Bernoulli process and a counting process are hierarchically mixed. Zero-inflated regression is known to model the over-dispersion problem efficiently. The Vuong statistic is employed to compare the performance of the zero-inflated models with that of other standard models.
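
The hierarchical mixture of a Bernoulli process and a counting process described in this abstract can be sketched as a probability mass function. This is a minimal illustration, assuming a negative binomial count component parameterized by a mean `mu` and a dispersion `alpha` (variance `mu + alpha * mu**2`) and a structural-zero probability `pi`; these names and this parameterization are assumptions, not notation from the paper.

```python
import math

def nb_pmf(y, mu, alpha):
    """Negative binomial pmf with mean mu > 0 and dispersion alpha > 0."""
    r = 1.0 / alpha  # size parameter
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + mu))
             + y * math.log(mu / (r + mu)))
    return math.exp(log_p)

def zinb_pmf(y, pi, mu, alpha):
    """Zero-inflated NB: a structural zero with probability pi,
    otherwise a draw from the negative binomial count process."""
    base = (1.0 - pi) * nb_pmf(y, mu, alpha)
    return pi + base if y == 0 else base

# Zero inflation raises P(Y = 0) above what the plain count model gives.
print(nb_pmf(0, mu=2.0, alpha=0.5), zinb_pmf(0, pi=0.3, mu=2.0, alpha=0.5))
```

Under this sketch the mean of the mixture is `(1 - pi) * mu`, so the extra zeros pull the mean down while inflating the variance relative to it.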

Forecasting hierarchical time series for foodborne disease outbreaks (식중독 발생 건수에 대한 계층 시계열 예측)

  • In-Kwon Yeo
    • The Korean Journal of Applied Statistics / v.37 no.4 / pp.499-508 / 2024
  • In this paper, we investigate hierarchical time series forecasting, in which predicted values adhere to a hierarchical structure, by analyzing segmented data as well as aggregated datasets. The occurrences of food poisoning by a specific pathogen are analyzed using zero-inflated Poisson regression models and negative binomial regression models. The occurrences of major, miscellaneous, and overall food poisoning are analyzed using Poisson regression models and negative binomial regression models. For hierarchical time series forecasting, the MinT estimation proposed by Wickramasuriya et al. (2019) is employed. Negative predicted values resulting from hierarchical adjustments are set to zero, and the remaining lowest-level variables are multiplied by weights so that the hierarchical structure is satisfied. Empirical analysis revealed little difference between hierarchical and non-hierarchical adjustments in predictions based on pathogens. However, hierarchical adjustments generally yield superior results for predictions of major, miscellaneous, and overall occurrences. Without hierarchical adjustment, the predicted frequencies of the lowest-level variables may exceed those of major or miscellaneous occurrences. The proposed method, by contrast, yields predictions that adhere to the hierarchical structure.
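
The non-negativity step mentioned in this abstract (negative predictions set to zero, then the remaining lowest-level values reweighted) might look like the sketch below. The abstract does not say how the weights are chosen, so a single proportional weight that makes the children sum to their parent's forecast is assumed here; the function name `clip_and_rescale` and the numbers are hypothetical.

```python
def clip_and_rescale(bottom, parent_total):
    """Set negative lowest-level forecasts to zero, then rescale the rest
    by one common weight so they sum to the parent-level forecast.
    ASSUMPTION: proportional weights; the paper's weights may differ."""
    clipped = [max(x, 0.0) for x in bottom]
    s = sum(clipped)
    if s == 0:
        return clipped  # nothing left to rescale
    w = parent_total / s
    return [w * x for x in clipped]

# Hypothetical reconciled forecasts for three pathogens under one parent.
adjusted = clip_and_rescale([4.0, -1.0, 7.0], parent_total=10.0)
print(adjusted)
```

After this step every lowest-level forecast is non-negative and none can exceed the parent total, which is the coherence property the abstract emphasizes.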

Development of Recognition and Reaction Time Prediction Model in Road Signs using Negative Binomial Regression (음이항회귀식을 이용한 도로표지의 인지반응시간 추정모형 개발)

  • Park, Hyung-Jin;Lee, Ki-Young;Kim, Jung-Young
    • Journal of the Ergonomics Society of Korea / v.25 no.4 / pp.23-33 / 2006
  • The purpose of this study is to determine an economical standard for road signs by verifying how drivers' recognition and reaction time differs according to the space rate of the letters on road signs. To this end, indoor simulations were conducted to confirm the differences in recognition and reaction time for six sign targets with different space rates. In addition, a negative binomial regression model was used to find the main factors that could lower the rate of misreading. According to this model, the legibility of a sign is increased not simply by enlarging the sign but also by suitably matching the letters to the sign. The results of this study verify the importance of the space rate in road signs and can be used as an effective method for determining road sign standards.

A Bayesian zero-inflated negative binomial regression model based on Pólya-Gamma latent variables with an application to pharmaceutical data (폴랴-감마 잠재변수에 기반한 베이지안 영과잉 음이항 회귀모형: 약학 자료에의 응용)

  • Seo, Gi Tae;Hwang, Beom Seuk
    • The Korean Journal of Applied Statistics / v.35 no.2 / pp.311-325 / 2022
  • Excess zeros often occur in count responses in various research fields. The zero-inflated model is a common choice for modeling such count data. Bayesian inference for the zero-inflated model has long been recognized as a hard problem because the conditional posterior distribution does not have a closed form. Recently, however, Pillow and Scott (2012) and Polson et al. (2013) proposed a Pólya-Gamma data-augmentation strategy for logistic and negative binomial models, facilitating Bayesian inference for the zero-inflated model. We apply a Bayesian zero-inflated negative binomial regression model to longitudinal pharmaceutical data previously analyzed by Min and Agresti (2005). To facilitate posterior sampling for the longitudinal zero-inflated model, we use the Pólya-Gamma data-augmentation strategy.

Analysis of Disaster Occurrences in Mongolia Based on Climatic Variables (기후변수를 기반으로 한 몽골 재해발생 분석)

  • Da Hye Lee;Onon-Ujin Otgonbayar;In Hong Chang
    • Journal of Integrative Natural Science / v.17 no.3 / pp.93-103 / 2024
  • Mongolia's diverse geographical landscape and harsh climate make it particularly susceptible to various natural disasters, including forest fires, heavy rains, dust storms, and heavy snow. This study aims to explore the relationships between key climatic variables and the frequency of these disasters. We collected monthly data from January 2022 to April 2024, encompassing average temperature, temperature variability (absolute temperature difference), average humidity, and precipitation across the capitals of Mongolia's 21 provinces and the capital city Ulaanbaatar. The data were analyzed using multiple statistical models: Linear Regression, Poisson Regression, and Negative Binomial Regression. Descriptive statistics provided initial insights into the variability and distribution of the climatic variables and disaster occurrences. The models aimed to identify significant predictors and quantify their impact on disaster frequencies. Our approach involved standardizing the predictor variables to ensure comparability and interpretability of the regression coefficients. Our findings indicate that climatic variables significantly affect the frequency of natural disasters. The Negative Binomial Regression model was particularly suitable for our data, which exhibited overdispersion, a common characteristic of count data such as disaster occurrences. Understanding these relationships is crucial for developing targeted disaster management strategies and policies to mitigate the adverse effects of climate change on Mongolian communities. This research provides valuable insights into how climatic changes impact disaster occurrences, offering a foundation for informed decision-making and policy development to enhance community resilience.

A Study on the Socio-economic Characteristics of the Angler Population and the Estimation of A Fishing Frequency Function (유어낚시인구의 사회경제학적 특성과 출조빈도함수의 추정에 관한 연구)

  • Park Cheol-Hyung
    • The Journal of Fisheries Business Administration / v.36 no.1 s.67 / pp.81-101 / 2005
  • This article estimates the fishing frequency function in the Korean recreational fishery with respect to the socio-economic characteristics of anglers. First, the study describes the characteristics of the entire angler population in terms of 9 socio-economic variables. The study then divides the total angler population into three groups (in-land, sea, and mixed angler populations) in order to investigate the differences in their characteristics. By testing heterogeneity in the frequency table, the study confirms differences in region, size of region, and educational level between the in-land and the sea angler populations. The fishing frequency function is estimated using a Poisson regression model in order to accommodate the count data (non-negative discrete random variable) nature of fishing frequency. However, a model specification error is found due to overdispersion of the data, and the model exhibits a lack of goodness of fit. The negative binomial regression model is adopted as an alternative estimation methodology to address the overdispersion. With this model, the study confirms that overdispersion no longer exists and that the goodness of fit improves significantly to a reasonable level. The results of estimating fishing frequency with the negative binomial regression models are as follows. The three variables of region, sex, and education affect the decision-making process for fishing frequency in the case of the in-land recreational fishery. On the other hand, the three variables of sex, age, and marital status do so in the case of the sea angler population. Among the remaining variables, both income and use of the Internet affect the process in the mixed angler population. Finally, the results for the whole angler population show that all of the previous variables are statistically significant, because the data pool all three sub-groups of the angler population.
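
The overdispersion that leads this abstract from a Poisson to a negative binomial model can be checked informally by comparing the sample variance of the counts with their mean. The sketch below is a marginal, covariate-free check, not the regression diagnostic used in the paper; the function name, the method-of-moments estimate of the dispersion `alpha` (from variance = mean + alpha × mean²), and the sample data are all assumptions for illustration.

```python
from statistics import mean, variance

def dispersion_summary(counts):
    """Return (variance-to-mean ratio, moment estimate of NB dispersion).
    A ratio well above 1 suggests overdispersion relative to Poisson."""
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    ratio = v / m
    alpha = max((v - m) / m ** 2, 0.0)  # 0 means no excess over Poisson
    return ratio, alpha

# Hypothetical yearly fishing-trip counts for ten anglers (made up).
trips = [0, 0, 1, 1, 2, 2, 3, 5, 8, 20]
ratio, alpha = dispersion_summary(trips)
print(f"variance/mean = {ratio:.2f}, alpha = {alpha:.2f}")
```

Under a Poisson model the ratio would be close to 1; a few heavy users, as in the made-up data, are enough to push it far higher.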
