• Title/Summary/Keyword: Zero-Inflated

Prediction of K-league soccer scores using bivariate Poisson distributions (이변량 포아송분포를 이용한 K-리그 골 점수의 예측)

  • Lee, Jang Taek
    • Journal of the Korean Data and Information Science Society / v.25 no.6 / pp.1221-1229 / 2014
  • In this paper we choose the best model among several bivariate Poisson models fitted to Korean soccer data. The models considered allow for correlation between the numbers of goals scored by the two competing teams. We use the R package bivpois for bivariate Poisson regression models, with K-league data from the 1983-2012 seasons. We conclude that the best-fitting model according to AIC and BIC is the bivariate Poisson model with constant covariance. The zero-inflated and diagonal-inflated models did not improve the model fit. The model can be used to examine the home-away effect, goodness of fit, and attack and defense parameters.

Predictors for Aggressive Behavior of Patients with Mental Illness in a Closed Psychiatric Ward using Zero-Inflated Poisson Regression: A Retrospective Study (영과잉포아송회귀분석을 활용한 안정병동에 입원한 정신질환자의 공격행동 예측요인)

  • Kim, Jung Ho;Shin, Sung Hee
    • Journal of East-West Nursing Research / v.28 no.2 / pp.160-169 / 2022
  • Purpose: This study was conducted to identify predictors of aggressive behavior in patients with mental illness admitted to a closed psychiatric ward. Methods: This study adopted a retrospective design, analyzing the hospital medical records of 363 patients with mental illness admitted to the closed psychiatric ward of a university hospital in Seoul, Korea. The collected data were analyzed using IBM SPSS 20.0 and STATA 12.0 SE. ZIP (Zero-Inflated Poisson) and count data analyses were used to identify the factors influencing the occurrence and frequency of aggressive behavior. Results: The ZIP model showed that the factors influencing the probability of non-occurrence of aggressive behavior were anxiety, non-adherence, and frustration. In addition, the factors influencing the frequency of aggressive behavior were bipolar disorder and personality disorder traits. Conclusion: We found that bipolar disorder, frustration, and non-adherence increase the likelihood of aggressive behavior in patients with mental illness. In particular, patients diagnosed with bipolar disorder were 1.95 times more likely to engage in repetitive aggressive behavior than those without that diagnosis. However, since the results differed from previous studies, further studies on the traits of anxiety and personality disorders are needed.
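
For readers unfamiliar with the two-part structure this abstract refers to (one set of factors for occurrence, another for frequency), the textbook ZIP regression model can be written as below. This is the standard formulation, not necessarily the exact specification used in the paper; the symbols $\pi_i$, $\lambda_i$, $z_i$, $x_i$, $\gamma$, and $\beta$ are notation assumed here for illustration.

```latex
\begin{aligned}
P(Y_i = 0) &= \pi_i + (1 - \pi_i)\, e^{-\lambda_i}, \\
P(Y_i = y) &= (1 - \pi_i)\, \frac{e^{-\lambda_i} \lambda_i^{y}}{y!}, \qquad y = 1, 2, \dots, \\
\operatorname{logit}(\pi_i) &= z_i^{\top} \gamma, \qquad \log(\lambda_i) = x_i^{\top} \beta .
\end{aligned}
```

Here $\pi_i$ is the probability of a structural zero (no aggressive behavior at all) and $\lambda_i$ is the Poisson mean of the count part, so covariates in $z_i$ act on occurrence while covariates in $x_i$ act on frequency.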

The study on the determinants of the number of job changes (중소기업 청년인턴 이직횟수 결정요인 분석)

  • Park, Sungik;Ryu, Jangsoo;Kim, Jonghan;Cho, Jangsik
    • Journal of the Korean Data and Information Science Society / v.26 no.2 / pp.387-397 / 2015
  • In this paper, the determinants of the number of job changes in the SMEs (small and medium enterprises) youth-intern project are analysed, utilizing the SMEs youth-intern DB and the employment insurance DB. Since the number of job changes is count data taking non-negative integer values, ordinary linear regression analysis is inappropriate. Therefore, four models are fitted to the count data: the Poisson regression model, the zero-inflated Poisson regression model, the negative binomial regression model, and the zero-inflated negative binomial regression model. The zero-inflated negative binomial regression model is selected as the best model. The major results are as follows. First, the number of job changes is significantly smaller in the treatment group than in the control group. Second, the number of job changes is significantly smaller in the young-age group than in the old-age group. Third, the number of job changes is significantly greater for men than for women. Lastly, the number of job changes is significantly smaller in larger firms than in smaller firms.
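
To make the idea of choosing among count models concrete, here is a minimal sketch that fits an intercept-only Poisson model and an intercept-only ZIP model to a toy sample and compares them by AIC. Everything in it is an assumption for illustration: the data are made up, no covariates are used, the ZIP fit uses a simple EM iteration, and the abstract does not state which criterion the authors actually used for model selection.

```python
import math

def poisson_logpmf(y, lam):
    # log of the Poisson probability mass function
    return y * math.log(lam) - lam - math.lgamma(y + 1)

def zip_logpmf(y, pi, lam):
    # log pmf of a zero-inflated Poisson with structural-zero probability pi
    if y == 0:
        return math.log(pi + (1 - pi) * math.exp(-lam))
    return math.log(1 - pi) + poisson_logpmf(y, lam)

def fit_zip(ys, iters=500):
    # EM for an intercept-only ZIP model (no covariates)
    n = len(ys)
    n0 = sum(1 for y in ys if y == 0)
    total = sum(ys)
    pi, lam = 0.5, total / n
    for _ in range(iters):
        # E-step: probability that an observed zero is a structural zero
        z = pi / (pi + (1 - pi) * math.exp(-lam))
        # M-step: update the mixing weight and the Poisson mean
        pi = n0 * z / n
        lam = total / (n - n0 * z)
    return pi, lam

def aic(loglik, k):
    # Akaike information criterion with k free parameters
    return 2 * k - 2 * loglik

# Hypothetical zero-heavy counts (not data from the paper)
ys = [0] * 60 + [1] * 5 + [2] * 10 + [3] * 12 + [4] * 8 + [5] * 5

lam_pois = sum(ys) / len(ys)  # Poisson MLE is the sample mean
ll_pois = sum(poisson_logpmf(y, lam_pois) for y in ys)

pi_hat, lam_hat = fit_zip(ys)
ll_zip = sum(zip_logpmf(y, pi_hat, lam_hat) for y in ys)

aic_poisson = aic(ll_pois, k=1)
aic_zip = aic(ll_zip, k=2)
print("AIC Poisson:", round(aic_poisson, 2))
print("AIC ZIP:    ", round(aic_zip, 2))
```

With this many zeros, the ZIP model should show the lower AIC. The negative binomial variants compared in the paper add a dispersion parameter and would be assessed in the same way.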

Mixed-effects zero-inflated Poisson regression for analyzing the spread of COVID-19 in Daejeon (혼합효과 영과잉 포아송 회귀모형을 이용한 대전광역시 코로나 발생 동향 분석)

  • Kim, Gwanghee;Lee, Eunjee
    • The Korean Journal of Applied Statistics / v.34 no.3 / pp.375-388 / 2021
  • This paper aims to help prevent the spread of COVID-19 by analyzing confirmed cases of COVID-19 in Daejeon. A high volume of visitors, downtown areas, and psychological fatigue from prolonged social distancing were considered as risk factors associated with the spread of COVID-19. We took the weekly confirmed cases in each administrative district as the response variable. The explanatory variables were the number of passengers getting off at a bus station in each administrative district and the elapsed time since the Korean government had imposed distancing in daily life. We employed a mixed-effects zero-inflated Poisson regression model because the number of cases was measured repeatedly and contained excess zeros. We conducted k-means clustering to identify three groups of administrative districts with different characteristics in terms of the number of bars, the population size, and the distance to the closest college. Considering that the number of confirmed cases might vary with districts' characteristics, the clustering information was incorporated as a categorical explanatory variable. We found that COVID-19 was more prevalent in districts with larger populations and in downtown districts. As the number of passengers getting off in a downtown district increased, the confirmed cases increased significantly.

Development of an Accident Frequency Prediction Model at Rural Multi-Lane Highways (지방부 다차로 도로구간에서의 사고 예측모형 개발 (대도시권 외곽 및 구릉지 특성의 도로구간 중심으로))

  • Lee, Dong-Min;Kim, Do-Hun;Seong, Nak-Mun
    • Journal of Korean Society of Transportation / v.27 no.4 / pp.207-215 / 2009
  • Generally, traffic accidents can be influenced by various driving conditions, including geometric design, roadside design, and traffic conditions. In this study, homogeneous roadway segments were first identified using typical geometric variables obtained from field data collection. The field data were collected on highways located in several areas with differing regional conditions, for example the outskirts of metropolitan cities and level and rolling rural areas. Because the crash database contained many zero cells, a Zero-Inflated Poisson model was used to develop the crash prediction model and avoid overestimated results. It was found that EXPO, radius, grade, guardrail, mountainous terrain, crosswalk, and bus stop have a statistically significant influence on vehicle-to-vehicle crashes on rural multi-lane roadway segments.

Analysis of Elderly Drivers' Accident Models Considering Operations and Physical Characteristics (고령운전자 운전 및 신체특성을 반영한 교통사고 분석 연구)

  • Lim, Sam Jin;Park, Jun Tae;Kim, Young Il;Kim, Tae Ho
    • Journal of Korean Society of Transportation / v.30 no.6 / pp.37-46 / 2012
  • The number of traffic accidents caused by elderly drivers over the age of 65 has surged over the past ten years from 37,000 to 274,000 cases. The proportion of elderly drivers' accidents has jumped 3.1 times, from 1.2% to 3.7% of all traffic accidents, and traffic safety organizations are pursuing diverse measures to address the situation. Above all, connecting safety measures with in-depth research on the behavioral and physical characteristics of elderly drivers will prove vital. This study conducted empirical research linking driving characteristics and traffic accidents by elderly drivers, based on the Driving Aptitude Test items and traffic accident data, which enabled the measurement of behavioral characteristics of elderly drivers. In developing the Influence Model, we applied the zero-inflated Poisson (ZIP) regression model and selected an accident prediction model based on the Bayesian Influence with regard to the ZIP regression model and the zero-inflated negative binomial (ZINB) regression model. According to the results of the AAE analysis, the ZIP regression model was more appropriate, and three variables (prediction of velocity, diversion, and cognitive ability) were found to be related to traffic accidents caused by elderly drivers.

Estimating Travel Frequency of Public Bikes in Seoul Considering Intermediate Stops (경유지를 고려한 서울시 공공자전거 통행발생량 추정 모형 개발)

  • Jonghan Park;Joonho Ko
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.22 no.3 / pp.1-19 / 2023
  • Bikes have recently emerged as an alternative means of achieving carbon neutrality. To understand the demand for public bikes, we estimated the travel frequency of public bikes while considering intermediate stops. Using GPS trajectory data from 'Ttareungyi', a public bike service in Seoul, we identified stay points and estimated travel frequency reflecting population, land use, and physical characteristics. Applying map matching and a stay point detection algorithm revealed that stay points appeared in about 12.1% of all trips. Compared to a trip without a stay point, a trip with a stay point has a longer average travel distance and travel time and a higher occurrence rate during off-peak hours. According to visualization analysis, stay points are mainly found in parks, leisure facilities, and business facilities. To account for stay points, the unit of analysis was set as a hexagonal grid rather than the existing rental station base. Travel frequency considering stay points was analyzed using the Zero-Inflated Negative Binomial (ZINB) model. The results revealed that travel frequency was higher on bike infrastructure where the safety of bike users was secured, such as 'Bikepath' and 'Bike and pedestrian path'. Public bikes also serve as a first- and last-mile means of access to public transportation. Travel frequency was also observed to increase in life and employment centers. Given these results, securing safety facilities and space for users should be given priority when planning any further expansion of bike infrastructure. Moreover, a plan is needed to supply bike infrastructure facilities linked to public transportation, especially the subway.

Threshold-asymmetric volatility models for integer-valued time series

  • Kim, Deok Ryun;Yoon, Jae Eun;Hwang, Sun Young
    • Communications for Statistical Applications and Methods / v.26 no.3 / pp.295-304 / 2019
  • This article deals with threshold-asymmetric volatility models for over-dispersed and zero-inflated time series of count data. We introduce various threshold integer-valued autoregressive conditional heteroscedasticity (ARCH) models that incorporate over-dispersion and zero-inflation via conditional Poisson and negative binomial distributions. The EM algorithm is used to estimate parameters. Cholera data from Kolkata, India, from 2006 to 2011 are analyzed as a real application. To construct the threshold variable, both a time-varying local constant mean and the grand mean are adopted. The data application shows that the threshold model, as an asymmetric version, is useful in modelling the volatility of count time series.

An Analysis of Spatial Determinants of Inventor Networks in Korea (발명자 네트워크의 공간적 결정요인 분석)

  • Jeong, Jun Ho
    • Journal of the Economic Geographical Society of Korea / v.19 no.1 / pp.1-17 / 2016
  • This paper explores the spatial structure of inventor networks and their determinants among 230 shi-gun-gu regions in Korea. It investigates the residence of co-inventors named in Korean patent applications to the Korean Intellectual Property Office and uses a zero-inflated negative binomial model to accommodate the count nature of the dependent variable and its excess zeros. Several variables are found to affect the spatial linkage of inventor networks. Spatial links extend beyond a region if it has more of its own R&D-related specific assets (private R&D, patent productivity, population, education) and if it is physically close to, and technologically similar to, the other region. The assets of the other region play a positive role if, in a similar way, the other region has more R&D-related specific assets.

A Study of Consistency in Estimating the Number of Vacant Jobs Using the Labor Force Survey at Establishments (사업체노동력조사를 활용한 빈 일자리 수 추정에 대한 정합성 연구)

  • Park, Seung-Hwan
    • Asia-Pacific Journal of Business / v.13 no.3 / pp.329-341 / 2022
  • Purpose - The purpose of this study was to investigate consistency in estimating the number of vacant jobs using two business labor force surveys conducted at two different time points. Design/methodology/approach - We studied the cause of the differences in the estimated number of vacant jobs between the monthly sample and the new sample in the business labor force survey. Findings - As the size of a company increases, the number of vacant jobs in the company also increases, and the probability that the number of vacant jobs in the company is zero decreases. Compared to the local sample, the monthly sample was assessed to have a higher likelihood that the number of vacant jobs in a company was zero, and the number of vacant jobs was considerable. Research implications or Originality - Because local survey sample companies tend to minimize the number of vacant jobs even when they reply under the same conditions, the estimated number of vacant jobs in the current monthly survey differs significantly from that of the local survey. Differing "degrees of knowledge of question items," survey methodologies, or investigators could be the causes of the differing response trends.