• Title/Summary/Keyword: Poisson regression

Predicting football scores via Poisson regression model: applications to the National Football League

  • Saraiva, Erlandson F.;Suzuki, Adriano K.;Filho, Ciro A.O.;Louzada, Francisco
    • Communications for Statistical Applications and Methods / v.23 no.4 / pp.297-319 / 2016
  • Football match predictions are of great interest to fans and the sports press, and in recent years they have been the focus of several studies. In this paper, we propose a Poisson regression model to predict football match outcomes. We applied the proposed methodology to two national competitions: the 2012-2013 English Premier League and the 2015 Brazilian Football League. The number of goals scored by each team in a match is assumed to follow a Poisson distribution whose mean reflects the strength of the attack, the defense, and the home team advantage. Inferences about all unknown quantities are made using a Bayesian approach. We calculate the probabilities of a win, draw, and loss for each match using a simulation procedure. Simulation is also used to obtain the probability of a team qualifying for continental tournaments, being crowned champion, or being relegated to the second division.
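
The simulation step described in this abstract can be illustrated with a minimal sketch. It assumes a log-linear parameterization (attack minus opposing defense, plus a home-advantage term) and fixed point values for the parameters, whereas the paper draws them from a Bayesian posterior. The function names, team labels, and parameter values are all hypothetical.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def match_probabilities(home, away, params, n_sims=20000, seed=0):
    """Estimate P(home win), P(draw), P(away win) by simulating scores."""
    rng = random.Random(seed)
    # Assumed log-linear means: attack minus opposing defense,
    # plus a home-advantage term for the home side only.
    rate_home = math.exp(params["home_adv"] + params["attack"][home]
                         - params["defense"][away])
    rate_away = math.exp(params["attack"][away] - params["defense"][home])
    wins = draws = losses = 0
    for _ in range(n_sims):
        goals_home = sample_poisson(rate_home, rng)
        goals_away = sample_poisson(rate_away, rng)
        if goals_home > goals_away:
            wins += 1
        elif goals_home == goals_away:
            draws += 1
        else:
            losses += 1
    return wins / n_sims, draws / n_sims, losses / n_sims


# Made-up parameter values for two hypothetical, equally strong teams.
params = {
    "home_adv": 0.3,
    "attack": {"A": 0.2, "B": 0.2},
    "defense": {"A": 0.1, "B": 0.1},
}
p_win, p_draw, p_loss = match_probabilities("A", "B", params)
```

Because the two teams are given equal strengths, any gap between `p_win` and `p_loss` in this sketch comes from the home-advantage term alone.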

The Analysis of the Number of Donations Based on a Mixture of Poisson Regression Model (포아송 분포의 혼합모형을 이용한 기부 횟수 자료 분석)

  • Kim In-Young;Park Su-Bum;Kim Byung-Soo;Park Tae-Kyu
    • The Korean Journal of Applied Statistics / v.19 no.1 / pp.1-12 / 2006
  • The aim of this study is to analyze survey data on the number of charitable donations using a mixture of two Poisson regression models. The survey of Koreans older than 20 was conducted in 2002 by Volunteer 21, a nonprofit organization. A mixture of two Poisson distributions is used to model the number of donations, based on the empirical distribution of the data. The mixture implies that the whole population is subdivided into two groups, one with a smaller number of donations and the other with a larger number. We fit the mixture of Poisson regression models to the number of donations to identify significant covariates. The expectation-maximization algorithm is employed to estimate the parameters. We computed 95% bootstrap confidence intervals based on the bias-corrected and accelerated method and used them to select significant explanatory variables. As a result, the income variable with four categories and the volunteering variable (1: experience of volunteering, 0: otherwise) turned out to be significant, with positive regression coefficients in both the smaller and the larger donation groups. However, the regression coefficients in the smaller donation group were larger than those in the larger donation group.
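
As a rough illustration of the EM step mentioned in this abstract, the sketch below fits a mixture of two Poisson distributions without covariates. That is a simplification of the paper's regression setting. The function names, starting values, and synthetic data are assumptions made for the example.

```python
import math
import random


def poisson_pmf(k, rate):
    return math.exp(k * math.log(rate) - rate - math.lgamma(k + 1))


def sample_poisson(rate, rng):
    """Knuth's method, used here only to build synthetic data."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def em_two_poisson(counts, n_iter=100):
    """Fit weight * Pois(rate_low) + (1 - weight) * Pois(rate_high) by EM."""
    n = len(counts)
    mean = sum(counts) / n
    weight, rate_low, rate_high = 0.5, 0.5 * mean + 0.1, 1.5 * mean + 0.1
    for _ in range(n_iter):
        # E-step: posterior probability that each count is from the low group.
        resp = []
        for k in counts:
            low = weight * poisson_pmf(k, rate_low)
            high = (1.0 - weight) * poisson_pmf(k, rate_high)
            resp.append(low / (low + high))
        # M-step: update the mixing weight and the responsibility-weighted means.
        total_low = sum(resp)
        weight = total_low / n
        rate_low = sum(r * k for r, k in zip(resp, counts)) / total_low
        rate_high = sum((1 - r) * k for r, k in zip(resp, counts)) / (n - total_low)
    return weight, rate_low, rate_high


# Synthetic counts: 60% from Pois(1) and 40% from Pois(8) (invented values).
rng = random.Random(1)
counts = [sample_poisson(1.0 if rng.random() < 0.6 else 8.0, rng)
          for _ in range(1000)]
weight, rate_low, rate_high = em_two_poisson(counts)
```

In the regression version, each component mean would depend on covariates through a log link, and the M-step would become a weighted Poisson regression fit per component.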

The Reanalysis of the Donation Data Using the Zero-Inflated Poisson Regression (0이 팽창된 포아송 회귀모형을 이용한 기부회수 자료의 재분석)

  • Kim, In-Young;Park, Tae-Kyu;Kim, Byung-Soo
    • The Korean Journal of Applied Statistics / v.22 no.4 / pp.819-827 / 2009
  • Kim et al. (2006) analyzed the donation data surveyed by Volunteer 21 in South Korea in 2002 using a Poisson regression based on a mixture of two Poissons and detected variables that significantly affect the number of donations. However, noting the large deviation between the predicted and the actual frequencies of zero, we develop in this note a Poisson regression model based on a distribution in which a zero-inflation component is added to the mixture of two Poissons. The population distribution is thus a mixture of three Poissons in which one component is concentrated at zero. We used the EM algorithm to estimate the regression parameters and detected the same significant variables as Kim et al. (2006). In addition, we could estimate the proportion of the fixed-zero group to be 0.201, which is a characteristic feature of this model. We also noted that, of the two significant variables, income and volunteer experience (yes, no), the second could be used as a strategic variable for promoting donations.
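
The three-component distribution described in this abstract can be written down directly. The sketch below is only an illustration without covariates. The zero proportion 0.201 is the estimate quoted above, while the remaining weight and the two rates are invented, and the function names are hypothetical.

```python
import math


def poisson_pmf(k, rate):
    return math.exp(k * math.log(rate) - rate - math.lgamma(k + 1))


def zero_inflated_mixture_pmf(k, p_zero, p_low, rate_low, rate_high):
    """P(Y = k) for a point mass at zero mixed with two Poisson components."""
    p_high = 1.0 - p_zero - p_low
    structural = p_zero if k == 0 else 0.0  # the fixed-zero group
    return (structural + p_low * poisson_pmf(k, rate_low)
            + p_high * poisson_pmf(k, rate_high))


# p_zero = 0.201 comes from the abstract; the other values are assumptions.
probs = [zero_inflated_mixture_pmf(k, 0.201, 0.5, 1.0, 6.0) for k in range(60)]
```

The probability of zero exceeds `p_zero` because both Poisson components also produce zeros, which is how the model absorbs the excess zero frequency.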

A Poisson Regression Analysis of Physician Visits (외래이용빈도 분석의 모형과 기법)

  • 이영조;한달선;배상수
    • Health Policy and Management / v.3 no.2 / pp.159-176 / 1993
  • The utilization of outpatient care services involves two sequential decisions. The first is whether to initiate utilization, and the second is how many more visits to make after the initiation. Presumably, the initiation decision is largely made by the patient and his or her family, while the number of additional visits is decided under the strong influence of the physician. The implication is that an analysis of outpatient care utilization must specify each of the two underlying decisions as a distinct stochastic process. This paper is concerned with the number of physician visits, which is, by definition, a discrete variable that can take only non-negative integer values. Since the initial visit is accounted for in the analysis of whether any physician visit was made, it suffices to focus on the number of visits made in addition to the initial one. The number of additional visits, being count data, could be assumed to follow a Poisson distribution. However, the distribution is likely to be overdispersed, since the number of physician visits tends to cluster around a few values but still varies widely. A recently reported study of outpatient care utilization employed an analysis based on the assumption of a negative binomial distribution, which is a type of overdispersed Poisson distribution. But there is an indication that using a Poisson distribution with adjustments for overdispersion results in less loss of efficiency in parameter estimation than using a specific distribution such as the negative binomial. An analysis of data on outpatient care utilization was performed, focusing on an assessment of the appropriateness of the available techniques. The data used in the analysis were collected by a community survey in Hwachon Gun, Kangwon Do in 1990. It was observed that a Poisson regression with adjustments for overdispersion is superior to either an ordinary regression or a Poisson regression without adjustments for overdispersion. In conclusion, it seems most appropriate to assume that the number of physician visits made in addition to the initial visit follows an overdispersed Poisson distribution when outpatient care utilization is studied with a model that embodies the two-part character of the decision process underlying the utilization.
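
One common way to adjust a Poisson fit for overdispersion is to scale the standard errors by a Pearson dispersion estimate. The abstract does not state which adjustment the authors used, so the sketch below is an assumption. It covers only an intercept-only model, and the visit counts and function names are invented.

```python
import math


def pearson_dispersion(counts):
    """Pearson chi-square over residual degrees of freedom, intercept-only fit."""
    n = len(counts)
    mean = sum(counts) / n
    chi_square = sum((y - mean) ** 2 / mean for y in counts)
    return chi_square / (n - 1)


def log_mean_standard_errors(counts):
    """Standard error of the log-mean: plain Poisson vs. dispersion-adjusted."""
    n = len(counts)
    mean = sum(counts) / n
    naive = math.sqrt(1.0 / (n * mean))
    adjusted = naive * math.sqrt(pearson_dispersion(counts))
    return naive, adjusted


# Invented visit counts that cluster at small values but have a long tail.
visits = [0, 0, 1, 1, 1, 2, 2, 3, 5, 15]
phi = pearson_dispersion(visits)
naive_se, adjusted_se = log_mean_standard_errors(visits)
```

A dispersion estimate well above 1, as with these made-up counts, indicates that the unadjusted Poisson standard error understates the uncertainty.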

Prediction of cyanobacteria population based on Poisson regression based on hydro-meteorological condition (수문기상 조건을 고려한 Poisson regression 기반의 Cyanobacteria 개체수 예측)

  • Cho, Hemie;Huong, Nguyen Thi;Moon, Jangwon;Kwon, Hyun-Han
    • Proceedings of the Korea Water Resources Association Conference / 2020.06a / pp.208-208 / 2020
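  • Global warming and changes in river environments are worsening water pollution and increasing damage such as algal blooms. In particular, climate change is increasing the variability of both temperature and rainfall, which makes managing river water environments more difficult. The construction of artificial structures in recent river development projects has made changes in river pollution an important issue, and accurate water quality forecasts are therefore required. Chlorophyll-a (Chl-a) is used as the main proxy variable in algal bloom assessment, but because Chl-a is an indicator found in both diatoms and cyanobacteria, judging the harmfulness of a bloom from the Chl-a water quality indicator alone has limitations. Previous studies have also shown that, in addition to Chl-a, flow rate, temperature, and nutrients affect water quality. However, existing physically based deterministic models are limited in reflecting the stochastic characteristics of water quality and in performing scenario-based analyses that consider various hydro-meteorological conditions. In this study, we therefore used hydro-meteorological data collected after the construction of a weir at a specific site to evaluate the hydro-meteorological factors related to harmful cyanobacteria cell counts, and developed a mid- to long-term algal bloom prediction model based on Bayesian Poisson regression that also provides uncertainty information for the analysis results.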

Semiparametric Bayesian Regression Model for Multiple Event Time Data

  • Kim, Yongdai
    • Journal of the Korean Statistical Society / v.31 no.4 / pp.509-518 / 2002
  • This paper is concerned with semiparametric Bayesian analysis of the proportional intensity regression model of the Poisson process for multiple event time data. A nonparametric prior distribution is put on the baseline cumulative intensity function, and a standard parametric prior distribution is given to the regression parameter. We also allow heterogeneity among the intensity processes of different subjects by using unobserved random frailty components. A Gibbs sampling approach with the Metropolis-Hastings algorithm is used to explore the posterior distributions. Finally, the results are applied to a real data set.

Semiparametric Kernel Poisson Regression for Longitudinal Count Data

  • Hwang, Chang-Ha;Shim, Joo-Yong
    • Communications for Statistical Applications and Methods / v.15 no.6 / pp.1003-1011 / 2008
  • Mixed-effect Poisson regression models are widely used for the analysis of correlated count data such as those found in longitudinal studies. In this paper, we consider kernel extensions with semiparametric fixed effects and parametric random effects. Estimation is through the penalized likelihood method based on the kernel trick, and our focus is on efficient computation and effective hyperparameter selection. Cross-validation techniques are employed to select the hyperparameters. Examples illustrating the usage and features of the proposed method are provided.

Marginal Likelihoods for Bayesian Poisson Regression Models

  • Kim, Hyun-Joong;Balgobin Nandram;Kim, Seong-Jun;Choi, Il-Su;Ahn, Yun-Kee;Kim, Chul-Eung
    • Communications for Statistical Applications and Methods / v.11 no.2 / pp.381-397 / 2004
  • The marginal likelihood has become an important tool for model selection in Bayesian analysis because it can be used to rank the models. We discuss the marginal likelihood for Poisson regression models that are potentially useful in small area estimation. Computation in these models is intensive and it requires an implementation of Markov chain Monte Carlo (MCMC) methods. Using importance sampling and multivariate density estimation, we demonstrate a computation of the marginal likelihood through an output analysis from an MCMC sampler.
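
To make the idea of estimating a marginal likelihood by importance sampling concrete, here is a toy sketch. It is not the authors' procedure: it uses a single Poisson rate with a conjugate Gamma prior, so that a closed form is available as a check, and a hand-picked Gamma proposal in place of a density estimated from MCMC output. The data, prior values, and function names are assumptions.

```python
import math
import random


def log_gamma_pdf(x, shape, rate):
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(x) - rate * x)


def log_likelihood(counts, rate):
    return sum(y * math.log(rate) - rate - math.lgamma(y + 1) for y in counts)


def exact_log_marginal(counts, a, b):
    """Closed form for the conjugate Gamma(a, b) prior, used as a check."""
    n, total = len(counts), sum(counts)
    return (a * math.log(b) - math.lgamma(a) + math.lgamma(a + total)
            - (a + total) * math.log(b + n)
            - sum(math.lgamma(y + 1) for y in counts))


def importance_log_marginal(counts, a, b, n_draws=20000, seed=0):
    """Estimate log m(y) = log E_q[prior * likelihood / q] by sampling from q."""
    rng = random.Random(seed)
    # Proposal: same mean as the posterior but wider (half the shape and rate).
    q_shape = 0.5 * (a + sum(counts))
    q_rate = 0.5 * (b + len(counts))
    log_weights = []
    for _ in range(n_draws):
        lam = rng.gammavariate(q_shape, 1.0 / q_rate)  # second argument is a scale
        log_weights.append(log_gamma_pdf(lam, a, b) + log_likelihood(counts, lam)
                           - log_gamma_pdf(lam, q_shape, q_rate))
    top = max(log_weights)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(w - top) for w in log_weights) / n_draws)


counts = [2, 0, 3, 1, 4, 2, 1, 3]  # invented data
estimate = importance_log_marginal(counts, a=2.0, b=1.0)
exact = exact_log_marginal(counts, a=2.0, b=1.0)
```

In a regression model no closed form exists, which is why the proposal would instead be built from sampler output, as the abstract describes.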

Models for forecasting food poisoning occurrences (식중독 발생 예측모형)

  • Yeo, In-Kwon
    • Journal of the Korean Data and Information Science Society / v.23 no.6 / pp.1117-1125 / 2012
  • The occurrence of food poisoning is usually modeled using meteorological variables such as temperature and humidity. In this paper, we investigate the relationship between food poisoning occurrence and climate variables in Korea and compare Poisson regression with the autoregressive moving average model to select a forecasting model. We confirm that lagged climate variables affect food poisoning occurrences. However, from the viewpoint of prediction, the number of previous occurrences turns out to be more influential on the current occurrence than the meteorological variables, and the Poisson regression model is less reliable.