• Title/Abstract/Keywords: negative binomial

297 search results (processing time: 0.023 s)

The Data-based Prediction of Police Calls Using Machine Learning

  • 최재훈
    • 한국빅데이터학회지 / Vol. 3, No. 2 / pp.101-112 / 2018
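  • This study developed models to predict the number of police calls using neural network analysis, a machine learning method, and negative binomial regression, based on 112 emergency call data received by the Chungnam Provincial Police Agency from June 2016 through May 2017. The models used variables that may affect the number of police calls: time, holiday, day before a holiday, season, temperature, precipitation, wind speed, jurisdiction area, population, number of foreign residents, proportion of detached houses, and proportion of other housing types. Depending on the variable, some showed a positive correlation and others a negative correlation with the number of police calls. Comparing the two methods, the neural network achieved a correlation coefficient of 0.7702 between predicted and actual values and an RMSE of 2.557, while negative binomial regression achieved a correlation coefficient of 0.7158 and an RMSE of 2.831. The neural network was thus less interpretable but more predictive than negative binomial regression. Police agencies are expected to be able to use the prediction models from this study as a basis for optimal deployment of police forces.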
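
The abstract above compares its two models by the correlation between predicted and actual values and by RMSE. A minimal sketch of how those two metrics are computed follows; the data and the names `actual`, `model_a`, and `model_b` are hypothetical and are not taken from the paper.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical hourly call counts and two sets of model predictions.
actual = [3, 0, 5, 2, 8, 1]
model_a = [2.5, 0.5, 4.0, 2.5, 6.5, 1.5]
model_b = [1.0, 2.0, 3.0, 4.0, 5.0, 3.0]

for name, predicted in (("model_a", model_a), ("model_b", model_b)):
    print(name, "r =", round(pearson_r(actual, predicted), 4),
          "RMSE =", round(rmse(actual, predicted), 4))
```

A higher correlation together with a lower RMSE is the pattern the abstract reports for the neural network relative to negative binomial regression.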

Extreme Value of Moving Average Processes with Negative Binomial Noise Distribution

  • Park, You-Sung
    • Journal of the Korean Statistical Society / Vol. 21, No. 2 / pp.167-177 / 1992
  • In this paper, we investigate the limiting distribution of $M_n = \max(X_1, X_2, \cdots, X_n)$ in the infinite moving average process $X_t = \sum c_i Z_{t-i}$ generated from i.i.d. negative binomial variables $Z_i$. While no limit result is possible, asymptotic bounds are nonetheless derived. We also present the tail behavior of $X_t$, i.e., of a weighted sum of i.i.d. random variables. This continues a study made by Rootzen (1986) for discrete innovation sequences.

Modelling Count Responses with Overdispersion

  • Jeong, Kwang Mo
    • Communications for Statistical Applications and Methods / Vol. 19, No. 6 / pp.761-770 / 2012
  • We frequently encounter count outcomes that have extra variation. This paper considers several alternative models for overdispersed count responses, such as a quasi-Poisson model, a zero-inflated Poisson model, and a negative binomial model, with a special focus on a generalized linear mixed model. We also explain various goodness-of-fit criteria, discussing when each is appropriate and cautioning against misuse depending on the pattern of response categories. The overdispersion models for count data are explained through two examples with different response patterns.
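
As a rough illustration of what "extra variation" means for counts, the sketch below compares the sample variance with the sample mean (a Poisson model implies they are equal) and derives a method-of-moments dispersion parameter under the NB2 variance function Var(Y) = mu + alpha * mu^2. The function name `dispersion_summary` and the data are hypothetical; this is not the paper's procedure, which focuses on model fitting and goodness-of-fit criteria.

```python
from statistics import mean, variance


def dispersion_summary(counts):
    """Summarize overdispersion in a sample of non-negative counts.

    Assumes the NB2 variance function Var(Y) = mu + alpha * mu**2 and
    estimates alpha by the method of moments (clipped at zero).
    """
    m = mean(counts)
    if m <= 0:
        raise ValueError("sample mean must be positive")
    v = variance(counts)  # unbiased sample variance (n - 1 denominator)
    return {
        "mean": m,
        "variance": v,
        "dispersion_index": v / m,            # about 1 under a Poisson model
        "nb2_alpha": max(0.0, (v - m) / m ** 2),
    }


# Hypothetical counts with many zeros and a few large values.
sample = [0, 0, 1, 0, 3, 7, 0, 2, 12, 0, 1, 5]
print(dispersion_summary(sample))
```

A dispersion index well above 1 suggests that a plain Poisson model understates the variance, which is the situation in which the quasi-Poisson, zero-inflated, and negative binomial alternatives named in the abstract are usually considered.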

Multivariate Modified Discrete Distributions

  • Lingappaiah, G.S.
    • Journal of the Korean Statistical Society / Vol. 15, No. 1 / pp.71-78 / 1986
  • This paper deals with a multivariate discrete distribution in which a set of r distinct counts is misreported as another set of r counts. First, the variance for the one-variable marginal case is expressed in the form of an inverted parabola. Next, for the multivariate negative binomial case, elements of the covariance matrix are evaluated with reference to asymptotic distributions. Finally, for the same multivariate negative binomial case, Bayesian estimates of the parameters and of the modification rates are provided.

Evaluating the Economic Damages to Anglers of the Marine Recreational Charter due to the Herbei Spirit Vessel Oil Spill

  • 표희동
    • Ocean and Polar Research / Vol. 36, No. 3 / pp.289-302 / 2014
  • This paper aims to evaluate the indirect economic damages to recreational charter anglers caused by marine pollution from the Hebei Spirit, which spilled 12,547 kl of crude oil in the Taean coastal area in December 2007. To evaluate the indirect cost to charter anglers, consumer surplus for charter fishing is estimated with the individual travel cost method (ITCM) using a Poisson model (PM), a negative binomial model (NBM), a truncated Poisson model (TPM), and a truncated negative binomial model (TNBM), which account for the characteristics of count data (non-negative discrete data). Because of the over-dispersion problem in PM and TPM, NBM and TNBM are considered statistically more appropriate. The estimated parameters, including income, fishing career, travel cost, and catch, are all statistically significant and theoretically valid. Based on the TNBM results, consumer surplus per trip per person was estimated at 277 thousand won, total consumer surplus per person per year at about 2.3 million won, and the marginal effect on consumer surplus of a percentage change in catch rate at about 33 thousand won. Aggregated over the number of anglers and the damage rate, the consumer surplus corresponds to total indirect economic damages of 125 billion won.

Estimating Consumer Surplus for Recreational Sea Fishing using Individual Travel Cost Method

  • 표희동;박철형;정진호
    • Ocean and Polar Research / Vol. 30, No. 2 / pp.141-148 / 2008
  • This paper aims to estimate consumer surplus for recreational sea fishing in the Tongyeong coastal area using the individual travel cost method. A Poisson model (PM), a negative binomial model (NBM), a truncated Poisson model (TPM), and a truncated negative binomial model (TNBM) are applied to account for the characteristics of count data (non-negative discrete data). The survey was conducted with 462 inshore anglers by personal interview in Tongyeong during July and October 2007. Respondents were asked how often they fish and about their travel costs, catch, income, and so on. Because of the over-dispersion problem in PM and TPM, NBM and TNBM were considered statistically more appropriate. All estimated parameters are statistically significant and theoretically valid. Based on the TNBM results, consumer surplus per trip was estimated at 183,486 won, total consumer surplus per person per year at 3,399,658 won, and the marginal effect on consumer surplus of a percentage change in catch rate at 185,372 won.

Mixed Effects Kernel Binomial Regression

  • Hwang, Chang-Ha
    • Journal of the Korean Data and Information Science Society / Vol. 19, No. 4 / pp.1327-1334 / 2008
  • Mixed effect binomial regression models are widely used for the analysis of correlated count data in which the response is the result of a series of trials, each with one of two possible disjoint outcomes. In this paper, we consider kernel extensions with nonparametric fixed effects and parametric random effects. Estimation is through the penalized likelihood method based on the kernel trick, and our focus is on efficient computation and effective hyperparameter selection. Cross-validation techniques are employed to select the hyperparameters. Examples illustrating the usage and features of the proposed method are provided.

An Analysis of the Control Limit in p-chart Applying Binomial Distribution Using Commercial Software

  • Yoo Wang-Jin;Park Won-Joo
    • 한국품질경영학회:학술대회논문집 / 한국품질경영학회, 1998, The 12th Asia Quality Management Symposium: Total Quality Management for Restoring Competitiveness / pp.198-207 / 1998
  • A p-chart based on the normal approximation makes it difficult to analyze the process condition precisely when a negative LCL occurs. Furthermore, the probability of a Type I error increases compared with using the original binomial distribution. The p-chart has long been used with the normal approximation because it is easy to apply. However, advances in computers and software have made binomial calculations fast and convenient, so it is strongly suggested that control limits be determined from the binomial distribution to reduce the probability of a Type I error. This study shows that control limits can be designed using the binomial distribution and applied without special software, by illustrating the steps for establishing a p-chart with commercial software (Excel).
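
To make the contrast in the abstract above concrete, the following sketch computes classical 3-sigma p-chart limits and limits taken from exact binomial tail probabilities. It is an illustration under stated assumptions, not the paper's Excel procedure: the function names, the tail probability `alpha=0.0027` (chosen to mirror 3-sigma limits), and the example values of `n` and `p_bar` are all hypothetical.

```python
from math import comb, sqrt


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_limits(n, p_bar):
    """Classical 3-sigma p-chart limits; the LCL can fall below zero."""
    half_width = 3 * sqrt(p_bar * (1 - p_bar) / n)
    return p_bar - half_width, p_bar + half_width


def binomial_limits(n, p_bar, alpha=0.0027):
    """Probability limits as defect counts (lcl_count, ucl_count).

    A sample signals when its count is below lcl_count or above ucl_count;
    each tail has probability at most alpha / 2 under Binomial(n, p_bar).
    """
    lcl_count = 0
    while binom_cdf(lcl_count, n, p_bar) <= alpha / 2:
        lcl_count += 1
    ucl_count = lcl_count
    while binom_cdf(ucl_count, n, p_bar) < 1 - alpha / 2:
        ucl_count += 1
    return lcl_count, ucl_count


# Hypothetical process: samples of 50 units, average fraction defective 2%.
n, p_bar = 50, 0.02
lcl, ucl = normal_limits(n, p_bar)
lcl_count, ucl_count = binomial_limits(n, p_bar)
print("normal approximation:", (lcl, ucl))
print("binomial (as proportions):", (lcl_count / n, ucl_count / n))
```

With a small `n * p_bar`, as in this example, the normal-approximation LCL is negative while the binomial limits stay within the range of possible counts, which is the difficulty the abstract describes.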

Risk Factors Influencing Probability and Severity of Elder Abuse in Community-dwelling Older Adults: Applying Zero-inflated Negative Binomial Modeling of Abuse Count Data

  • 장미희;박창기
    • 대한간호학회지 / Vol. 42, No. 6 / pp.819-832 / 2012
  • Purpose: This study was conducted to identify risk factors that influence the probability and severity of elder abuse in community-dwelling older adults. Methods: This study was a cross-sectional descriptive study. Self-report questionnaires were used to collect data from community-dwelling Koreans, 65 and older (N=416). Logistic regression, negative binomial regression and zero-inflated negative binomial regression model for abuse count data were utilized to determine risk factors for elder abuse. Results: The rate of older adults who experienced any one category of abuse was 32.5%. By zero-inflated negative binomial regression analysis, the experience of verbal-psychological abuse was associated with marital status and family support, while the experience of physical abuse was associated with self-esteem, perceived economic stress and family support. Family support was found to be a salient risk factor of probability of abuse in both verbal-psychological and physical abuse. Self-esteem was found to be a salient risk factor of probability and severity of abuse in physical abuse alone. Conclusion: The findings suggest that tailored prevention and intervention considering both types of elder abuse and target populations might be beneficial for preventative efficiency of elder abuse.

A Development of Traffic Accident Model by Random Parameter: Focus on Capital Area and Busan 4-legs Signalized Intersections

  • 이근희;노정현
    • 한국ITS학회 논문지 / Vol. 14, No. 6 / pp.91-99 / 2015
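  • This study built traffic accident prediction models for four-leg signalized intersections in Seoul, the capital area, and Busan Metropolitan City, considering factors such as road geometry, traffic characteristics, and environmental characteristics, and sought to identify how these factors relate to intersection accidents. The analysis showed that the random parameter negative binomial model had greater explanatory power than the conventional negative binomial model, and that 10 of the 52 variables significantly affected traffic accidents at urban four-leg signalized intersections: number of lanes on the major road, left-turn traffic volume on the major road, number of driving-restriction facilities on the major road, right-turn traffic volume on the minor road, intersection sight distance on the minor road, total number of signal phases at the intersection, presence of a median on the minor road, speed limit on the minor road, presence of a traffic island on the minor road, and number of speed-restriction facilities on the minor road. In addition, 2 of the 10 significant variables (intersection sight distance on the minor road and number of vehicle speed-restriction facilities on the minor road) were found to be random parameters.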