• Title/Summary/Keyword: negative binomial regression analysis (음이항회귀분석)

Fitting Distribution of Accident Frequency of Freeway Horizontal Curve Sections & Development of Negative Binomial Regression Models (고속도로 평면선형상 사고빈도분포 추정을 통한 음이항회귀모형 개발 (기하구조요인을 중심으로))

  • 강민욱;도철웅;손봉수
    • Journal of Korean Society of Transportation / v.20 no.7 / pp.197-204 / 2002
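  • To predict and prevent traffic accidents, it is reasonable to identify how accidents relate to the road geometric elements that can actually be controlled during the road design process. That is, before construction, road designers should use field data to establish the relationship between geometric elements and accidents accurately and reflect it in the design. To this end, identifying the frequency distribution of traffic accidents is the most basic task and must precede the development of accident prediction models. Traffic accident counts are generally known to exhibit overdispersion, with a variance greater than the mean, and therefore to follow a negative binomial distribution. Accordingly, before developing an accident model, this paper examines the distribution of traffic accidents on horizontal curve sections through goodness-of-fit tests, focusing on the Honam Expressway, for which the road design elements at accident locations and other potential accident-related factors are relatively well documented. Accident data from the Korea Highway Corporation (한국도로공사) for the Honam Expressway over five years (1996 to 2000) were organized for the analysis, and accidents on reverse-curve sections and single-curve sections were analyzed using the segmentation method for horizontal alignment proposed by Kang and Son (2002). As expected, the goodness-of-fit analysis indicated that the negative binomial distribution is the most suitable probability distribution for describing accident counts, and a negative binomial regression model was then developed using maximum likelihood estimation. The negative binomial regression model with the segmentation method applied had a higher coefficient of determination than existing probabilistic regression models, and the geometric elements used in the model include vehicle exposure, curve radius, and the change in superelevation per unit distance.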
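
The overdispersion property mentioned in this abstract (variance greater than the mean) can be checked on raw counts before any model is fitted. The sketch below is only an illustration, not the paper's procedure: the counts are hypothetical, and the dispersion parameter `alpha` is a rough method-of-moments estimate under the assumed NB2 variance function Var(Y) = mu + alpha * mu^2, whereas the paper uses goodness-of-fit tests and maximum likelihood estimation.

```python
from statistics import mean, variance


def dispersion_summary(counts):
    """Return (mean, variance, variance-to-mean ratio, moment estimate of alpha)."""
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    ratio = v / m         # close to 1 for Poisson-like data, above 1 if overdispersed
    # Assumed NB2 variance function: Var(Y) = mu + alpha * mu**2,
    # so alpha = (Var - mu) / mu**2, floored at zero.
    alpha = max(0.0, (v - m) / m ** 2)
    return m, v, ratio, alpha


# Hypothetical accident counts per curve section (illustrative only, not the paper's data).
counts = [0, 0, 1, 0, 3, 0, 7, 2, 0, 1, 0, 12]
m, v, ratio, alpha = dispersion_summary(counts)
print(f"mean={m:.2f} variance={v:.2f} ratio={ratio:.2f} alpha={alpha:.2f}")
```

A ratio well above 1 suggests that a Poisson model would understate the variability of the counts, which is the usual motivation for moving to a negative binomial model.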

The Data-based Prediction of Police Calls Using Machine Learning (기계학습을 활용한 데이터 기반 경찰신고건수 예측)

  • Choi, Jaehun
    • The Journal of Bigdata / v.3 no.2 / pp.101-112 / 2018
  • The purpose of this study is to predict the number of police calls using a neural network, a machine learning method, and negative binomial regression, based on data on 112 police emergency calls received by the Chungnam Provincial Police Agency from June 2016 to May 2017. The following variables, which may affect police calls, were selected for developing the prediction model: time, holiday, the day before a holiday, season, temperature, precipitation, wind speed, jurisdictional area, population, the number of foreigners, single house rate, and other house rate. Some variables show a positive correlation and others a negative one. The comparison of the methods can be summarized as follows. The neural network has a correlation coefficient of 0.7702 between predicted and actual values, with an RMSE of 2.557. Negative binomial regression, on the other hand, shows a correlation coefficient of 0.7158 with an RMSE of 2.831. The neural network has low interpretability but better predictive performance than negative binomial regression. Based on the prediction model, the police agency can allocate manpower optimally for given values of the selected variables.
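
The two comparison metrics reported in this abstract, RMSE and the correlation coefficient between predicted and actual values, can be computed as in the following sketch. It assumes the correlation is the Pearson coefficient (the abstract does not say which one is used), and the sample values are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical hourly call counts and model predictions (illustrative only).
actual = [3, 0, 5, 2, 8, 1]
predicted = [2.5, 0.8, 4.1, 2.9, 6.5, 1.2]
print(f"RMSE={rmse(actual, predicted):.3f} r={pearson_r(actual, predicted):.3f}")
```

A lower RMSE and a higher correlation both point to a better fit, which is how the abstract ranks the neural network above negative binomial regression.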

A Bayesian zero-inflated negative binomial regression model based on Pólya-Gamma latent variables with an application to pharmaceutical data (폴랴-감마 잠재변수에 기반한 베이지안 영과잉 음이항 회귀모형: 약학 자료에의 응용)

  • Seo, Gi Tae;Hwang, Beom Seuk
    • The Korean Journal of Applied Statistics / v.35 no.2 / pp.311-325 / 2022
  • For count responses, excess zeros often occur in various research fields. The zero-inflated model is a common choice for modeling such count data. Bayesian inference for the zero-inflated model has long been recognized as a hard problem because the conditional posterior distribution is not available in closed form. Recently, however, Pillow and Scott (2012) and Polson et al. (2013) proposed a Pólya-Gamma data-augmentation strategy for logistic and negative binomial models, facilitating Bayesian inference for the zero-inflated model. We apply a Bayesian zero-inflated negative binomial regression model to longitudinal pharmaceutical data previously analyzed by Min and Agresti (2005). To facilitate posterior sampling for the longitudinal zero-inflated model, we use the Pólya-Gamma data-augmentation strategy.

Forecasting hierarchical time series for foodborne disease outbreaks (식중독 발생 건수에 대한 계층 시계열 예측)

  • In-Kwon Yeo
    • The Korean Journal of Applied Statistics / v.37 no.4 / pp.499-508 / 2024
  • In this paper, we investigate hierarchical time series forecasting, in which predicted values derived from both segmented and aggregated data must adhere to a hierarchical structure. The occurrences of food poisoning by a specific pathogen are analyzed using zero-inflated Poisson regression models and negative binomial regression models. The occurrences of major, miscellaneous, and overall food poisoning are analyzed using Poisson regression models and negative binomial regression models. For hierarchical time series forecasting, the MinT estimation proposed by Wickramasuriya et al. (2019) is employed. Negative predicted values resulting from hierarchical adjustments are set to zero, and the remaining lowest-level variables are multiplied by weights so that the hierarchical structure is satisfied. Empirical analysis revealed little difference between hierarchical and non-hierarchical adjustments in predictions by pathogen. However, hierarchical adjustments generally yield superior results for predictions of major, miscellaneous, and overall occurrences. Without hierarchical adjustment, the predicted frequencies of the lowest-level variables may exceed those of major or miscellaneous occurrences. The proposed method, in contrast, yields predictions that adhere to the hierarchical structure.
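
The non-negativity step described in this abstract (negative reconciled values set to zero, the rest re-weighted) might look like the sketch below. The abstract does not specify the weights, so the single proportional weight used here is an assumption, as are the function name `clip_and_rescale` and the sample numbers. The MinT reconciliation itself is not reproduced.

```python
def clip_and_rescale(bottom, parent_total):
    """Zero out negative lowest-level forecasts, then rescale the rest.

    Assumption: every remaining value is multiplied by the same weight,
    chosen so that the lowest-level forecasts again sum to parent_total.
    """
    clipped = [max(0.0, b) for b in bottom]
    total = sum(clipped)
    if total == 0:
        return clipped  # nothing positive left to rescale
    weight = parent_total / total
    return [weight * b for b in clipped]


# Hypothetical reconciled forecasts for three pathogens under one parent category.
bottom = [-1.0, 2.0, 3.0]   # sums to the parent forecast 4.0, but one value is negative
adjusted = clip_and_rescale(bottom, parent_total=4.0)
print(adjusted, sum(adjusted))
```

After the adjustment the lowest-level forecasts are non-negative and still add up to the parent forecast, so the hierarchy remains coherent.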

A Study on the Influence of the Space Syntax and the Urban Characteristics on the Incidence of Crime Using Negative Binomial Regression (음이항 회귀모형을 이용한 공간구문론 및 도시특성요소가 범죄발생에 미치는 영향 연구)

  • Kim, Hyeong Jun;Choi, Yeol
    • KSCE Journal of Civil and Environmental Engineering Research / v.36 no.2 / pp.333-340 / 2016
  • The aim of this study is to understand the characteristics of crime in Busan through an empirical analysis, based on space syntax, of the factors that determine crime occurrence. Poisson regression and negative binomial regression were used for the analysis. Of the 13 variables in total, 8 were significant. The results can be summarized as follows. Among the urban characteristics, the statistically significant variables are the female ratio, the ratio of the population over 65, administrative area, and the commercial area ratio. In addition, the more CCTVs a region has, the lower its crime rate. An examination of whether space syntax variables can predict where crime occurs shows that spaces with low connectivity become a causal factor in crime: they have few related spaces and therefore a low possibility of interrupters suddenly appearing, which results in low levels of surveillance by pedestrians. The study provides basic data that can contribute to urban planning and to the implementation of crime prevention measures.

The study on the determinants of the number of job changes (중소기업 청년인턴 이직횟수 결정요인 분석)

  • Park, Sungik;Ryu, Jangsoo;Kim, Jonghan;Cho, Jangsik
    • Journal of the Korean Data and Information Science Society / v.26 no.2 / pp.387-397 / 2015
  • In this paper, the determinants of the number of job changes in the SMEs (small and medium enterprises) youth-intern project are analyzed using the SMEs youth-intern DB and the employment insurance DB. Since the number of job changes is count data taking non-negative integer values, ordinary linear regression analysis is inappropriate. Therefore, four models are fitted to the count data: the Poisson regression model, the zero-inflated Poisson regression model, the negative binomial regression model, and the zero-inflated negative binomial regression model. The zero-inflated negative binomial regression model is selected as the best model. The major results are as follows. First, the number of job changes is significantly smaller in the treatment group than in the control group. Second, the number of job changes is significantly smaller in the young-age group than in the old-age group. Third, the number of job changes of men is significantly greater than that of women. Lastly, the number of job changes in bigger firms is significantly smaller than in smaller firms.
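
The four count models named in this abstract differ only in their probability mass functions. The sketch below writes them out without covariates, as a minimal illustration rather than the paper's fitted models. It assumes the NB2 parameterization (mean `mu`, dispersion `alpha`, variance mu + alpha * mu^2) and a constant zero-inflation probability `pi`; the helper names are hypothetical.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability of count y with mean lam."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))


def negbin_pmf(y, mu, alpha):
    """NB2 probability of count y with mean mu and dispersion alpha (assumed form)."""
    r = 1.0 / alpha
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))
    return math.exp(log_p)


def zero_inflated(pmf, pi):
    """Wrap a count pmf with an extra point mass pi at zero."""
    def zi_pmf(y, *params):
        base = (1.0 - pi) * pmf(y, *params)
        return pi + base if y == 0 else base
    return zi_pmf


# Hypothetical parameters: 30% structural zeros, mean 2 job changes otherwise.
zip_pmf = zero_inflated(poisson_pmf, 0.3)
zinb_pmf = zero_inflated(negbin_pmf, 0.3)
for y in range(4):
    print(y, round(poisson_pmf(y, 2.0), 4), round(zip_pmf(y, 2.0), 4),
          round(negbin_pmf(y, 2.0, 0.5), 4), round(zinb_pmf(y, 2.0, 0.5), 4))
```

The zero-inflated variants put more probability on zero than their base distributions, which is why they tend to fit better when many individuals never change jobs.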

Rear-end Accident Models of Rural Area Signalized Intersections in the Cases of Cheongju and Cheongwon (청주.청원 지방부 신호교차로의 후미추돌 사고모형)

  • Park, Byoung-Ho;In, Byung-Chul
    • International Journal of Highway Engineering / v.11 no.2 / pp.151-158 / 2009
  • This study deals with rear-end collisions in rural areas. The objectives are 1) to analyze the characteristics of rear-end accidents at signalized intersections, and 2) to develop accident models for Cheongju-Cheongwon. In pursuing these objectives, the study gives particular attention to comparing the characteristics of urban and rural areas. The dependent variables are the number of accidents and the EPDO (equivalent property damage only) value, and the independent variables are traffic volumes and geometric elements. The main results are as follows. First, the statistical analyses show that the Poisson accident model using the number of accidents as the dependent variable and the negative binomial accident model using the EPDO value are both statistically significant. Second, the independent variables of the Poisson model are the ratio of high-occupancy vehicles, total traffic volume, and the sum of exit/entry, while those of the negative binomial regression are the main road width, total traffic volume, and the ratio of high-occupancy vehicles. Finally, the independent variables specific to the rural area are the main road width, the ratio of high-occupancy vehicles, and the sum of exit/entry.

Accident Models of Rotary by Vehicle Type (차량유형별 로터리 사고모형)

  • Han, Su-San;Park, Byeong-Ho
    • Journal of Korean Society of Transportation / v.29 no.6 / pp.67-74 / 2011
  • This study uses traffic accident data from Korean rotaries (circular intersections) to examine how accident characteristics differ by vehicle type. The data were categorized into three groups by vehicle type, and a set of accident models was developed. Statistical analysis yielded two ZIP (zero-inflated Poisson) models and one negative binomial model for the three vehicle types: automobile, truck and van, and others. The differences among the models were then compared statistically.

Traffic Accident Models of Cheongju Four-Legged Signalized Intersections by Accident Type (사고유형에 따른 청주시 4지 신호교차로 교통사고모형)

  • Park, Byung-Ho;Han, Sang-Wook;Kim, Tae-Young;Kim, Won-Ho
    • Journal of Korean Society of Transportation / v.26 no.5 / pp.153-162 / 2008
  • This study deals with traffic accidents at four-legged signalized intersections in Cheongju. The purpose is to comparatively analyze accident characteristics and models by accident type using data from 143 intersections. To this end, the study places particular emphasis on modeling head-on collisions, rear-end collisions, sideswipes, right-angle side collisions, and other accidents. The main results are as follows. First, the overdispersion tests show that negative binomial regression models are appropriate for the traffic accident data in these contexts. Second, five accident models are developed, all of which are statistically significant. Finally, the models are comparatively evaluated using the common variable (ADT) and type-specific variables.

A Study on Impact of Factors Influencing Maritime Freight Rates Using Poisson and Negative Binomial Regression Analysis on Blank Sailings of Shipping Companies (포아송 및 음이항 회귀분석을 이용한 해상운임 결정요인이 해운선사의 블랭크 세일링에 미치는 영향 분석 연구)

  • Won-Hyeong Ryu;Hyung-Sik Nam
    • Journal of Navigation and Port Research / v.48 no.1 / pp.62-77 / 2024
  • In the maritime shipping industry, the imbalance between supply and demand has persistently increased, leading major shipping companies worldwide to use blank sailings as a key means of flexibly adjusting vessel capacity in response to shipping market conditions. Traditionally, blank sailings have been implemented frequently around the Chinese New Year period. However, owing to unusual circumstances such as the global pandemic that began in 2020 and trade tensions between the United States and China, shipping companies have recently conducted blank sailings on a larger scale than in the past. Because blank sailings directly cause freight transport delays, they can have negative repercussions for both businesses and consumers. Therefore, this study employed Poisson regression models and negative binomial regression models to analyze the influence of maritime freight rate determinants on shipping companies' blank sailing decisions, with the aim of proactively addressing potential consequences. In the Poisson regression analysis, the significant variables for 2M were global container shipping volume, container vessel capacity, container ship scrapping volume, the container ship newbuilding index, and OECD inflation. In the negative binomial regression analysis, the significant variables were global container shipping volume and container ship order volume for Ocean Alliance; container ship capacity and interest rates for THE Alliance; international oil prices, the global supply chain pressure index, container ship capacity, and OECD inflation for non-alliance carriers; and container ship capacity and interest rates for the alliances in total.