• Title/Summary/Keyword: binomial data


Pedestrian Accident Models of Roundabout Near Schools by the Number of Entry and Circulatory Lane (회전 및 진입 차로 수에 따른 학교와 인접한 회전교차로 보행자 사고모형)

  • Son, Seul Ki;Park, Byung Ho
    • Journal of the Korean Society of Safety / v.32 no.5 / pp.135-140 / 2017
  • This study deals with roundabout safety. Its purpose is to analyze the factors affecting pedestrian accidents at roundabouts near schools, with particular attention to comparing pedestrian accidents by the number of entry and circulatory lanes. Traffic accident data from 2013 to 2015 were collected from the TAAS data set of the Road Traffic Authority. Poisson and negative binomial models were used to develop the pedestrian accident models, with the number of pedestrian accidents as the dependent variable and 24 independent variables covering geometry, traffic volume, and other factors. The main results are as follows. First, three Poisson and two negative binomial models (${\rho}^2$ of 0.153~0.426), all statistically significant, were developed. Second, the variable common to the models grouped by number of circulatory roadway lanes was the entry lane width, and the variable common to the models grouped by number of entry lanes was the design speed. The model-specific variables were the splitter island, roundabout sign, number of approach roads, bus stop, and elementary school. Finally, the design speed may be expected to reduce the number of pedestrian accidents near schools.

Fit of the number of insurance solicitor's turnovers using zero-inflated negative binomial regression (영과잉 음이항회귀 모형을 이용한 보험설계사들의 이직횟수 적합)

  • Chun, Heuiju
    • Journal of the Korean Data and Information Science Society / v.28 no.5 / pp.1087-1097 / 2017
  • This study aims to find the best model for the number of turnovers of insurance solicitors at life insurance companies, using count data regression models: Poisson regression, negative binomial regression, zero-inflated Poisson regression, and zero-inflated negative binomial regression. Of the four models, the zero-inflated negative binomial model was selected based on the AIC and SBC criteria, owing to over-dispersion and a high proportion of zero counts. The significant factors affecting insurance solicitors' turnover were found to be the work period at the current company, the total work period as a financial planner, the affiliated corporation, and channel management satisfaction. We also found that as job satisfaction or channel management satisfaction decreases, the number of turnovers increases. In addition, the total work period as a financial planner has a positive relationship with the number of turnovers, whereas the work period at the current company has a negative relationship with it.
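
The two symptoms named in this abstract, over-dispersion and excess zeros, can be checked before any model is fitted. The following is a minimal sketch, not the author's procedure: the function name `count_diagnostics` and the toy counts are hypothetical, and the data are made up purely for illustration.

```python
from math import exp
from statistics import mean, variance


def count_diagnostics(counts):
    """Summarise a count sample against what a Poisson model would imply.

    Under a Poisson model the variance equals the mean (dispersion ratio 1)
    and the expected share of zeros is exp(-mean).
    """
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    return {
        "mean": m,
        "dispersion": v / m,                       # > 1 suggests over-dispersion
        "zero_observed": counts.count(0) / len(counts),
        "zero_poisson": exp(-m),                   # zero share a Poisson fit implies
    }


# Hypothetical toy sample, NOT data from the study.
toy_counts = [0, 0, 0, 0, 0, 0, 1, 2, 5, 8]
for name, value in count_diagnostics(toy_counts).items():
    print(f"{name}: {value:.3f}")
```

A dispersion ratio well above 1 points toward a negative binomial model, and an observed zero share well above the Poisson-implied share points toward a zero-inflated variant; the final choice would still rest on criteria such as AIC, as in the abstract.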

A Fast Transmission of Mobile Agents Using Binomial Trees (바이노미얼 트리를 이용한 이동 에이전트의 빠른 전송)

  • Cho, Soo-Hyun;Kim, Young-Hak
    • The KIPS Transactions:PartA / v.9A no.3 / pp.341-350 / 2002
  • As network environments have improved and Internet use has increased, mobile agent technologies have come to be widely used in information retrieval, network management, electronic commerce, and parallel/distributed processing. Recently, many researchers have studied parallel/distributed processing based on mobile agents. SPMD is a parallel processing method that transmits a program to all the computers participating in a parallel environment and performs the work on different data. Transmitting the program quickly to all the computers is therefore an important factor in reducing total execution time. In this paper, we consider a parallel environment consisting of a mobile agent system and propose a new method that uses binomial trees to transmit mobile agent code quickly to all the computers, so that SPMD parallel processing can be performed efficiently. The proposed method is compared with other methods through experimental evaluation on IBM's Aglets and shows considerably better performance. This paper also deals with tolerance of faults that can occur while transmitting a mobile agent using binomial trees.
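
The abstract does not spell out the transmission algorithm, so the following is only a sketch of the textbook binomial-tree broadcast pattern, in which every host that already holds the code forwards it to one new host per round. The function name and the host numbering (host 0 as the origin) are assumptions for illustration, not the paper's Aglets implementation.

```python
def binomial_broadcast_schedule(n_hosts):
    """Return the send schedule of a binomial-tree broadcast from host 0.

    The result is a list of rounds; each round is a list of
    (sender, receiver) pairs that can run in parallel.
    """
    rounds = []
    have = 1  # hosts 0 .. have-1 already hold the agent code
    while have < n_hosts:
        # Every holder s forwards to host s + have, if that host exists.
        sends = [(s, s + have) for s in range(have) if s + have < n_hosts]
        rounds.append(sends)
        have = min(2 * have, n_hosts)
    return rounds


for r, sends in enumerate(binomial_broadcast_schedule(8), start=1):
    print(f"round {r}: {sends}")
```

Because the number of holders doubles each round, reaching n hosts takes about log2(n) rounds, compared with n - 1 sequential sends when a single origin transmits to every host itself.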

A simulation study for the approximate confidence intervals of hypergeometric parameter by using actual coverage probability (실제포함확률을 이용한 초기하분포 모수의 근사신뢰구간 추정에 관한 모의실험 연구)

  • Kim, Dae-Hak
    • Journal of the Korean Data and Information Science Society / v.22 no.6 / pp.1175-1182 / 2011
  • In this paper, properties of the exact confidence interval and some approximate confidence intervals for the hypergeometric parameter, that is, the probability of success p in the population, are discussed. The binomial distribution is a well-known and widely used discrete distribution. The hypergeometric distribution frequently replaces the binomial distribution when it is desirable to allow for the finiteness of the population size. For example, an application of the hypergeometric distribution arises in describing a probability model for the number of children attacked by an infectious disease when a fixed number of them are exposed to it. Exact confidence interval estimation of the hypergeometric parameter is reviewed. We consider approximations of the hypergeometric distribution by the binomial and normal distributions, respectively. Approximate confidence intervals based on these approximations are also discussed. The performance of the exact and approximate confidence intervals for the hypergeometric parameter is compared in terms of actual coverage probability by small-sample Monte Carlo simulation.
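
To make "actual coverage probability" concrete, the sketch below computes it for one approximate interval. Two assumptions are mine, not the paper's: the interval is the normal (Wald) interval with a finite population correction, and the coverage is obtained by exact enumeration over all outcomes rather than by Monte Carlo simulation. The symbols N (population size), M (successes in the population), and n (sample size) are illustrative names.

```python
from math import comb, sqrt


def hypergeom_pmf(x, N, M, n):
    """P(X = x) when n items are drawn without replacement from N, M of them successes."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)


def wald_interval(x, N, n, z=1.96):
    """Normal-approximation interval for p = M/N with finite population correction."""
    p_hat = x / n
    fpc = (N - n) / (N - 1)
    half = z * sqrt(p_hat * (1 - p_hat) / n * fpc)
    return p_hat - half, p_hat + half


def actual_coverage(N, M, n, z=1.96):
    """Probability that the interval contains the true p, summed over all outcomes x."""
    p = M / N
    covered = 0.0
    for x in range(max(0, n - (N - M)), min(n, M) + 1):
        lo, hi = wald_interval(x, N, n, z)
        if lo <= p <= hi:
            covered += hypergeom_pmf(x, N, M, n)
    return covered


print(actual_coverage(N=100, M=30, n=20))
```

Comparing the returned value with the nominal level (0.95 for z = 1.96) shows how far an approximate interval falls short of, or exceeds, its stated confidence for a given N, M, and n.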

The Data-based Prediction of Police Calls Using Machine Learning (기계학습을 활용한 데이터 기반 경찰신고건수 예측)

  • Choi, Jaehun
    • The Journal of Bigdata / v.3 no.2 / pp.101-112 / 2018
  • The purpose of this study is to predict the number of police calls using a neural network, which is a machine learning method, and negative binomial regression, based on data on 112 police calls received by the Chungnam Provincial Police Agency from June 2016 to May 2017. The following variables, which may affect police calls, were selected for developing the prediction model: time, holiday, the day before a holiday, season, temperature, precipitation, wind speed, jurisdictional area, population, number of foreigners, single house rate, and other house rate. Some variables show a positive correlation and others a negative one. The comparison of the methods can be summarized as follows. The neural network yields a correlation coefficient of 0.7702 between predicted and actual values, with an RMSE of 2.557. Negative binomial regression, on the other hand, yields a correlation coefficient of 0.7158, with an RMSE of 2.831. The neural network has low interpretability but better predictive performance than the negative binomial regression. Based on the prediction model, the police agency can allocate manpower optimally for given values of the selected variables.
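
The two comparison metrics quoted in this abstract, RMSE and the correlation coefficient between predicted and actual values, can be computed as in the sketch below. It assumes the correlation meant is Pearson's r; the function names are illustrative and no data from the study are used.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    n = len(actual)
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)
```

A lower RMSE and a higher correlation both indicate predictions closer to the observed counts, which is the sense in which the abstract ranks the neural network above the negative binomial regression.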

Comparative Simulation Studies on Generalized Binomial Models (일반화 이항모형의 적합도 평가)

  • Baik, E.J.;Kim, K.Y.
    • Communications for Statistical Applications and Methods / v.18 no.4 / pp.507-516 / 2011
  • Comparative studies on generalized binomial models (Moon, 2003; Ng, 1989; Paul, 1985; Kupper and Haseman, 1978; Griffiths, 1973) are restrictive in that the models compared are rather limited and the MSE of the estimates is the only measure of model adequacy considered. This paper reports simulation results that provide possible guidelines for selecting a proper model. We examine the ratio of a Pearson-type goodness-of-fit statistic to its degrees of freedom, together with AIC, as measures of overall model quality. The MSE and bias of the individual estimates are also considered as component fit measures. The performance of some models varies widely over certain ranges of the parameter space, while most of the models are quite competent. Our evaluation shows that the Extended Beta-Binomial model (Prentice, 1986) is particularly favorable in that it provides a consistently excellent fit over almost all values of the intra-class correlation coefficient and the probability of success.

Cost Performance Evaluation Framework through Analysis of Unstructured Construction Supervision Documents using Binomial Logistic Regression (비정형 공사감리문서 정보와 이항 로지스틱 회귀분석을 이용한 건축 현장 비용성과 평가 프레임워크 개발)

  • Kim, Chang-Won;Song, Taegeun;Lee, Kiseok;Yoo, Wi Sung
    • Journal of the Korea Institute of Building Construction / v.24 no.1 / pp.121-131 / 2024
  • This research explores the potential of leveraging unstructured data from construction supervision documents, which contain detailed inspection insights from independent third-party monitors of building construction processes. With the evolution of analytical methodologies, such unstructured data has been recognized as a valuable source of information, offering diverse insights. The study introduces a framework designed to assess cost performance by applying advanced analytical methods to the unstructured data found in final construction supervision reports. Specifically, key phrases were identified using text mining and social network analysis techniques, and these phrases were then analyzed through binomial logistic regression to assess cost performance. The study found that predictions of cost performance based on unstructured data from supervision documents achieved an accuracy rate of approximately 73%. The findings of this research are anticipated to serve as a foundational resource for analyzing various forms of unstructured data generated within the construction sector in future projects.

Effects on Regression Estimates under Misspecified Generalized Linear Mixed Models for Counts Data

  • Jeong, Kwang Mo
    • The Korean Journal of Applied Statistics / v.25 no.6 / pp.1037-1047 / 2012
  • The generalized linear mixed model (GLMM) is widely used in fitting categorical responses of clustered data. In the numerical approximation of the likelihood function, normality is assumed for the random effects distribution; consequently, commercial statistical packages also routinely fit GLMMs under this normality assumption. We may also encounter departures from the distributional assumption on the response variable. It would be interesting to investigate the impact of distributional misspecification on the parameter estimates; however, there has been limited research on these topics. We study the sensitivity or robustness of the maximum likelihood estimators (MLEs) of the GLMM for count data when the true underlying distribution is normal, gamma, exponential, or a mixture of two normal distributions. We also consider the effects on the MLEs when we fit a Poisson-normal GLMM while the outcomes are generated from a negative binomial distribution with overdispersion. Through a small-scale Monte Carlo study, we check the empirical coverage probabilities of the parameters and the biases of the MLEs of the GLMM.

Evaluation for usefulness of Chukwookee Data in Rainfall Frequency Analysis (강우빈도해석에서의 측우기자료의 유용성 평가)

  • Kim, Kee-Wook;Yoo, Chul-Sang;Park, Min-Kyu;Kim, Dae-Ha;Park, Sangh-Young;Kim, Hyeon-Jun
    • Proceedings of the Korea Water Resources Association Conference / 2007.05a / pp.1526-1530 / 2007
  • In this study, the chukwookee data were evaluated by applying them to historical rainfall frequency analysis. To derive a two-parameter log-normal distribution using historical and modern data, censored-data MLE and binomial censored-data MLE were applied. As a result, we found that both the mean and the standard deviation were estimated to be smaller with the chukwookee data than with the modern data alone. This indicates that rather large events occurred more rarely during the period of the chukwookee data than during the modern period. The frequency analysis results using the estimated parameters were also similar to those expected. Notably, the rainfall quantiles estimated by both methods were similar, especially for the 99% threshold. This result indicates that historical document records such as the annals of the Chosun dynasty could be valuable and effective for frequency analysis. It also means an extension of the data available for frequency analysis.


Logistic regression model for major separation rate

  • Choi, Jae-Sung
    • Journal of the Korean Data and Information Science Society / v.13 no.2 / pp.129-138 / 2002
  • This paper deals with logistic regression models for analysing separation rates from majors. The model building procedure shows how to incorporate the effects of factors arising from a three-way nested sampling scheme and discusses what types of characteristics should be considered as independent variables directly affecting the rates.
