• Title/Abstract/Keywords: binomial data

Count Data Model for the Estimation of Bus Ridership (Focusing on Commuters and Students in Seoul)

  • 문진수;김순관;임강원
    • 대한교통학회지 / Vol. 17, No. 5 / pp. 123-135 / 1999
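  • The rapid increase in private cars, driven by a preference for personal transport, is a major factor aggravating traffic congestion in Seoul. Relieving this congestion requires a transit-oriented transportation system and policies that promote public transit and attract car users to it. With this in mind, this study uses a disaggregate behavioral model of the number of bus trips made by commuters and students who use the bus to identify the factors that affect bus use, and thereby to draw policy implications for attracting car users to public transit. The aim of the study is to apply a count data model suited to estimating the number of bus trips made in a week. Few studies in Korea have used count data models, and although it is important to test for overdispersion and specify a model that fits the data, the commonly used Poisson regression model has mostly been applied without an appropriate test. In this study, a statistical test for overdispersion carried out before choosing the count data model indicated that the negative binomial regression model fits the data. To examine the importance of model specification, the estimates from the negative binomial regression model were compared with those from the Poisson regression model.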
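
The overdispersion check described in this abstract can be illustrated with a simple index-of-dispersion calculation. The abstract does not say which test the authors used, so the sketch below is only one common diagnostic, and the trip counts are hypothetical values invented for illustration. A variance-to-mean ratio well above 1 points away from the Poisson model and toward a negative binomial specification.

```python
def dispersion_index(counts):
    """Return (mean, sample variance, variance-to-mean ratio) for count data."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return mean, var, var / mean


def dispersion_statistic(counts):
    """Index-of-dispersion statistic (n - 1) * s^2 / mean.

    Under a Poisson model with a common mean, this statistic is
    approximately chi-square distributed with n - 1 degrees of freedom.
    """
    _, _, ratio = dispersion_index(counts)
    return (len(counts) - 1) * ratio


# Hypothetical weekly bus-trip counts (illustrative only, not the study's data).
weekly_trips = [0, 0, 1, 2, 2, 3, 5, 5, 6, 10, 10, 12, 14, 0, 1]
mean, var, ratio = dispersion_index(weekly_trips)
stat = dispersion_statistic(weekly_trips)
print(f"mean={mean:.2f} variance={var:.2f} ratio={ratio:.2f} statistic={stat:.1f}")
```

The statistic would be compared with a chi-square critical value for n - 1 degrees of freedom. A regression-based test would be needed once covariates enter the model.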

Application of discrete Weibull regression model with multiple imputation

  • Yoo, Hanna
    • Communications for Statistical Applications and Methods / Vol. 26, No. 3 / pp. 325-336 / 2019
  • In this article we extend the discrete Weibull regression model to the presence of missing data. Discrete Weibull regression models can be adapted to various types of dispersion data; however, they are not widely used. Recently Yoo (Journal of the Korean Data and Information Science Society, 30, 11-22, 2019) adapted the discrete Weibull regression model using single imputation. We extend that study by using multiple imputation under a variety of settings and compare the results. The purpose of this study is to address the merit of using multiple imputation in the presence of missing data in discrete count data. We analyzed the seventh Korean National Health and Nutrition Examination Survey (KNHANES VII) from 2016 to assess the factors influencing the variable, 1 month hospital stay, and we compared the results of the discrete Weibull regression model with those of the Poisson, negative binomial and zero-inflated Poisson regression models, which are widely used in count data analyses. The results showed that the discrete Weibull regression model using multiple imputation provided the best fit. We also performed simulation studies to show the accuracy of the discrete Weibull regression using multiple imputation under both under- and over-dispersed distributions, as well as varying missing rates and sample sizes. Sensitivity analysis showed the influence of mis-specification and the robustness of the discrete Weibull model. Using imputation with discrete Weibull regression to analyze discrete data will increase explanatory power and is widely applicable to various types of dispersion data with a unified model.
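
The claim that one model covers both under- and over-dispersed counts can be seen from the probability mass function. The sketch below assumes the type I discrete Weibull parameterization, P(X = x) = q^(x^β) − q^((x+1)^β) for x = 0, 1, 2, ...; the abstract does not state the parameterization, and the values of q and β are arbitrary illustrations. It covers only the distribution itself, not the regression or imputation steps.

```python
def dw_pmf(x, q, beta):
    """P(X = x) for the type I discrete Weibull: q**(x**beta) - q**((x + 1)**beta)."""
    return q ** (x ** beta) - q ** ((x + 1) ** beta)


def dw_moments(q, beta, upper=500):
    """Mean and variance by direct summation over 0..upper (truncated series)."""
    probs = [dw_pmf(x, q, beta) for x in range(upper + 1)]
    mean = sum(x * p for x, p in enumerate(probs))
    second = sum(x * x * p for x, p in enumerate(probs))
    return mean, second - mean ** 2


# Arbitrary illustrative parameters: small beta gives a heavier tail.
for beta in (0.8, 1.0, 3.0):
    mean, var = dw_moments(q=0.7, beta=beta)
    print(f"beta={beta}: mean={mean:.3f} variance={var:.3f} ratio={var / mean:.3f}")
```

With β = 1 the distribution reduces to the geometric distribution, which is over-dispersed. Larger β pulls the variance-to-mean ratio below 1.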

Dose-Response Relationship of Avian Influenza Virus Based on Feeding Trials in Humans and Chickens

  • 박선일;이제용;전종민
    • 한국임상수의학회지 / Vol. 28, No. 1 / pp. 101-107 / 2011
  • This study aimed to determine the dose-response (DR) curve of avian influenza (AI) virus in order to predict the probability of illness or adverse health effects that may result from exposure to a pathogenic microorganism in a quantitative microbial risk assessment. To determine the parametric DR relationship of several strains of AI virus, 7 feeding trial data sets challenging humans (5 sets) and chickens (2 sets) with strains of H3N2 (4 sets), H5N1 (2 sets) and H1N1 (1 set) were taken from the published literature. Except for one data set (data set no. 6, a study with intra-tracheal inoculation), all were obtained from studies with intranasal inoculation. The data were analyzed using three types of DR model to account for heterogeneity in the infectivity of AI strains in humans and chickens: exponential, beta-binomial and beta-Poisson. The models were fitted to the data by maximum likelihood estimation to obtain the parameter estimates of each model. The alpha and beta values of the beta-Poisson DR model ranged over 0.06-0.19 and 1.7-48.8, respectively, for the H3N2 strain. Corresponding values for H5N1 ranged over 0.464-0.563 and 97.3-99.4, respectively. For H1N1 the parameter values were 0.103 and 12.7, respectively. Using the exponential model, r (the infectivity parameter) ranged from $1.6{\times}10^{-8}$ to $1.2{\times}10^{-5}$ for H3N2 and from $7.5{\times}10^{-3}$ to $4.0{\times}10^{-2}$ for H5N1, while the value was $1.6{\times}10^{-8}$ for H1N1. The beta-Poisson DR model provided the best fit to 5 of the 7 data sets tested, and the estimated parameter values in the beta-binomial model were very close to those of the beta-Poisson model. Our study indicated that the beta-binomial or beta-Poisson model could be the choice for DR modeling of AI, even though the DR relationship varied depending on the virus strain studied, as indicated in prior studies. Further DR modeling should be conducted to quantify the differences among AI virus strains.
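
To show how the reported parameters translate into response probabilities, the sketch below evaluates the exponential model, P = 1 − exp(−r·d), and the widely used approximate beta-Poisson form, P = 1 − (1 + d/β)^(−α). The abstract does not say which beta-Poisson form was fitted, so the approximate form is an assumption. The parameter values are the H1N1 estimates quoted above; the doses are hypothetical, with units left unspecified because the abstract does not give them.

```python
import math


def exponential_dr(dose, r):
    """Exponential model: P(response) = 1 - exp(-r * dose)."""
    return -math.expm1(-r * dose)


def beta_poisson_dr(dose, alpha, beta):
    """Approximate beta-Poisson model: P(response) = 1 - (1 + dose / beta) ** (-alpha)."""
    return 1.0 - (1.0 + dose / beta) ** (-alpha)


def beta_poisson_median_dose(alpha, beta):
    """Dose giving a 50% response under the approximate beta-Poisson model."""
    return beta * (2.0 ** (1.0 / alpha) - 1.0)


# H1N1 estimates quoted in the abstract; the doses are hypothetical.
R_H1N1, ALPHA_H1N1, BETA_H1N1 = 1.6e-8, 0.103, 12.7
for dose in (1e2, 1e4, 1e6):
    p_exp = exponential_dr(dose, R_H1N1)
    p_bp = beta_poisson_dr(dose, ALPHA_H1N1, BETA_H1N1)
    print(f"dose={dose:.0e}: exponential={p_exp:.3e} beta-Poisson={p_bp:.3f}")
print(f"median dose (beta-Poisson): {beta_poisson_median_dose(ALPHA_H1N1, BETA_H1N1):.0f}")
```

The two models are parameterized differently, so the curves need not agree at a given dose; which one is preferable depends on the fit to each data set.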

Exploration of Enterprise Innovation Sources through Patent Analysis: Comparison of High-Tech Industries and Mid-Tech Industries

  • 황규희;이중만
    • Journal of Information Technology Applications and Management / Vol. 21, No. 4_spc / pp. 331-344 / 2014
  • This study explores the difference in innovation sources between high-tech and mid-tech industries through patent analysis. After extracting 119 firms surveyed in both the 2007 HCCP (Human Capital Corporate Panel) and the 2005~2006 Korea Innovation Survey, their patent applications filed with the Korean Intellectual Property Office in 2007~2012 are analysed, mainly through a negative binomial regression model. The results show that external information sources can have opposite effects on technological innovation depending on technological level and industrial characteristics. The current results remain limited in statistical significance, mainly because of the limited number of observations and the limited information available.

Analysis of counts in the one-way layout

  • 이선호
    • 응용통계연구 / Vol. 10, No. 1 / pp. 105-119 / 1997
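  • In count data arranged in a one-way layout, treatment effects can be compared using the group means. Barnwal and Paul (1988) derived a likelihood test statistic and a $C(\alpha)$ statistic for testing differences among treatments under the assumption that the dispersion parameter is the same in every group. This study generalizes those statistics so that the test can be carried out even when that assumption does not hold. In addition, new statistics are proposed by adopting the double exponential family Poisson model of Efron (1986) in place of the negative binomial distribution. Simulations show that the $C(\alpha)$ statistic derived from the double exponential family Poisson model is appropriate in every case.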

Likelihood Based Confidence Intervals for the Difference of Proportions in Two Doubly Sampled Data with a Common False-Positive Error Rate

  • Lee, Seung-Chun
    • Communications for Statistical Applications and Methods / Vol. 17, No. 5 / pp. 679-688 / 2010
  • Lee (2010) developed a confidence interval for the difference of binomial proportions in two doubly sampled data sets subject to false-positive errors. The confidence interval seems to be adequate for a general double sampling model subject to false-positive misclassification. However, in many applications, the false-positive error rates could be the same. This note therefore considers the construction of asymptotic confidence intervals when the false-positive error rates are common. The coverage behavior of nine likelihood-based confidence intervals is examined. It is shown that the confidence interval based on the Rao score with the expected information performs well in terms of coverage probability and expected width.

A Study of Non-parametric Statistical Tests to Quantify the Change of Water Quality

  • 이상훈
    • 환경영향평가 / Vol. 6, No. 1 / pp. 111-119 / 1997
  • This study was carried out to suggest the best statistical test for quantifying the change in water quality between two groups. The traditional t-test may not be usable when the normality of the underlying population distribution is not assured. Three non-parametric tests, which are based on the relative order of the measurements, were studied to assess their applicability in water quality data analysis. The sign test is based on the sign of the deviation of each measurement from the median value, and the binomial distribution table is used. The signed rank test utilizes not only the sign but also the magnitude of the deviation. The Wilcoxon rank-sum test, which is essentially the same as the Mann-Whitney test, tests the mean difference between two independent samples which may have missing data. Among the three non-parametric tests studied, the signed rank test was found to be applicable to quantifying the change in water quality between two samples.
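
The sign test mentioned in this abstract can be computed exactly from the binomial distribution instead of a printed table. The sketch below is a minimal illustration, assuming a two-sided test with zero deviations dropped. The deviations are hypothetical values, not water quality measurements from the study.

```python
import math


def sign_test(deviations):
    """Exact two-sided sign test of H0: the median deviation is zero.

    Zero deviations are dropped; the number of positive signs is compared
    with a Binomial(n, 0.5) distribution.
    Returns (number of positive signs, n, two-sided p-value).
    """
    signs = [d for d in deviations if d != 0]
    n = len(signs)
    positives = sum(1 for d in signs if d > 0)
    k = min(positives, n - positives)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return positives, n, min(1.0, 2 * tail)


# Hypothetical deviations from a reference median (illustrative only).
deviations = [1.2, 0.8, -0.3, 2.1, 0.5, 1.7, -0.1, 0.9, 1.4, 0.6]
positives, n, p_value = sign_test(deviations)
print(f"{positives} positive signs out of {n}, two-sided p = {p_value:.4f}")
```

Because only the signs enter the calculation, the magnitudes of the deviations do not affect the p-value, which is the information the signed rank test adds.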

Forecasting evaluation via parametric bootstrap for threshold-INARCH models

  • Kim, Deok Ryun;Hwang, Sun Young
    • Communications for Statistical Applications and Methods / Vol. 27, No. 2 / pp. 177-187 / 2020
  • This article is concerned with forecasting and forecast evaluation for threshold-asymmetric volatility models for time series of count data. In particular, threshold integer-valued models with conditional Poisson and conditional negative binomial distributions are highlighted. Based on the parametric bootstrap method, some evaluation measures are discussed in terms of one-step-ahead forecasting. A parametric bootstrap procedure is described, from which a directional measure, a magnitude measure and the expected cost of misclassification are obtained to evaluate competing models. Cholera data from Bangladesh for 1988 to 2016 are analyzed as a real application.
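
The one-step-ahead bootstrap idea can be sketched for a conditional Poisson model. The threshold rule below, in which the slope on the previous count switches once that count exceeds a threshold, is an assumed simple specification and may differ from the models in the article. The parameter values are hypothetical and treated as already estimated, so only the simulation step of a parametric bootstrap is shown.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method (small lam)."""
    limit = math.exp(-lam)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def conditional_mean(prev, omega, a_low, a_high, threshold):
    """Assumed threshold rule: the slope depends on whether prev exceeds the threshold."""
    slope = a_low if prev <= threshold else a_high
    return omega + slope * prev


def bootstrap_forecast(last_value, params, n_boot=5000, seed=1):
    """Simulate the one-step-ahead predictive distribution from the fitted model.

    Returns (conditional mean, bootstrap mean, central 95% interval).
    """
    rng = random.Random(seed)
    lam = conditional_mean(last_value, *params)
    draws = sorted(poisson_draw(lam, rng) for _ in range(n_boot))
    mean = sum(draws) / n_boot
    lower = draws[round(0.025 * n_boot)]
    upper = draws[round(0.975 * n_boot) - 1]
    return lam, mean, (lower, upper)


# Hypothetical parameters: (omega, a_low, a_high, threshold).
params = (1.0, 0.3, 0.6, 5)
lam, mean, interval = bootstrap_forecast(last_value=8, params=params)
print(f"conditional mean={lam:.2f} bootstrap mean={mean:.2f} 95% interval={interval}")
```

Evaluation measures such as a directional measure would then be computed by comparing the simulated forecasts with the observed next value across the series.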

Two Stage Small Area Estimation

  • 이상은;신기일
    • 응용통계연구 / Vol. 25, No. 2 / pp. 293-300 / 2012
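  • Much research is under way on small area estimation, a statistical technique used when small samples are allocated to areas or domains and the precision of the estimates is therefore low. The data used in small area estimation are classified as unit level data or area level data. This paper proposes a two-stage small area estimation method in which small area estimates are first obtained from unit level data and spatial statistical techniques are then applied to those estimates to obtain the final small area estimates. The proposed method uses both the information in the unit level data and the spatial information in the area level data, and is a new approach that can improve the precision of estimation. The superiority of the two-stage method is confirmed through a simulation study using data from the Economically Active Population Survey.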

Weighted zero-inflated Poisson mixed model with an application to Medicaid utilization data

  • Lee, Sang Mee;Karrison, Theodore;Nocon, Robert S.;Huang, Elbert
    • Communications for Statistical Applications and Methods / Vol. 25, No. 2 / pp. 173-184 / 2018
  • In medical or public health research, it is common to encounter clustered or longitudinal count data that exhibit excess zeros. For example, health care utilization data often have a multi-modal distribution with excess zeros as well as a multilevel structure where patients are nested within physicians and hospitals. To analyze this type of data, zero-inflated count models with mixed effects have been developed, in which a count response variable is assumed to be distributed as a mixture of a Poisson or negative binomial distribution and a point mass at zero, with random effects included. However, no study has considered a situation where data are also censored due to the finite nature of the observation period or follow-up. In this paper, we present a weighted version of the zero-inflated Poisson model with random effects that accounts for variable individual follow-up times. We suggest two different types of weight function. The performance of the proposed model is evaluated and compared to a standard zero-inflated mixed model through simulation studies. This approach is then applied to Medicaid data analysis.
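
The mixture structure described in this abstract can be made concrete with the basic zero-inflated Poisson distribution. The sketch below covers only the unweighted distribution without random effects; the weight functions proposed in the paper are not given in the abstract and are not reproduced. The values of `lam` and `pi` are hypothetical.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson with structural-zero probability pi."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return pi * (k == 0) + (1.0 - pi) * poisson


def zip_mean_variance(lam, pi):
    """Closed-form mean and variance of the zero-inflated Poisson."""
    mean = (1.0 - pi) * lam
    variance = mean * (1.0 + pi * lam)
    return mean, variance


# Hypothetical parameters: Poisson mean 2.0, 30% structural zeros.
lam, pi = 2.0, 0.3
mean, variance = zip_mean_variance(lam, pi)
print(f"P(Y=0)={zip_pmf(0, lam, pi):.3f} versus Poisson P(Y=0)={math.exp(-lam):.3f}")
print(f"mean={mean:.2f} variance={variance:.2f}")
```

The variance exceeds the mean whenever `pi` is positive, which is why excess zeros appear as overdispersion when a plain Poisson model is fitted.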