• Title/Summary/Keyword: statistical estimators

Investigation of Bacterial Diversity in Membrane Bioreactor and Conventional Activated Sludge Processes from Petroleum Refineries Using Phylogenetic and Statistical Approaches

  • Silva, Cynthia;Jesus, Ederson C.;Torres, Ana P. R.;Sousa, Maira P.;Santiago, Vania M. J.;Oliveira, Valeria M.
    • Journal of Microbiology and Biotechnology / v.20 no.3 / pp.447-459 / 2010
  • The bacterial diversity of two distinct wastewater treatment systems at petroleum refineries, conventional activated sludge (CAS) and membrane bioreactor (MBR), was investigated through 16S rRNA gene libraries. Sequencing and phylogenetic analysis showed that the bacterial community composition of sludge samples was distinct between the two wastewater treatment systems. MBR clones belonged predominantly to Class Betaproteobacteria, represented mainly by genera Thiobacillus and Thauera, whereas CAS clones were mostly related to Class Alphaproteobacteria, represented by uncultured bacteria related to Order Parvularculales. The richness estimators ACE and Chao revealed that the diversity observed in both libraries at the species level underestimates the total bacterial diversity present in the environment, and that further sampling would increase the observed diversity. Shannon and Simpson diversity indices differed between the libraries and revealed greater bacterial diversity for the MBR library at an evolutionary distance of 0.03. LIBSHUFF analyses revealed that MBR and CAS communities were significantly different at the 95% confidence level ($P \leq 0.05$) for distances $0 \leq D \leq 0.20$. This work described, qualitatively and quantitatively, the structure of bacterial communities in industrial-scale MBR and CAS processes of petroleum refinery wastewater treatment systems and demonstrated clearly differentiated communities responsible for the stable performance of the wastewater treatment plants.
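
As a rough illustration of the richness and diversity measures named in this abstract, the sketch below computes a Chao1 richness estimate and the Shannon index from per-OTU clone counts. It assumes that "Chao" refers to Chao1 and uses its bias-corrected form; the function names and the example counts are invented for illustration and are not taken from the paper.

```python
import math

def chao1(otu_counts):
    """Bias-corrected Chao1 richness estimate from per-OTU clone counts."""
    counts = [c for c in otu_counts if c > 0]
    s_obs = len(counts)                      # observed number of OTUs
    f1 = sum(1 for c in counts if c == 1)    # singletons
    f2 = sum(1 for c in counts if c == 2)    # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

def shannon(otu_counts):
    """Shannon diversity index H = -sum(p_i * ln p_i)."""
    counts = [c for c in otu_counts if c > 0]
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts)

# Hypothetical clone library: 7 observed OTUs, 3 singletons, 2 doubletons.
counts = [10, 5, 2, 2, 1, 1, 1]
print(chao1(counts))    # estimated richness, at least the observed 7
print(shannon(counts))
```

A Chao1 value above the observed OTU count is what the abstract describes as the observed diversity underestimating the total diversity.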

A sampling design for e-learning industry status survey on the business demand sector (이러닝수요부문 사업체실태조사를 위한 표본설계)

  • Kim, Hea-Jung;Kwak, Hwa-Ryun
    • Journal of the Korean Data and Information Science Society / v.24 no.4 / pp.701-712 / 2013
  • The e-learning industry status survey provides information about the actual conditions of supply and demand in the e-learning industries. NIPA (National IT Industry Promotion Agency) has published an annual report of the survey results since 2004. Because the 9th revision of the KSIC (Korean Standard Industrial Classification) was issued in 2008, the sampling design for the survey needs to be refined, especially for the business demand sector. This article, based on the 9th revision of the KSIC, constructs a stratification of the target population used for the e-learning industry status survey on the business demand sector. Strata in the business population are classified by industrial type and employment scale of the business. Under the stratified population, we design a sampling scheme using the power allocation method, which enables us to satisfy a target coefficient of variation for each industrial stratum. To secure accurate survey results under the proposed sampling design, we consider the calculation of the design weights, the derivation of parameter estimators, and formulas for their standard errors.

Output Data Analysis of Simulation: A Review (시뮬레이션 출력 자료 분석에 관한 연구)

  • Chang, Byeong-Yun
    • Journal of the Korea Society for Simulation / v.21 no.3 / pp.11-16 / 2012
  • Simulation is the imitation of the operation of a real-world process or system over time. It concerns the study of the operating characteristics of real systems. Typically, a simulation project consists of several steps such as data collection, coding, model verification, model validation, experimental design, output data analysis, and implementation. Among these steps of a simulation study, this paper focuses on statistical analysis methods for simulation output data. Specifically, we explain how to develop confidence interval estimators for the mean $\mu$ in terminating and non-terminating simulation cases. We then explore estimation techniques for $f(\mu)$, where $f(\cdot)$ is a nonlinear function that is continuously differentiable in a neighborhood of $\mu$ with $f'(\mu) \neq 0$.
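
For orientation, the two kinds of interval this abstract mentions can be written in a standard textbook form. This is a sketch under the assumption of $n$ independent replications with outputs $X_1, \ldots, X_n$, sample mean $\bar{X}_n$, and sample standard deviation $S_n$; the notation is not taken from the paper. The second interval follows from the delta method and is only asymptotically valid.

```latex
\[
\bar{X}_n \pm t_{n-1,\,1-\alpha/2}\,\frac{S_n}{\sqrt{n}},
\qquad
f(\bar{X}_n) \pm t_{n-1,\,1-\alpha/2}\,\bigl|f'(\bar{X}_n)\bigr|\,\frac{S_n}{\sqrt{n}}
\]
```

In the non-terminating case the same form is commonly applied to batch means from one long run in place of the replication outputs, provided the batches are long enough to be treated as approximately independent.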

A comparison study of inverse censoring probability weighting in censored regression (중도절단 회귀모형에서 역절단확률가중 방법 간의 비교연구)

  • Shin, Jungmin;Kim, Hyungwoo;Shin, Seung Jun
    • The Korean Journal of Applied Statistics / v.34 no.6 / pp.957-968 / 2021
  • Inverse censoring probability weighting (ICPW) is a popular technique in survival data analysis. In applications of the ICPW technique such as censored regression, it is crucial to estimate the censoring probability accurately. A simulation study is undertaken in this article to see how the censoring probability estimate influences model performance in censored regression using the ICPW scheme. We compare three censoring probability estimators: the Kaplan-Meier (KM) estimator, the Cox proportional hazards model estimator, and the local KM estimator. For the local KM estimator, we propose to reduce the predictor dimension to avoid the curse of dimensionality and consider two popular dimension reduction tools: principal component analysis and sliced inverse regression. We found that the Cox proportional hazards model estimator shows the best performance as a censoring probability estimator in both mean and median censored regressions.
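
To make the weighting idea concrete, the sketch below computes ICPW weights with the plain KM estimator of the censoring survival function, the simplest of the three estimators compared. It is an illustrative sketch, not the authors' code: the function name is hypothetical, the data are invented, and tied observation times are assumed not to occur.

```python
def censoring_km_weights(times, events):
    """ICPW weights delta_i / G_hat(T_i-), where G_hat is the Kaplan-Meier
    estimate of the censoring survival function P(C > t).

    times  : observed times min(T, C)
    events : 1 if the event was observed, 0 if censored
    Assumes no tied times (a simplifying assumption of this sketch).
    """
    n = len(times)
    order = sorted(range(n), key=lambda i: times[i])
    weights = [0.0] * n          # censored observations keep weight 0
    g = 1.0                      # G_hat just before the current time
    at_risk = n
    for i in order:
        if events[i] == 1:
            weights[i] = 1.0 / g             # uncensored: 1 / G_hat(T_i-)
        else:
            g *= 1.0 - 1.0 / at_risk         # KM step: censoring is the "event"
        at_risk -= 1
    return weights

# Hypothetical data: five subjects, two of them censored.
times = [1.0, 2.0, 3.0, 4.0, 5.0]
events = [1, 0, 1, 0, 1]
w = censoring_km_weights(times, events)
print(w)
```

These weights would then multiply each uncensored observation's loss term in the mean or median regression objective, so that later events, which are more likely to have been censored, count for more.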

Robust confidence interval for random coefficient autoregressive model with bootstrap method (붓스트랩 방법을 적용한 확률계수 자기회귀 모형에 대한 로버스트 구간추정)

  • Jo, Na Rae;Lim, Do Sang;Lee, Sung Duck
    • The Korean Journal of Applied Statistics / v.32 no.1 / pp.99-109 / 2019
  • We compared the confidence intervals of estimators obtained with various bootstrap methods for a Random Coefficient Autoregressive (RCA) model. We consider a Quasi score estimator and an M-Quasi score estimator that uses the Huber, Tukey, Andrews and Hampel functions as bounded functions and does not require a distributional assumption. A standard bootstrap method, a percentile bootstrap method, a studentized bootstrap method and a hybrid bootstrap method were each applied to the estimators. In a simulation study, we compared the asymptotic confidence intervals of the Quasi score and M-Quasi score estimators with the bootstrap confidence intervals from the four bootstrap methods when the error term of the RCA model follows the normal distribution, the contaminated normal distribution and the double exponential distribution, respectively.
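
Of the four bootstrap intervals listed, the percentile interval is the easiest to show in a few lines. The sketch below applies it to the mean of an i.i.d. sample, which is a deliberate simplification: it is not the resampling scheme the paper would need for an RCA time series, and the function name, defaults, and data are assumptions made for illustration.

```python
import random
import statistics

def percentile_bootstrap_ci(data, stat=statistics.mean, n_boot=2000,
                            alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(data), i.i.d. case."""
    rng = random.Random(seed)            # fixed seed for reproducibility
    n = len(data)
    # Resample with replacement and recompute the statistic each time.
    reps = sorted(
        stat([rng.choice(data) for _ in range(n)]) for _ in range(n_boot)
    )
    lo = reps[round((alpha / 2) * n_boot)]
    hi = reps[round((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical sample.
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 2.5]
lo, hi = percentile_bootstrap_ci(sample)
print(lo, hi)
```

The interval endpoints are simply empirical quantiles of the bootstrap replicates, which is why no variance formula or distributional assumption is needed.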

Reliability using Cronbach alpha in sample survey (표본조사에서 크론바흐알파값을 사용한 신뢰성)

  • Park, Hyeonah
    • The Korean Journal of Applied Statistics / v.34 no.1 / pp.1-8 / 2021
  • Abstract concepts in social research must be measured with tools whose validity and reliability are assured. An observed score produced by a measurement tool can be divided into a valid score, a biased score, and an error. The presence or absence of a biased value is associated with validity, and the presence or absence of an error value is associated with reliability. There are many techniques for checking whether a measurement tool is valid and reliable, for example construct validity using factor analysis and internal consistency based on the Cronbach alpha. Because the Cronbach alpha is calculated from a sample, in this study we suggest an estimator of the Cronbach alpha under complex sample design and nonresponse. In a simulation, the proposed method is compared with many other existing estimators of the Cronbach alpha under a multivariate normal distribution.
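
For reference, the sketch below computes the ordinary, unweighted Cronbach alpha for complete responses. It is the textbook simple-random-sample formula, not the complex-design, nonresponse-adjusted estimator the paper proposes; the function name and the scores are invented for illustration.

```python
import statistics

def cronbach_alpha(items):
    """Unweighted Cronbach alpha.

    items: list of k lists, each holding one item's scores for the same
    n respondents (complete responses assumed).
    """
    k = len(items)
    sum_item_vars = sum(statistics.variance(scores) for scores in items)
    totals = [sum(row) for row in zip(*items)]   # per-respondent total score
    return k / (k - 1) * (1 - sum_item_vars / statistics.variance(totals))

# Hypothetical 3-item scale answered by 5 respondents.
items = [
    [3, 4, 2, 5, 4],
    [2, 4, 3, 5, 3],
    [3, 5, 2, 4, 4],
]
print(cronbach_alpha(items))
```

Alpha approaches 1 when the items move together, because the variance of the total score then far exceeds the sum of the individual item variances.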

Use of various drought indices to analysis drought characteristics under climate change in the Doam watershed

  • Sayed Shajahan Sadiqi;Eun-Mi Hong;Won-Ho Nam
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.178-178 / 2023
  • Drought and flooding have historically coexisted in Korea, occurring at different times and with varying cycles and trends. The Palmer Drought Severity Index (PDSI), Standardized Precipitation Index (SPI), and Standardized Precipitation Evapotranspiration Index (SPEI) were computed in order to statistically analyze annual and periodic drought occurrence and to objectively evaluate statistical characteristics such as the periodicity, tendency, and frequency of droughts in the Doam watershed. To compute potential evapotranspiration (PET), both Thornthwaite (Thor) and Penman-Monteith (PM) parameterizations were considered, and the differences between the two PET estimators were analyzed. SPIs 3 and SPIs 6 revealed a tendency for drought to worsen in spring and winter and to ease in summer in the study area. This seasonal variability did not appear in SPIs 12 and PDSI as it did in the short-period drought indices. The trend analysis indicated that drought from winter to spring is becoming more severe and lasting longer. The recurrence period of drought ranged from 3 to 6 years at the longest: SPIs 3 showed a short period of around 1 year, SPIs 6 and SPIs 12 had a period of 4 to 6 years, and PDSI had a period of roughly 6 years. Based on the PDSI, SPI, and SPEI, drought severity increases under climate change conditions as precipitation decreases and water demand rises with temperature. Our findings therefore show that national and practical measures are needed both for winter and spring droughts, which happen every year, and for large-scale and extreme droughts, which happen every six years.

Performance of a Bayesian Design Compared to Some Optimal Designs for Linear Calibration (선형 캘리브레이션에서 베이지안 실험계획과 기존의 최적실험계획과의 효과비교)

  • 김성철
    • The Korean Journal of Applied Statistics / v.10 no.1 / pp.69-84 / 1997
  • We consider a linear calibration problem, $y_i = \alpha + \beta (x_i - x_0) + \epsilon_i$, $i = 1, 2, \ldots, n$, and $y_f = \alpha + \beta (x_f - x_0) + \epsilon$, where we observe the $(x_i, y_i)$'s in the controlled calibration experiments and later make inference about $x_f$ from a new observation $y_f$. The objective of the calibration design problem is to find the optimal design $x = (x_1, \ldots, x_n)$ that gives the best estimates for $x_f$. We compare Kim (1989)'s Bayesian design, which minimizes the expected value of the posterior variance of $x_f$, with some optimal designs from the literature. Kim suggested the Bayesian optimal design based on the analysis of the characteristics of the expected loss function and numerical must be equal to the prior mean and that the sum of squares be as large as possible. The designs to be compared are (1) Buonaccorsi (1986)'s AV optimal design, which minimizes the average asymptotic variance of the classical estimators, (2) D-optimal and A-optimal designs for the linear regression model, which optimize some functions of $M(x) = \sum x_i x_i'$, and (3) Hunter & Lamboy (1981)'s reference design from their paper. To compare these designs, each of which is optimal in some sense, we consider two criteria. First, we compare them by the expected posterior variance criterion; second, we perform a Monte Carlo simulation to obtain the HPD intervals and compare their lengths. If the prior mean of $x_f$ is at the center of the finite design interval, then the Bayesian, AV optimal, D-optimal and A-optimal designs are identical, and all are the equally weighted end-point design. However, if the prior mean is not at the center, they are not expected to be identical. In this case, we demonstrate that the almost Bayesian-optimal design was slightly better than the approximate AV optimal design. We also investigate the effects of the prior variance of the parameters and the solution for the case when the number of experiments is odd.
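
The calibration model above can be illustrated with the classical (inverse-regression) point estimate of $x_f$: fit $\alpha$ and $\beta$ by least squares on the calibration data, then invert the fitted line at the new observation $y_f$. This is a minimal sketch of the classical estimator only, not of the Bayesian design or of any of the optimal designs compared in the paper; the function name and the data are hypothetical.

```python
def classical_calibration(xs, ys, y_f, x0=0.0):
    """Classical estimate of x_f in y = alpha + beta * (x - x0) + error."""
    n = len(xs)
    zs = [x - x0 for x in xs]                 # centered design points
    z_bar = sum(zs) / n
    y_bar = sum(ys) / n
    sxx = sum((z - z_bar) ** 2 for z in zs)
    beta = sum((z - z_bar) * (y - y_bar) for z, y in zip(zs, ys)) / sxx
    alpha = y_bar - beta * z_bar              # least-squares intercept at x0
    return x0 + (y_f - alpha) / beta          # invert the fitted line

# Hypothetical end-point design on [0, 3] with noisy responses.
xs = [0.0, 0.0, 3.0, 3.0]
ys = [-1.1, -0.9, 8.2, 7.8]
print(classical_calibration(xs, ys, y_f=5.0, x0=1.0))
```

The denominator `sxx` is the sum of squares of the design points about their mean, so spreading the points to the ends of the interval, as in the end-point design, makes the slope estimate and hence the inversion more stable.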

Preliminary test estimation method accounting for error variance structure in nonlinear regression models (비선형 회귀모형에서 오차의 분산에 따른 예비검정 추정방법)

  • Yu, Hyewon;Lim, Changwon
    • The Korean Journal of Applied Statistics / v.29 no.4 / pp.595-611 / 2016
  • We use nonlinear regression models (such as the Hill model) when we analyze data in toxicology and/or pharmacology. In nonlinear regression models, both the parameter estimator and the estimate of its uncertainty are influenced by the variance structure of the error. Thus, estimation methods should differ depending on whether the data are homoscedastic or heteroscedastic. However, we do not know the variance structure of the error until we actually analyze the data. Therefore, developing estimation methods that are robust to the variance structure of the error is an important problem. In this paper we propose a method to estimate parameters in nonlinear regression models based on a preliminary test. We define an estimator that uses either the ordinary least squares estimation method or the iterative weighted least squares estimation method according to the result of a simple preliminary test for the equality of the error variances. The performance of the proposed estimator is compared with that of existing estimators by simulation studies. We also compare estimation methods using real data obtained from the National Toxicology Program of the United States.

Analysis of the cause-specific proportional hazards model with missing covariates (누락된 공변량을 가진 원인별 비례위험모형의 분석)

  • Minjung Lee
    • The Korean Journal of Applied Statistics / v.37 no.2 / pp.225-237 / 2024
  • In the analysis of competing risks data, some covariates may not be fully observed for some subjects. In such cases, excluding subjects with missing covariate values from the analysis may result in biased estimates and loss of efficiency. In this paper, we studied multiple imputation and the augmented inverse probability weighting method for regression parameter estimation in the cause-specific proportional hazards model with missing covariates. The performance of the estimators obtained from multiple imputation and the augmented inverse probability weighting method is evaluated by simulation studies, which show that both methods perform well. The two methods were applied to investigate significant risk factors for death from breast cancer and from other causes, using breast cancer data with missing values for tumor size obtained from the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial. Under the cause-specific proportional hazards model, the methods show that race, marital status, stage, grade, and tumor size are significant risk factors for breast cancer mortality, and that stage has the greatest effect on increasing the risk of breast cancer death. Age at diagnosis and tumor size have significant effects on increasing the risk of other-cause death.