• Title/Summary/Keyword: sample variance

Unbiased Balanced Half-Sample Variance Estimation in Stratified Two-stage Sampling

  • Kim, Kyu-Seong
    • Journal of the Korean Statistical Society / v.27 no.4 / pp.459-469 / 1998
  • The balanced half-sample (BHS) method is a simple variance estimation method for complex sampling designs. Because it is simple and flexible, it has been widely used in large-scale sample surveys. However, the usual BHS method overestimates the true variance in without-replacement sampling and two-stage cluster sampling. Focusing on this point, we propose an unbiased BHS variance estimator for stratified two-stage cluster sampling and describe how to implement it. Finally, a partially balanced half-sample design is explained as a tool for reducing the number of replications required by the proposed estimator.
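As a rough illustration of the usual BHS method mentioned in this abstract (not the unbiased estimator the paper proposes), the sketch below assumes the textbook setting of two sampled units per stratum and a stratified mean as the estimator. The function names `hadamard` and `bhs_variance`, the stratum weights, and the data are all hypothetical. Half-samples are selected with the columns of a Sylvester-type Hadamard matrix, whose orthogonality is what makes the replicates "balanced".

```python
def hadamard(order):
    """Sylvester construction of a +1/-1 Hadamard matrix; order must be a power of two."""
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def bhs_variance(weights, pairs):
    """Usual BHS variance estimate of a stratified mean.

    weights[h] is the stratum weight W_h; pairs[h] holds the two sampled
    values (y_h1, y_h2) of stratum h. Illustrative assumption: exactly two
    units per stratum.
    """
    n_strata = len(pairs)
    n_reps = 1
    while n_reps < n_strata + 1:  # smallest power of two exceeding the number of strata
        n_reps *= 2
    h_mat = hadamard(n_reps)

    # Full-sample estimate: weighted sum of the stratum means.
    full = sum(w * (a + b) / 2 for w, (a, b) in zip(weights, pairs))

    total = 0.0
    for r in range(n_reps):
        # Column 0 of a Sylvester matrix is all +1, so strata use columns 1..n_strata.
        est = 0.0
        for h, (w, (a, b)) in enumerate(zip(weights, pairs)):
            est += w * (a if h_mat[r][h + 1] == 1 else b)
        total += (est - full) ** 2
    return total / n_reps


weights = [0.5, 0.3, 0.2]
pairs = [(10.0, 12.0), (20.0, 26.0), (5.0, 1.0)]
v_bhs = bhs_variance(weights, pairs)

# For this linear estimator, balance makes the BHS value equal the textbook
# with-replacement estimator sum_h W_h^2 (y_h1 - y_h2)^2 / 4.
v_textbook = sum(w ** 2 * (a - b) ** 2 / 4 for w, (a, b) in zip(weights, pairs))
```

The sketch ignores finite population corrections and second-stage sampling, which is the setting where, according to the abstract, the usual BHS method overestimates the true variance.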

A Sanov-Type Proof of the Joint Sufficiency of the Sample Mean and the Sample Variance

  • Kim, Chul-Eung;Park, Byoung-Seon
    • Journal of the Korean Statistical Society / v.24 no.2 / pp.563-568 / 1995
  • It is well known that the sample mean and the sample variance are jointly sufficient under the normality assumption. In this paper, a proof of the joint sufficiency is given without using the factorization criterion. It is related to a finite Sanov-type conditional theorem: the conditional probability density of $Y_1$ given sample mean $\mu$ and sample variance $\sigma^2$, where $Y_1, Y_2, \cdots, Y_n$ are independently and identically distributed (i.i.d.) normal random variables with mean $m$ and variance $\delta^2$, equals that of $Y_1$ given sample mean $\mu$ and sample variance $\sigma^2$, where $Y_1, Y_2, \cdots, Y_n$ are i.i.d. normal random variables with mean $\mu$ and variance $\sigma^2$.

Variance Estimation Using Poststratified Complex Sample

  • Kim, Kyu-Seong
    • Communications for Statistical Applications and Methods / v.6 no.1 / pp.131-142 / 1999
  • Estimators for domains and approximate estimators of their variances are derived using a post-stratified complex sample. Furthermore, we propose an adjusted variance estimator of a domain mean for the case where the post-stratified complex sample is treated as a simple random sample. A simulation study based on data from the Farm Household Economy Survey is presented to compare the variance estimators numerically. The study shows that our adjusted variance estimator compensates considerably for the underestimation problem.

A Study on Sample Variance (표본분산에 대한 고찰)

  • Jang Dae-Heung
    • The Korean Journal of Applied Statistics / v.18 no.3 / pp.689-699 / 2005
  • We usually use $S^2=\frac{\sum_{i=1}^{n}(X_i-\bar{X})^2}{n-1}$ as the sample variance, whereas Korean high school textbooks use $S^2_n=\frac{\sum_{i=1}^{n}(X_i-\bar{X})^2}{n}$. We can compare these two definitions of the sample variance through their theoretical relationship and through simulation.
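The comparison described in this abstract can be sketched with a small Monte Carlo experiment. This is an illustrative setup, not the simulation design of the paper: the helper names `sample_variances` and `simulate`, the normal population, and the parameter values (`n=5`, `sigma=2.0`, `reps=20000`) are all assumptions. The two definitions are tied by the exact identity $S^2_n = \frac{n-1}{n} S^2$, so $S^2_n$ is biased downward by the factor $(n-1)/n$.

```python
import random


def sample_variances(xs):
    """Return (S^2, S_n^2): sum of squared deviations divided by n-1 and by n."""
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)
    return ss / (n - 1), ss / n


def simulate(n=5, reps=20000, sigma=2.0, seed=0):
    """Average both estimators over many normal samples of size n."""
    rng = random.Random(seed)
    total_s2 = total_s2n = 0.0
    for _ in range(reps):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        s2, s2n = sample_variances(xs)
        total_s2 += s2
        total_s2n += s2n
    return total_s2 / reps, total_s2n / reps


mean_s2, mean_s2n = simulate()
# Theory for sigma = 2, n = 5: E[S^2] = 4.0 and E[S_n^2] = (4/5) * 4.0 = 3.2,
# so the simulated averages should land near those values.
print(f"average S^2   = {mean_s2:.3f}")
print(f"average S_n^2 = {mean_s2n:.3f}")
```

Unbiasedness is only one criterion. Under normality the divisor $n$ gives a smaller mean squared error than $n-1$, which is one reason both definitions remain in use.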

Estimation of the Mean and Variance for Normal Distributions whose Both Sides are Truncated

  • Hong, Chong-Sun;Choi, Yun-Young
    • Communications for Statistical Applications and Methods / v.9 no.1 / pp.249-259 / 2002
  • To estimate the mean and variance of a normal distribution that is truncated on both the left and right sides, maximum likelihood estimators based on the entire sample from the original distribution are compared, by simulation, with the sample mean and variance of the censored sample, that is, the data remaining after truncation. We found that, surprisingly, the mean squared error of the mean based on the censored data is smaller than that of the full-sample estimators.

Design and efficiency of the variance component model control chart (분산성분모형 관리도의 설계와 효율)

  • Cho, Chan Yang;Park, Changsoon
    • Journal of the Korean Data and Information Science Society / v.28 no.5 / pp.981-999 / 2017
  • In the standard control chart, which assumes a simple random model, the process variance is estimated without considering the between-sample variance. If between-sample variance exists in the process, the process variance is underestimated. When the process variance is underestimated, the narrower control limits result in an excessive false alarm rate, although the sensitivity of the control chart is improved. In this paper, using the variance component model to incorporate the between-sample variance, we set the control limits using both the within- and between-sample variances and evaluate the efficiency of the control chart in terms of the average run length (ARL). Considering the most widely used control chart types, namely the ${\bar{X}}$, EWMA, and CUSUM control charts, we compare two cases, Case I and Case II, in which the between-sample variance is ignored and considered, respectively. We also consider both the case where the process parameters are given and the case where they are estimated. The results show that the false alarm rate of Case I increases sharply as the between-sample variance increases, while that of Case II remains the same regardless of the size of the between-sample variance, as expected.

Calculating Sample Variance for the Combined Data (두 자료들의 평균과 분산을 이용한 혼합자료의 분산 계산)

  • Shin, Mi-Young;Cho, Tae-Kyoung
    • The Korean Journal of Applied Statistics / v.21 no.1 / pp.177-182 / 2008
  • There are times when we need additional samples to obtain a more accurate estimator. Since the two sets of samples carry information about the same population, it is necessary to treat them as a single combined data set. In this paper we present the unpooled sample variance for the combined data when only the sample mean and variance of each data set are known, without the raw data. It is shown that the pooled variance $s^2_p$ is always greater than the exact variance $s^2_t$ when ${\bar{x}}_n = {\bar{y}}_m$. Moreover, the larger the difference between the two means, ${\bar{x}}_n-{\bar{y}}_m$, the larger the difference between $s^2_p$ and $s^2_t$.
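The calculation this abstract refers to can be sketched from the standard sum-of-squares decomposition. The paper's own derivation and notation may differ, and the function name `combined_variance` and the example data are hypothetical. With sample sizes $n$ and $m$, the total sum of squares about the combined mean equals the within-sample part $(n-1)s_x^2 + (m-1)s_y^2$ plus the between-sample part $\frac{nm}{n+m}(\bar{x}_n - \bar{y}_m)^2$.

```python
import statistics


def combined_variance(n, mean_x, var_x, m, mean_y, var_y):
    """Return (exact, pooled) variances of the combined data from summary statistics.

    var_x and var_y are sample variances computed with divisors n-1 and m-1.
    """
    ss_within = (n - 1) * var_x + (m - 1) * var_y
    ss_between = n * m / (n + m) * (mean_x - mean_y) ** 2
    exact = (ss_within + ss_between) / (n + m - 1)  # s_t^2: variance of the merged data
    pooled = ss_within / (n + m - 2)                # s_p^2: ignores the mean difference
    return exact, pooled


# Check against the raw data (only needed here for verification).
x = [1.0, 2.0, 3.0]
y = [4.0, 6.0, 8.0, 10.0]
exact, pooled = combined_variance(
    len(x), statistics.mean(x), statistics.variance(x),
    len(y), statistics.mean(y), statistics.variance(y),
)
direct = statistics.variance(x + y)  # agrees with `exact` up to rounding error
```

When the two means coincide, the between-sample term vanishes and the only difference is the divisor ($n+m-1$ versus $n+m-2$), which is consistent with the abstract's statement that $s^2_p$ exceeds $s^2_t$ in that case.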

Cusum Control Chart for Monitoring Process Variance (공정분산 관리를 위한 누적합 관리도)

  • Lee, Yoon-Dong;Kim, Sang-Ik
    • Journal of Korean Society for Quality Management / v.33 no.3 / pp.149-155 / 2005
  • The cusum control chart is used to control the process mean. We consider the problem of using a cusum chart to control the process variance. Previous research has considered the same problem. The main difficulty in that research was deriving the ARL function, which characterizes the properties of the chart. Unlike the sample mean, the sample variance follows a chi-squared type distribution even when the quality characteristics are assumed to be normally distributed. The ARL function of the cusum chart is described by a type of integral equation. Since the solution of the integral equation for non-normal distributions is not well known, researchers have used simulation instead of solving the integral equation directly, or an approximation based on taking the logarithm of the sample variance. Recently, a new method for solving the integral equation for the Erlang distribution was published. Here we consider the steps needed to apply that solution to the problem of controlling the process variance.
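For reference, the standard distributional fact behind the chi-squared statement can be written as follows. It assumes i.i.d. normal observations with variance $\sigma^2$ and a subgroup size $n$, symbols that the abstract itself does not define:

$$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}, \qquad \text{equivalently} \qquad S^2 \sim \mathrm{Gamma}\!\left(\text{shape}=\frac{n-1}{2},\ \text{scale}=\frac{2\sigma^2}{n-1}\right).$$

A gamma distribution with an integer shape parameter is an Erlang distribution, so $S^2$ is Erlang-distributed whenever $n$ is odd. This is presumably the link to the Erlang solution mentioned in the abstract, although the abstract does not spell it out.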

Cusum control chart for monitoring process variance (공정분산 관리를 위한 누적합 관리도)

  • Lee, Yoon-Dong;Kim, Sang-Ik
    • Proceedings of the Korean Society for Quality Management Conference / 2006.04a / pp.135-141 / 2006
  • The cusum control chart is used to control the process mean. We consider the problem of using a cusum chart to control the process variance. Previous research has considered the same problem. The main difficulty in that research was deriving the ARL function, which characterizes the properties of the chart. Unlike the sample mean, the sample variance follows a chi-squared type distribution even when the quality characteristics are assumed to be normally distributed. The ARL function of the cusum chart is described by a type of integral equation. Since the solution of the integral equation for non-normal distributions is not well known, researchers have used simulation instead of solving the integral equation directly, or an approximation based on taking the logarithm of the sample variance. Recently, a new method for solving the integral equation for the Erlang distribution was published. Here we consider the steps needed to apply that solution to the problem of controlling the process variance.

Development of a method of the data generation with maintaining quantile of the sample data

  • Joohyung Lee;Young-Oh Kim
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.244-244 / 2023
  • Both the frequency and the magnitude of hydrometeorological extreme events such as severe floods and droughts are increasing. To prevent damage from climatic disasters, hydrological models are often run under various meteorological conditions. In such simulations, synthetic data generated by time series models that maintain the key statistical characteristics of the sample data are widely used. However, although synthetic data can easily maintain both the average and the variance of the sample data, the quantile is not maintained well. In this study, we propose a data generation method that maintains the quantile of the sample data well. The equations of the existing maintenance of variance extension (MOVE) are extended to maintain the quantile rather than the average or the variance of the sample data. The equations are derived and the coefficients are determined based on the characteristics of the sample data that we aim to preserve. Monte Carlo simulation is used to assess the performance of the proposed data generation method. A time series (data length of 500) is regarded as the sample data, and a data set (data length of 30) for simulation is selected randomly from it. The selected data set is then expanded from length 30 to 500 using the proposed method. The differences in the average, the variance, and the quantile between the sample data and the expanded data are evaluated with the relative root mean square error for each simulation. The simulation results show that each equation designed to maintain a given characteristic of the data performs well. Moreover, the expanded data preserve the quantile of the sample data more precisely than data expanded through the conventional time series model.
