Title/Summary/Keyword: Sample Sizes


Effective Sample Sizes for the Test of Mean Differences Based on Homogeneity Test

  • Heo, Sunyeong
    • Journal of Integrative Natural Science / v.12 no.3 / pp.91-99 / 2019
  • Many researchers in various fields use the two-sample t-test to confirm treatment effects. The two-sample t-test is generally used for small samples and assumes that two independent random samples are selected from normal populations whose variances are unknown. Researchers often conduct an F-test for equality of variances before testing the treatment effect, since the test statistic and confidence interval for the two-sample t-test take two forms depending on whether the variances are assumed equal. Researchers using the two-sample t-test therefore want to know how large a sample they need to obtain reliable results. This research provides sample-size guidelines through simulation. The simulation was run for normal populations with different ratios of the two variances and different sample sizes (≤ 30). The results are as follows. First, if one has no information about the equality of variances but can assume the difference is moderate, a sample size of at least 20 is safe in terms of the nominal level of significance. Second, the power of the F-test for equality of variances is very low for small sample sizes (<30), even when the ratio of the two variances is 2. Third, sample sizes of at least 10 are recommended for the two-sample t-test in terms of the nominal level of significance and the error limit.
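
A minimal simulation sketch of the procedure this abstract studies (not the author's code; the variance ratio, sample size, and repetition count are illustrative): it estimates the empirical Type I error of the two-sample t-test when a preliminary F-test chooses between the pooled and Welch versions.

```python
import numpy as np
from scipy import stats

def empirical_alpha(n, var_ratio, reps=20_000, alpha=0.05, seed=None):
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        x = rng.normal(0.0, 1.0, size=n)                 # population 1: N(0, 1)
        y = rng.normal(0.0, np.sqrt(var_ratio), size=n)  # population 2: same mean, scaled variance
        # Preliminary two-sided F-test for equality of variances
        f = np.var(x, ddof=1) / np.var(y, ddof=1)
        p_f = 2 * min(stats.f.cdf(f, n - 1, n - 1), stats.f.sf(f, n - 1, n - 1))
        # Pooled t-test if the F-test does not reject, Welch's t-test otherwise
        _, p_t = stats.ttest_ind(x, y, equal_var=(p_f > alpha))
        rejections += p_t < alpha
    return rejections / reps

print(empirical_alpha(n=20, var_ratio=2.0, seed=1))  # should stay close to the nominal 0.05
```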

Sample Size Determination and Evaluation of Form Errors

  • Chang, Sung Ho;Kim, Sunn Ho
    • Journal of Korean Society for Quality Management / v.22 no.3 / pp.85-98 / 1994
  • In current coordinate measuring machine practice, there are no commonly accepted sample sizes for estimating form errors with a stated statistical confidence. In practice, sample size planning is important for geometrical tolerance inspection with a coordinate measuring machine. We determine and validate appropriate sample sizes for form error estimation, and we develop estimation methods with specified confidence levels, based on the obtained sample sizes, for various form errors: straightness, flatness, circularity, and cylindricity.
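
The paper's specific sampling plans are not reproduced here, but the flatness case can be illustrated with a small hypothetical sketch: fit a least-squares reference plane to n probed points and take the peak-to-valley residual as the form error estimate. The probe locations, noise level, and candidate n below are made-up values.

```python
import numpy as np

def flatness_error(points):
    """Peak-to-valley distance of the points from a least-squares reference plane."""
    X = np.column_stack([points[:, 0], points[:, 1], np.ones(len(points))])
    coef, *_ = np.linalg.lstsq(X, points[:, 2], rcond=None)
    residuals = points[:, 2] - X @ coef
    return residuals.max() - residuals.min()

rng = np.random.default_rng(0)
n = 25                                                   # candidate sample size to evaluate
xy = rng.uniform(0, 100, size=(n, 2))                    # probe locations on the part (mm)
z = 0.001 * xy[:, 0] + rng.normal(0, 0.002, size=n)      # slight tilt plus measurement noise (mm)
points = np.column_stack([xy, z])
print(f"estimated flatness error with n={n}: {flatness_error(points):.4f} mm")
```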

An elaboration on sample size determination for correlations based on effect sizes and confidence interval width: a guide for researchers

  • Mohamad Adam Bujang
    • Restorative Dentistry and Endodontics / v.49 no.2 / pp.21.1-21.8 / 2024
  • Objectives: This paper serves as a guide to sample size determination for various correlation analyses based on effect sizes and confidence interval width. Materials and Methods: Sample size determinations are calculated for Pearson's correlation, Spearman's rank correlation, and Kendall's tau-b correlation. Examples of sample size statements and their justification are also included. Results: For the same effect sizes, the sample size requirements of the 3 statistical tests differ. Based on an empirical calculation, a minimum sample size of 149 is usually adequate for both parametric and non-parametric correlation analysis to detect at least a moderate to excellent degree of correlation with an acceptable confidence interval width. Conclusions: Determining the data assumption(s) is one of the challenges in offering a valid technique to estimate the required sample size for correlation analyses. Sample size tables are provided to help researchers estimate a minimum sample size requirement for correlation analyses.
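
The paper's tables may be derived differently (and also cover Spearman and Kendall correlations); as an orientation only, the sketch below uses the common Fisher z-transformation approach for Pearson's r to find the smallest n whose confidence interval for an anticipated correlation has a target total width. The anticipated r and width are illustrative inputs.

```python
import math
from scipy import stats

def n_for_ci_width(r, target_width, conf=0.95):
    """Smallest n whose Fisher-z confidence interval for r has total width <= target_width."""
    z_crit = stats.norm.ppf(1 - (1 - conf) / 2)
    z_r = math.atanh(r)                  # Fisher z of the anticipated correlation
    n = 4                                # the approximation requires n > 3
    while True:
        se = 1.0 / math.sqrt(n - 3)
        lower = math.tanh(z_r - z_crit * se)
        upper = math.tanh(z_r + z_crit * se)
        if upper - lower <= target_width:
            return n
        n += 1

print(n_for_ci_width(r=0.5, target_width=0.2))   # anticipated moderate correlation
```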

Approximate Confidence Limits for the Ratio of Two Binomial Variates with Unequal Sample Sizes

  • Cho, Hokwon
    • Communications for Statistical Applications and Methods / v.20 no.5 / pp.347-356 / 2013
  • We propose a sequential method to construct approximate confidence limits for the ratio of two independent sequences of binomial variates with unequal sample sizes. Due to the nonexistence of an unbiased estimator for the ratio, we develop the procedure based on a modified maximum likelihood estimator (MLE). We generalize the results of Cho and Govindarajulu (2008) by defining the sample ratio when sample sizes are not equal. In addition, we investigate the large-sample properties of the proposed estimator and its finite-sample behavior through numerical studies, and we make comparisons from the viewpoint of sample information.
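
The paper's sequential, modified-MLE procedure is not reproduced here. For orientation only, the sketch below computes the standard large-sample (Katz) log-scale confidence interval for the ratio of two binomial proportions with unequal sample sizes; the counts are illustrative.

```python
import math
from scipy import stats

def ratio_ci(x1, n1, x2, n2, conf=0.95):
    """Katz log-scale confidence interval for p1/p2 with unequal sample sizes n1, n2."""
    p1, p2 = x1 / n1, x2 / n2
    ratio = p1 / p2
    se_log = math.sqrt((1 - p1) / (n1 * p1) + (1 - p2) / (n2 * p2))
    z = stats.norm.ppf(1 - (1 - conf) / 2)
    return ratio, (ratio * math.exp(-z * se_log), ratio * math.exp(z * se_log))

print(ratio_ci(x1=30, n1=80, x2=25, n2=120))   # illustrative counts, unequal sample sizes
```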

Determination of Sample Sizes of Bivariate Efficacy and Safety Outcomes (이변량 효능과 안전성 이항변수의 표본수 결정방법)

  • Lee, Hyun-Hak;Song, Hae-Hiang
    • The Korean Journal of Applied Statistics / v.22 no.2 / pp.341-353 / 2009
  • We consider a sample-size determination problem motivated by comparative clinical trials in which patient outcomes are characterized by a bivariate outcome of efficacy and safety. Thall and Cheng (1999) presented a sample size methodology for the case of bivariate binary outcomes. We propose a bivariate Wilcoxon-Mann-Whitney (WMW) statistic for sample-size determination with binary outcomes; this nonparametric method can equally be used to determine sample sizes for ordinal outcomes. The two methods rely on the same testing strategy for the target parameters but differ in the test statistic: an asymptotic bivariate normal statistic of the transformed proportions in Thall and Cheng (1999) versus the nonparametric bivariate WMW statistic in our method. Sample sizes are calculated for the two experimental oncology trials described in Thall and Cheng (1999); for the first trial the sample sizes based on the bivariate WMW statistic are smaller than those of Thall and Cheng (1999), while for the second trial the reverse is true.
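
The bivariate efficacy-and-safety statistic is beyond a short example, but the WMW ingredient can be sketched for a single binary endpoint: simulate the test's power at candidate per-arm sizes and take the smallest size reaching a target power. The response rates, candidate sizes, and power target below are hypothetical.

```python
import numpy as np
from scipy import stats

def simulated_power(n, p_control, p_treat, reps=2_000, alpha=0.05, seed=None):
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        x = rng.binomial(1, p_treat, size=n)     # binary efficacy responses, treatment arm
        y = rng.binomial(1, p_control, size=n)   # binary efficacy responses, control arm
        _, p = stats.mannwhitneyu(x, y, alternative="two-sided")
        hits += p < alpha
    return hits / reps

for n in range(20, 201, 20):                     # candidate per-arm sample sizes
    if simulated_power(n, p_control=0.30, p_treat=0.50, seed=3) >= 0.80:
        print("smallest n per arm (approximate):", n)
        break
```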

Reproducibility and Sample Size in High-Dimensional Data (고차원 자료의 재현성과 표본 수)

  • Seo, Won-Seok;Choi, Jee-A;Jeong, Hyeong-Chul;Cho, Hyung-Jun
    • The Korean Journal of Applied Statistics / v.23 no.6 / pp.1067-1080 / 2010
  • A number of methods have been developed to determine sample sizes in clinical trials, and most clinical trial organizations determine sample sizes with these methods. In contrast, determining the sample sizes needed for experiments using microarray chips remains unsatisfactory and is not widely practiced. In this paper, our objective is to provide a guideline for determining sample sizes by utilizing the reproducibility of real microarray data. In the reproducibility comparison, five methods for discovering differential expression are used: fold change, the two-sample t-test, the Wilcoxon rank-sum test, SAM, and LPE. To normalize gene expression values, both the MAS5 and RMA methods are considered. The agreement of the top 20 and top 100 gene lists is also compared across different numbers of replicates. Our proposed approach adds more realistic information to existing sample-size determination methods.
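
A rough sketch of the reproducibility idea (not the paper's pipeline, which also uses SAM, LPE, and real data processed with MAS5/RMA): rank genes by two-sample t statistics in two disjoint replicate sets and measure the overlap of the top-k lists. The dimensions, injected signal, and split below are hypothetical.

```python
import numpy as np
from scipy import stats

def top_k_overlap(expr_a, expr_b, labels_a, labels_b, k=100):
    """Fraction of overlap between the top-k gene lists of two independent replicate sets."""
    def top_genes(expr, labels):
        t, _ = stats.ttest_ind(expr[:, labels == 1], expr[:, labels == 0], axis=1)
        return set(np.argsort(-np.abs(t))[:k])
    return len(top_genes(expr_a, labels_a) & top_genes(expr_b, labels_b)) / k

rng = np.random.default_rng(1)
genes, n_rep = 5000, 10                                  # hypothetical chip counts
expr = rng.normal(size=(genes, 4 * n_rep))               # two groups x two replicate sets
labels = np.repeat([0, 1], 2 * n_rep)                    # group membership per array
expr[:200, labels == 1] += 1.0                           # inject signal into the first 200 genes
split = np.tile([True, False], 2 * n_rep)                # divide arrays into two halves
print(top_k_overlap(expr[:, split], expr[:, ~split], labels[split], labels[~split]))
```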

A Comparison of the Reliability Estimation Accuracy between Bayesian Methods and Classical Methods Based on Weibull Distribution (와이블분포 하에서 베이지안 기법과 전통적 기법 간의 신뢰도 추정 정확도 비교)

  • Cho, HyungJun;Lim, JunHyoung;Kim, YongSoo
    • Journal of Korean Institute of Industrial Engineers / v.42 no.4 / pp.256-262 / 2016
  • The Weibull distribution is widely used in reliability analysis, and several studies have attempted to improve estimation of its parameters. Least squares estimation (LSE) and maximum likelihood estimation (MLE) are often used to estimate distribution parameters, but Bayesian methods have been shown to be more suitable for small sample sizes than LSE and MLE. In this work, the Weibull parameter estimation accuracy of LSE, MLE, and a Bayesian method is compared for sample sets with 3 to 30 data points. The Bayesian method was most accurate for sample sizes under 25, and its accuracy approached that of LSE and MLE as the sample size increased.
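
The paper's Bayesian estimator is not reproduced here, but the two classical baselines it compares against can be sketched: maximum likelihood via scipy and least squares via median-rank regression, applied to one simulated small sample. The true shape, scale, and n below are illustrative.

```python
import numpy as np
from scipy import stats

def lse_weibull(sample):
    """Median-rank regression on the linearized Weibull CDF: ln(-ln(1-F)) = k*ln(x) - k*ln(lam)."""
    x = np.sort(sample)
    n = len(x)
    f = (np.arange(1, n + 1) - 0.3) / (n + 0.4)          # Bernard's median ranks
    slope, intercept = np.polyfit(np.log(x), np.log(-np.log(1 - f)), 1)
    return slope, np.exp(-intercept / slope)             # (shape k, scale lambda)

rng = np.random.default_rng(2)
true_shape, true_scale, n = 1.5, 100.0, 10               # a small-sample scenario
sample = true_scale * rng.weibull(true_shape, size=n)

k_mle, _, lam_mle = stats.weibull_min.fit(sample, floc=0)
k_lse, lam_lse = lse_weibull(sample)
print(f"true : shape={true_shape:.2f}, scale={true_scale:.2f}")
print(f"MLE  : shape={k_mle:.2f}, scale={lam_mle:.2f}")
print(f"LSE  : shape={k_lse:.2f}, scale={lam_lse:.2f}")
```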

The Effect of Sample and Particle Sizes in Discrete Particle Swarm Optimization for Simulation-based Optimization Problems (시뮬레이션 최적화 문제 해결을 위한 이산 입자 군집 최적화에서 샘플수와 개체수의 효과)

  • Yim, Dong-Soon
    • Journal of Korean Society of Industrial and Systems Engineering / v.40 no.1 / pp.95-104 / 2017
  • This paper deals with solution methods for discrete, multi-valued optimization problems whose objective function incorporates noise, as occurs when fitness is evaluated by computer-based experiments such as Monte Carlo simulation or discrete event simulation. Metaheuristics including the Genetic Algorithm (GA) and Discrete Particle Swarm Optimization (DPSO) can be used to solve these simulation-based multi-valued optimization problems. In applying these population-based metaheuristics to simulation-based optimization, the sample size used to estimate the expected fitness of a solution and the population (particle) size in a generation (step) must be chosen carefully to obtain reliable solutions; under a realistic restriction on available computation time, there is a trade-off between the two. In this paper, the effects of sample and population sizes are analyzed on well-known multi-modal, multi-dimensional test functions with randomly generated noise. The experimental results show that DPSO outperforms GA. While the appropriate choice of population size matters more than sample size in GA, the appropriate choice of sample size matters more than particle size in DPSO. In particular, in DPSO, increasing the sample size over steps yields worse solutions than keeping it constant or decreasing it. Furthermore, DPSO improves when OCBA (Optimal Computing Budget Allocation) is incorporated to select the best particle in each step; when applying OCBA in DPSO, a smaller incremental sample size is preferred for better solutions.
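
A toy sketch of the trade-off the paper studies (it is not DPSO or OCBA code): with noisy fitness evaluations, the probability of correctly identifying the best of a fixed set of candidate particles grows with the per-particle sample size, while under a fixed budget every extra sample comes at the cost of fewer particles per step. The candidate qualities and noise level below are made up.

```python
import numpy as np

def prob_correct_selection(samples_each, true_means, noise_sd=1.0, reps=5_000, seed=None):
    """Chance that averaging `samples_each` noisy evaluations per candidate picks the true best."""
    rng = np.random.default_rng(seed)
    best = int(np.argmin(true_means))                    # minimization problem
    correct = 0
    for _ in range(reps):
        estimates = [rng.normal(m, noise_sd, samples_each).mean() for m in true_means]
        correct += int(np.argmin(estimates) == best)
    return correct / reps

true_means = np.array([0.0, 0.2, 0.4, 0.6, 0.8])         # hypothetical particle qualities
for m in (1, 5, 10, 30):                                 # samples per particle per step
    print(m, prob_correct_selection(m, true_means, seed=4))
```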

Effect of Positively Skewed Distribution on the Two sample t-test: Based on Chi-square Distribution

  • Heo, Sunyeong
    • Journal of Integrative Natural Science / v.14 no.3 / pp.123-129 / 2021
  • This research examines the effect of a positively skewed population distribution on the two-sample t-test through simulation. For the simulation, two independent samples were selected from the same chi-square distributions with 3, 5, 10, 15, 20, and 30 degrees of freedom and sample sizes of 3, 5, 10, 15, 20, and 30, respectively. The chi-square distribution is strongly skewed to the right at small degrees of freedom and becomes symmetric as the degrees of freedom increase. The simulation results show that when the sampled populations are positively skewed, like a chi-square distribution with small degrees of freedom, the F-test for equality of variances performs poorly even at relatively large degrees of freedom and sample sizes of 30 for both, so its use should be avoided. When the two population variances are equal, the skewness of the population distribution does not affect the t-test in terms of the confidence level. However, even though the t-test achieved the nominal confidence level for highly positively skewed distributions with sample sizes as small as three or five, the error limits are very large at such small sizes. Therefore, if the sampled population is expected to be highly skewed to the right, a relatively large sample size, of at least 20, is recommended.
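
A minimal sketch in the spirit of this simulation (not the author's code; the degrees of freedom, sample sizes, and repetition count are illustrative): estimate the empirical confidence level of the pooled two-sample t interval when both samples come from the same right-skewed chi-square population.

```python
import numpy as np
from scipy import stats

def coverage(n, df, reps=20_000, conf=0.95, seed=None):
    """Empirical coverage of the pooled two-sample t interval under a skewed population."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        x = rng.chisquare(df, size=n)
        y = rng.chisquare(df, size=n)
        sp2 = (x.var(ddof=1) + y.var(ddof=1)) / 2        # pooled variance (equal n)
        half = stats.t.ppf(1 - (1 - conf) / 2, 2 * n - 2) * np.sqrt(sp2 * 2 / n)
        diff = x.mean() - y.mean()
        hits += (diff - half) <= 0.0 <= (diff + half)    # true mean difference is zero
    return hits / reps

for n in (3, 5, 10, 20, 30):
    print(n, coverage(n, df=3, seed=5))                  # df=3: strongly right-skewed population
```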

Small Sample Characteristics of Generalized Estimating Equations for Categorical Repeated Measurements (범주형 반복측정자료를 위한 일반화 추정방정식의 소표본 특성)

  • 김동욱;김재직
    • The Korean Journal of Applied Statistics / v.15 no.2 / pp.297-310 / 2002
  • Liang and Zeger proposed generalized estimating equations(GEE) for analyzing repeated data which is discrete or continuous. GEE model can be extended to model for repeated categorical data and its estimator has asymptotic multivariate normal distribution in large sample sizes. But GEE is based on large sample asymptotic theory. In this paper, we study the properties of GEE estimators for repeated ordinal data in small sample sizes. We generate ordinal repeated measurements for two groups using two methods. Through Monte Carlo simulation studies we investigate the empirical type 1 error rates, powers, relative efficiencies of the GEE estimators, the effect of unequal sample size of two groups, and the performance of variance estimators for polytomous ordinal response variables, especially in small sample sizes.