• Title/Abstract/Keywords: statistical testing


Statistical Investigation on Class Mutation Operators

  • Ma, Yu-Seung;Kwon, Yong-Rae;Kim, Sang-Woon
    • ETRI Journal / Vol. 31, No. 2 / pp.140-150 / 2009
  • Although mutation testing is potentially powerful, it is a computationally expensive testing method. To investigate how we can reduce the cost of object-oriented mutation testing, we have conducted empirical studies on class mutation operators. We applied class mutation operators to 866 classes contained in six open-source programs. An analysis of the number and the distribution of class mutants generated and preliminary data on the effectiveness of some operators are provided. Our study shows that the overall number of class mutants is smaller than for traditional mutants, which offers the possibility that class mutation can be made practically affordable.


Intrinsic Bayes Factors for Exponential Model Comparison with Censored Data

  • Kim, Dal-Ho;Kang, Sang-Gil;Kim, Seong W.
    • Journal of the Korean Statistical Society / Vol. 29, No. 1 / pp.123-135 / 2000
  • This paper addresses Bayesian hypothesis testing for the comparison of exponential populations under type II censoring. In Bayesian testing problems, conventional Bayes factors typically cannot accommodate noninformative priors, which are improper and defined only up to arbitrary constants. To overcome this problem, we use a recently proposed hypothesis testing criterion called the intrinsic Bayes factor. We derive the arithmetic, expected, and median intrinsic Bayes factors for our problem. Monte Carlo simulation is used to calculate the intrinsic Bayes factors, which are compared with the P-values of the classical test.


The Sequential Testing of Multiple Outliers in Linear Regression

  • Park, Jinpyo;Park, Heechang
    • Communications for Statistical Applications and Methods / Vol. 8, No. 2 / pp.337-346 / 2001
  • In this paper, we consider the problem of identifying and testing outliers in linear regression. First, we consider testing the null hypothesis of no outliers and propose a test based on the ratio of two scale estimates. We obtain the asymptotic distribution of the test statistic by Monte Carlo simulation and investigate its properties. Next, we consider the problem of identifying the outliers. A forward sequential procedure based on the suggested test is proposed and shown to perform fairly well. The forward sequential procedure is unaffected by masking and swamping effects because the test statistic is based on a robust estimate.


Intrinsic Priors for Testing Two Normal Means with the Default Bayes Factors

  • Bae, Jongsig;Kim, Hyunsoo;Kim, Seong W.
    • Journal of the Korean Statistical Society / Vol. 29, No. 4 / pp.443-454 / 2000
  • In Bayesian model selection or testing problems involving models of different dimensions, conventional Bayes factors with improper noninformative priors are not well defined. The intrinsic Bayes factor and the fractional Bayes factor overcome this problem by using a data-splitting idea and a fraction of the likelihood, respectively. This article addresses Bayesian testing for the comparison of two normal means with unknown variance. We derive proper intrinsic priors whose Bayes factors are asymptotically equivalent to the corresponding fractional Bayes factor. We demonstrate our results with two examples.


A Two Sample Test for Functional Data

  • Lee, Jong Soo;Cox, Dennis D.;Follen, Michele
    • Communications for Statistical Applications and Methods / Vol. 22, No. 2 / pp.121-135 / 2015
  • We consider testing the equality of mean functions from two samples of functional data. A novel test based on the adaptive Neyman methodology applied to Hotelling's T-squared statistic is proposed. Under the enlarged null hypothesis that the distributions of the two populations are the same, randomization methods are proposed to find a null distribution that gives accurate significance levels. An extensive simulation study shows that the proposed test works very well in comparison with several other methods under a variety of alternatives and is one of the best methods for all alternatives, whereas the other methods all show weak power at some alternatives. An application to a real-world data set demonstrates the applicability of the method.
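
The randomization idea in this abstract can be illustrated with a minimal sketch. It is not the authors' adaptive Neyman test: the statistic below is a simpler stand-in (the largest absolute difference between the two sample mean curves), and the function names, the number of permutations, and the assumption that all curves are observed on a common grid are illustrative choices only.

```python
import random


def mean_curve(curves):
    """Pointwise mean of curves observed on a common grid."""
    n = len(curves)
    return [sum(c[j] for c in curves) / n for j in range(len(curves[0]))]


def sup_mean_diff(a, b):
    """Stand-in statistic: largest absolute gap between the two mean curves."""
    return max(abs(u - v) for u, v in zip(mean_curve(a), mean_curve(b)))


def randomization_test(a, b, n_perm=999, seed=0):
    """Return (observed statistic, randomization p-value).

    Under the null that both samples share one distribution, group labels
    are exchangeable, so the labels are shuffled and the statistic recomputed.
    """
    rng = random.Random(seed)
    observed = sup_mean_diff(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if sup_mean_diff(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (1 + hits) / (1 + n_perm)
```

Calling `randomization_test(sample_a, sample_b)` with two lists of equal-length curves returns the statistic and its p-value; any other statistic can be substituted for `sup_mean_diff` without changing the randomization step.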

A comparison of the statistical methods for testing the equality of two survival distributions

  • 정미남;이재원
    • 응용통계연구 (The Korean Journal of Applied Statistics) / Vol. 11, No. 1 / pp.113-127 / 1998
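  • In the analysis of survival data, comparing the survival distributions of two groups is often of interest. For survival data with censoring, tests based on the log-rank statistic and on Gehan's generalized Wilcoxon statistic have been the main methods for testing the equality of two survival distributions. However, these two test statistics are not appropriate in every situation: their power varies greatly with the shapes of the two survival distributions and the degree of censoring. In this paper, we therefore use simulation to compare several test statistics proposed for comparing two survival distributions under a variety of settings and, based on the results, provide useful information for choosing an appropriate statistic in a given situation.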
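
For readers unfamiliar with the log-rank statistic mentioned above, the following is a minimal sketch of its standard textbook form for two right-censored samples. The function name and input layout (parallel lists of times and event indicators, where 1 means the event was observed and 0 means censored) are assumptions for illustration and are not taken from the paper.

```python
import math


def logrank_test(times1, events1, times2, events2):
    """Return (chi-squared statistic with 1 df, p-value) for the log-rank test."""
    data = [(t, e, 0) for t, e in zip(times1, events1)]
    data += [(t, e, 1) for t, e in zip(times2, events2)]
    event_times = sorted({t for t, e, _ in data if e})

    obs_minus_exp = 0.0  # observed minus expected events in group 1
    variance = 0.0
    for t in event_times:
        n = sum(1 for s, _, _ in data if s >= t)               # at risk, both groups
        n1 = sum(1 for s, _, g in data if s >= t and g == 0)   # at risk, group 1
        d = sum(1 for s, e, _ in data if s == t and e)         # events at time t
        d1 = sum(1 for s, e, g in data if s == t and e and g == 0)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of d1 given the margins at time t.
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)

    stat = obs_minus_exp ** 2 / variance if variance > 0 else 0.0
    # Survival function of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value
```

Gehan's generalized Wilcoxon statistic differs by weighting each event time by the number at risk, which gives early differences more influence; that weighting is one reason the two tests can differ in power.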


Tests for homogeneity of proportions in clustered binomial data

  • Jeong, Kwang Mo
    • Communications for Statistical Applications and Methods / Vol. 23, No. 5 / pp.433-444 / 2016
  • When we observe binary responses in a cluster (such as rat lab subjects), they are usually correlated with each other. In clustered binomial counts, the independence assumption is violated and we encounter extra variation. In the presence of extra variation, the ordinary statistical analyses of binomial data are inappropriate. In testing the homogeneity of proportions between several treatment groups, the classical Pearson chi-squared test has a severe flaw in controlling Type I error rates. We focus on modifying the chi-squared statistic by incorporating variance inflation factors. We suggest a method that adjusts the data using a dispersion estimate based on a quasi-likelihood model. We explain the testing procedure via an illustrative example and compare the performance of the modified chi-squared test with competing statistics through a Monte Carlo study.

Cumulative Sums of Residuals in GLMM and Its Implementation

  • Choi, DoYeon;Jeong, KwangMo
    • Communications for Statistical Applications and Methods / Vol. 21, No. 5 / pp.423-433 / 2014
  • Test statistics using cumulative sums of residuals have been widely used in various regression models, including generalized linear models (GLMs). Recently, Pan and Lin (2005) extended this testing procedure to generalized linear mixed models (GLMMs) with random effects, in which we encounter difficulties in computing the marginal likelihood, which is expressed as an integral over the random effects distribution. The Gaussian quadrature algorithm is commonly used to approximate the marginal likelihood. Many commercial statistical packages provide an option to apply this type of goodness-of-fit test in GLMs, but available programs are very rare for GLMMs. We suggest a computational algorithm that implements the testing procedure in GLMMs using a freely accessible R package, and illustrate it through practical examples.

Count Five Statistics Using Trimmed Mean

  • Hong, Chong-Sun;Jun, Jae-Woon
    • Communications for Statistical Applications and Methods / Vol. 13, No. 2 / pp.309-318 / 2006
  • There are many statistical methods for testing the equality of two population variances. Among them, the well-known F test is very sensitive to the normality assumption. Several other tests that do not assume normality have been proposed, but these tests usually need tables of critical values or software for hypothesis testing. McGrath and Yeh (2005) suggested a quick and compact Count Five test that requires only counting the number of extreme points. Since the Count Five test uses only extreme values, it discards some information from the samples, often resulting in a loss of power. In this paper, an alternative Count Five test using the trimmed mean is proposed and its properties are discussed for some distributions and normal mixtures.
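
The counting rule behind the original Count Five test can be sketched as follows, based on how the McGrath and Yeh procedure is commonly described: centre each sample at its mean, count how many absolute deviations in one sample exceed the largest absolute deviation in the other, and reject equal variances when either count reaches five. The function name, the `threshold` parameter, and the centring by the sample mean are assumptions for illustration; this is not the trimmed-mean variant proposed in the paper.

```python
from statistics import mean


def count_five(x, y, threshold=5):
    """Return (count_x, count_y, reject) for a Count Five style variance test.

    count_x is the number of absolute deviations in x that exceed every
    absolute deviation in y, and count_y is the reverse.
    """
    mx, my = mean(x), mean(y)
    dev_x = [abs(v - mx) for v in x]
    dev_y = [abs(v - my) for v in y]
    count_x = sum(d > max(dev_y) for d in dev_x)
    count_y = sum(d > max(dev_x) for d in dev_y)
    reject = count_x >= threshold or count_y >= threshold
    return count_x, count_y, reject
```

Only the extreme deviations enter the decision, which is the information loss the abstract refers to. Replacing `mean` with a trimmed mean changes only the centring step.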

On inference of multivariate means under ranked set sampling

  • Rochani, Haresh;Linder, Daniel F.;Samawi, Hani;Panchal, Viral
    • Communications for Statistical Applications and Methods / Vol. 25, No. 1 / pp.1-13 / 2018
  • In many studies, a researcher attempts to describe a population in which units are measured for multiple outcomes, or responses. In this paper, we present an efficient procedure based on ranked set sampling to estimate and perform hypothesis testing on a multivariate mean. The method is based on ranking on an auxiliary covariate, which is assumed to be correlated with the multivariate response, in order to improve the efficiency of the estimation. We show that the proposed estimators developed under this sampling scheme are unbiased, have smaller variance in the multivariate sense, and are asymptotically Gaussian. We also demonstrate that the efficiency of the multivariate regression estimator can be improved by using ranked set sampling. A bootstrap routine is developed in the statistical software R to perform inference when the sample size is small. We use a simulation study to investigate the performance of the method under known conditions and apply the method to biomarker data collected in the China Health and Nutrition Survey (CHNS 2009).
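
The sampling scheme described above can be sketched as a standard balanced ranked set sample in which ranking uses only the auxiliary covariate. The paper's own routine is written in R and is not reproduced here; the function names, the `(covariate, response)` pair layout, and the `set_size` and `cycles` parameters below are hypothetical.

```python
import random


def ranked_set_sample(population, set_size, cycles, rng):
    """Draw a balanced ranked set sample of multivariate responses.

    population: list of (covariate, response) pairs, where response is a tuple.
    In each cycle, for every rank r, a fresh set of `set_size` units is drawn,
    ordered by the covariate only, and the response of the r-th unit is kept.
    """
    sample = []
    for _ in range(cycles):
        for rank in range(set_size):
            candidates = rng.sample(population, set_size)
            candidates.sort(key=lambda unit: unit[0])  # rank on the covariate
            sample.append(candidates[rank][1])
    return sample


def multivariate_mean(responses):
    """Component-wise mean of a list of equal-length response tuples."""
    n = len(responses)
    return tuple(sum(r[k] for r in responses) / n for k in range(len(responses[0])))
```

For example, `ranked_set_sample(population, 3, 4, random.Random(1))` yields 12 measured responses while inspecting 36 units, which is where the efficiency gain comes from when ranking on the covariate is cheap relative to measuring the response.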