• Title/Summary/Keyword: Statistic


A Jarque-Bera type test for multivariate normality based on second-power skewness and kurtosis

  • Kim, Namhyun
    • Communications for Statistical Applications and Methods
    • /
    • v.28 no.5
    • /
    • pp.463-475
    • /
    • 2021
  • Desgagné and de Micheaux (2018) proposed an alternative to the Jarque-Bera test for univariate normality. Their statistic is based on the sample second-power skewness and kurtosis, whereas the Jarque-Bera statistic uses the sample Pearson skewness and kurtosis, that is, the third and fourth standardized sample moments, respectively. In this paper, we generalize their statistic to a multivariate version based on orthogonalization or an empirical standardization of the data. The proposed multivariate statistic approximately follows a chi-squared distribution. A simulation study shows that the proposed statistic controls the type I error well even for very small sample sizes when critical values from the approximate distribution are used. Its power is comparable to that of the multivariate version of the Jarque-Bera test built on exactly the same orthogonalization idea, and it is much higher for some mixed normal alternatives.
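For orientation, the classical univariate Jarque-Bera statistic mentioned in the abstract can be sketched as follows. This is only the standard Pearson-moment version, JB = n/6 · (S² + (K − 3)²/4), with a chi-squared (2 degrees of freedom) approximation; it is not the second-power statistic of Desgagné and de Micheaux, nor the multivariate statistic proposed in the paper. The function name `jarque_bera` is chosen here for illustration.

```python
import math

def jarque_bera(x):
    """Classical univariate Jarque-Bera statistic (illustrative sketch).

    Uses the third and fourth standardized sample moments (Pearson
    skewness and kurtosis). Assumes x is a non-constant sequence.
    """
    n = len(x)
    mean = sum(x) / n
    # Central sample moments with divisor n.
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Survival function of a chi-squared variable with 2 degrees of
    # freedom is exp(-x / 2); this is the usual asymptotic p-value.
    p_value = math.exp(-jb / 2.0)
    return jb, skew, kurt, p_value
```

The asymptotic p-value is known to be unreliable for small samples, which is the situation the abstract's simulation study addresses for the proposed statistic.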

Goodness-of-fit tests for a proportional odds model

  • Lee, Hyun Yung
    • Journal of the Korean Data and Information Science Society
    • /
    • v.24 no.6
    • /
    • pp.1465-1475
    • /
    • 2013
  • The chi-square type test statistic is the most commonly used tool for assessing goodness-of-fit in multinomial logistic regression models, where the data are either grouped (binomial) or ungrouped (binary) according to covariate patterns. The chi-square type statistic is not a satisfactory gauge, however, because the ungrouped Pearson chi-square statistic does not follow the chi-square distribution well and is not a satisfactory measure in itself. Currently, goodness-of-fit in the ordinal setting is often assessed using the Pearson chi-square statistic and deviance tests. These tests involve creating a contingency table in which the rows consist of all possible cross-classifications of the model covariates and the columns consist of the levels of the ordinal response. I examined goodness-of-fit tests for the proportional odds logistic regression model, the most commonly used regression model for an ordinal response variable. Using a simulation study, I investigated the distribution and power properties of this test and compared them with those of three other goodness-of-fit tests. The new test had lower power than the existing tests; however, it was able to detect a greater number of the different types of lack of fit considered in this study. I illustrated the ability of the tests to detect lack of fit using a study of aftercare decisions for psychiatrically hospitalized adolescents.

A simple diagnostic statistic for determining the size of random forest (랜덤포레스트의 크기 결정을 위한 간편 진단통계량)

  • Park, Cheolyong
    • Journal of the Korean Data and Information Science Society
    • /
    • v.27 no.4
    • /
    • pp.855-863
    • /
    • 2016
  • In this study, a simple diagnostic statistic for determining the size of a random forest is proposed. The method is based on MV (margin of victory), a scaled difference in the votes at the infinite forest between the first and second most popular categories of the current random forest. Note that if MV is negative, there is a discrepancy between the current and infinite forests. More precisely, our method is based on the proportion of cases for which -MV is greater than a fixed small positive number (say, 0.03). We derive an appropriate diagnostic statistic for our method and then calculate the distribution of the statistic. A simulation study is performed to compare our method with a recently proposed diagnostic statistic.

Power analysis for 3 ${\times}$ 3 Latin square design (3 ${\times}$ 3 라틴방격모형의 검정력 분석)

  • Choi, Young-Hun
    • Journal of the Korean Data and Information Science Society
    • /
    • v.20 no.2
    • /
    • pp.401-410
    • /
    • 2009
  • Because the 3 ${\times}$ 3 Latin square design is composed of two block effects and one main effect, the power of the rank transformed statistic for testing the main effect is far superior to that of the parametric statistic regardless of the type of population distribution. The rank transformed statistic shows relatively high power compared with the parametric statistic in the following order: when all three effects are fixed, when only one block effect is random, and when two block effects are random. Furthermore, when the main effect is large, one block effect is of equal size and the other block effect is small, the power of the rank transformed statistic for testing the main effect shows a clear advantage over that of the parametric statistic.
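The rank transform approach compared in this abstract can be sketched roughly as follows: replace all observations by their ranks in the pooled sample, then compute the usual F statistic for the main (treatment) effect of a Latin square on those ranks. This is a minimal sketch of the general rank transform idea, not the paper's simulation code; the function names, the 0-based `layout` encoding of treatments, and the example data are assumptions made for illustration.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def latin_square_f(y, layout):
    """F statistic for the treatment effect in a t x t Latin square.

    y[i][j] is the response in row block i and column block j;
    layout[i][j] is the treatment index (0..t-1) applied in that cell.
    """
    t = len(y)
    grand = sum(sum(row) for row in y) / (t * t)
    row_means = [sum(y[i]) / t for i in range(t)]
    col_means = [sum(y[i][j] for i in range(t)) / t for j in range(t)]
    trt_sums = [0.0] * t
    for i in range(t):
        for j in range(t):
            trt_sums[layout[i][j]] += y[i][j]
    trt_means = [s / t for s in trt_sums]
    ss_total = sum((y[i][j] - grand) ** 2 for i in range(t) for j in range(t))
    ss_row = t * sum((m - grand) ** 2 for m in row_means)
    ss_col = t * sum((m - grand) ** 2 for m in col_means)
    ss_trt = t * sum((m - grand) ** 2 for m in trt_means)
    ss_err = ss_total - ss_row - ss_col - ss_trt
    df_trt = t - 1
    df_err = (t - 1) * (t - 2)
    return (ss_trt / df_trt) / (ss_err / df_err)

def rank_transform_f(y, layout):
    """Apply the Latin square F statistic to the pooled ranks of y."""
    t = len(y)
    flat = [v for row in y for v in row]
    r = average_ranks(flat)
    ranked = [r[i * t:(i + 1) * t] for i in range(t)]
    return latin_square_f(ranked, layout)

# Hypothetical 3 x 3 example: cyclic layout and made-up responses.
layout = [[0, 1, 2], [1, 2, 0], [2, 0, 1]]
y = [[1.2, 3.4, 5.9], [2.8, 6.1, 0.7], [7.3, 1.9, 4.0]]
f_rank = rank_transform_f(y, layout)
```

Because the statistic depends on the data only through ranks, it is unchanged by any strictly increasing transformation of the responses, which is one reason it can behave well under heavy-tailed error distributions.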


A review on the development of a scan statistic and its applications (스캔 통계량의 발전 과정과 응용에 대한 고찰)

  • 김병수;김기한
    • The Korean Journal of Applied Statistics
    • /
    • v.6 no.1
    • /
    • pp.125-143
    • /
    • 1993
  • The primary objective of the paper is to review the development of approximations of the null distribution of a scan statistic and to show how these approximations were improved. Let $X_1, \cdots, X_N$ be a sequence of independent uniform random variables on an interval (0, T]. A scan statistic is defined to be the maximum number of observations in a subinterval of length t $\leq$ T, when we continuously (or discretely) move the subinterval from 0 to T. A scan statistic is used to test whether certain events occur in a cluster against a null hypothesis of uniformity. It is difficult to calculate the exact null distribution of a scan statistic. Several authors have suggested approximations of the null distribution of a scan statistic since Naus (1966). We suggest that a scan statistic can be used for detecting a "hot region", defined to be a region in which the frequencies of mutations are relatively high. A "hot region" may be regarded as a generalized version of a hot spot. We leave the concrete formulation of detecting a "hot region" in a mutational spectrum for further study.
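The continuous scan statistic defined in this abstract can be computed with a sliding window over the sorted observations, since the maximum count is always attained by a window that starts at one of the points. The sketch below also estimates the null distribution by plain Monte Carlo simulation under uniformity, which is a stand-in for, not one of, the analytic approximations the paper reviews. The function names, the closed-window convention, and the parameter names `window` (for t) and `horizon` (for T) are assumptions made for illustration.

```python
import random

def scan_statistic(points, window):
    """Maximum number of points in any closed subinterval of length `window`."""
    xs = sorted(points)
    best = 0
    left = 0
    for right in range(len(xs)):
        # Shrink from the left until xs[left..right] fits in one window.
        while xs[right] - xs[left] > window:
            left += 1
        best = max(best, right - left + 1)
    return best

def monte_carlo_p_value(observed, n, horizon, window, reps=2000, seed=0):
    """Estimate P(scan statistic >= observed) for n uniform points on (0, horizon]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.uniform(0.0, horizon) for _ in range(n)]
        if scan_statistic(sample, window) >= observed:
            hits += 1
    return hits / reps
```

A small estimated p-value for the observed scan statistic would suggest clustering relative to the uniform null hypothesis; the number of replications controls the Monte Carlo error of the estimate.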


Power analysis for $2{\times}2$ factorial in randomized complete block design (블럭이 존재하는 $2{\times}2$ 요인모형의 검정력 분석)

  • Choi, Young-Hun
    • Journal of the Korean Data and Information Science Society
    • /
    • v.22 no.2
    • /
    • pp.245-253
    • /
    • 2011
  • For a $2{\times}2$ factorial design in a randomized complete block design (RCBD), the power of the rank transformed statistic for testing main effects and interaction effects is far superior to that of the parametric statistic regardless of the block size, the composition of the effects, and the type of population distribution (exponential, double exponential, normal or uniform). A $2{\times}2$ factorial design in an RCBD increases the error effects and decreases the power of the parametric statistic, which makes it conservative. The rank transformed statistic, however, retains its relative advantage in power. In general, the rank transformed statistic shows a relative power advantage over the parametric statistic when the block size is small and the effect size is large.

Rank transformation analysis for 4 $\times$ 4 balanced incomplete block design (4 $\times$ 4 균형불완전블럭모형의 순위변환분석)

  • Choi, Young-Hun
    • Journal of the Korean Data and Information Science Society
    • /
    • v.21 no.2
    • /
    • pp.231-240
    • /
    • 2010
  • If only fixed effects exist in a 4 $\times$ 4 balanced incomplete block design, the power of the FR statistic for testing a main effect reaches its highest level with only a few replications. Under the exponential and double exponential distributions, the FR statistic shows considerably higher power than the F statistic. Furthermore, in a traditional balanced incomplete block design with a fixed main effect and a random block effect, the FR statistic has superior power in all situations, regardless of the size of the main effect and of the parameter size and type of population distribution of the block effect. The power of the FR statistic increases rapidly as the number of replications increases. The overall power advantage of the FR statistic for testing a main effect stems from the unique characteristic of a balanced incomplete block design, which has one main effect and one block effect with missing observations and responds sensitively to small increases in the main effect and the sample size.

Power study for 4 × 4 graeco-latin square design (4 × 4 그레코라틴방격모형의 검정력 연구)

  • Choi, Young-Hun
    • Journal of the Korean Data and Information Science Society
    • /
    • v.23 no.4
    • /
    • pp.683-691
    • /
    • 2012
  • In a $4{\times}4$ Graeco-Latin square design, the power of the rank transformed statistic for testing the main effect is superior to that of the parametric statistic regardless of the effect structure, with equally or unequally spaced effect levels, and of the type of population distribution (exponential, double exponential, normal or uniform). As the number of block effects or the effect sizes decrease, the power of the rank transformed statistic becomes much higher than that of the parametric statistic. When the block effects are smaller than the main effect, or when one block effect is larger than the other block effects, the power of the rank transformed statistic is much higher than that of the parametric statistic in a $4{\times}4$ Graeco-Latin square design with three block effects and one main effect.

Power study for 2 × 2 factorial design in 4 × 4 latin square design (4 × 4 라틴방격모형 내 2 × 2 요인모형의 검정력 연구)

  • Choi, Young Hun
    • Journal of the Korean Data and Information Science Society
    • /
    • v.25 no.6
    • /
    • pp.1195-1205
    • /
    • 2014
  • Compared with a single design, the power of the rank transformed statistic for testing main and interaction effects for a $2{\times}2$ factorial in a $4{\times}4$ Latin square design increases rapidly as the effect size and the number of replications increase. In general, the rank transformed statistic has superior power regardless of the composition of the effects and the type of error distribution when the non-tested factors are few and the effect sizes are small. The rank transformed statistic shows much higher power than the parametric statistic under exponential and double exponential distributions, whereas its power is very similar to that of the parametric statistic under normal and uniform distributions.

Power comparison for 3×3 split plot factorial design (3×3 분할요인모형의 검정력 비교연구)

  • Choi, Young Hun
    • Journal of the Korean Data and Information Science Society
    • /
    • v.28 no.1
    • /
    • pp.143-152
    • /
    • 2017
  • A restriction on complete randomization within a block can be handled by a split plot factorial design, in which each block is split into several plots. For a $3{\times}3$ split plot factorial design with two fixed main factors and one fixed block, the power of the rank transformed statistic for testing the whole plot factorial effect and the split plot factorial effect is superior to that of the parametric statistic when the existing effect size is small or when the remaining effect size is relatively smaller than the effect size being tested. The rank transformed statistic shows relatively high power for exponential and double exponential distributions, whereas the parametric and rank transformed statistics have similar power for normal and uniform distributions. With two fixed main factors and one random block, the power of both the parametric and the rank transformed statistics is lower than with all factors fixed. In that setting, the power of both statistics for testing the split plot factorial effect is slightly lower than for testing the whole plot factorial effect, but the rank transformed statistic retains a comparative advantage over the parametric statistic.