• Title/Summary/Keyword: Permutation Test


Estimation of Gini-Simpson index for SNP data

  • Kang, Joonsung
    • Journal of the Korean Data and Information Science Society / v.28 no.6 / pp.1557-1564 / 2017
  • We consider high-dimensional, low-sample-size (HDLSS) genomic sequences whose response categories are unordered. When constructing an appropriate test statistic for this model, the classical multivariate analysis of variance (MANOVA) approach may not be useful owing to the very large number of parameters and the very small sample size. For these reasons, we present a pseudo-marginal model based on the Gini-Simpson index estimated via a Bayesian approach. In view of the small sample size, we consider the permutation distribution generated by all n! possible (equally likely) permutations of the pooled sample observations across G groups (of sizes $n_1, \ldots, n_G$). We simulate data and apply the false discovery rate (FDR) and positive false discovery rate (pFDR) procedures with the proposed test statistics. We also analyze real SARS data and compute the FDR and pFDR. The FDR and pFDR procedures, together with the associated test statistics for each gene, control the FDR and pFDR, respectively, at any level $\alpha$ for the set of p-values by using exact conditional permutation theory.
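The two ingredients named in this abstract, the Gini-Simpson index and a permutation p-value, can be sketched as follows. This is only an illustration under stated assumptions: it uses the plug-in estimate of the index (not the Bayesian estimate of the paper), two groups instead of G, and Monte Carlo sampling of permutations instead of full enumeration of all n! orderings. The function names are hypothetical.

```python
import random
from collections import Counter


def gini_simpson(categories):
    """Plug-in Gini-Simpson index: 1 minus the sum of squared category proportions."""
    n = len(categories)
    counts = Counter(categories)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def permutation_pvalue(group_a, group_b, n_perm=2000, seed=0):
    """Monte Carlo permutation p-value for |GS(a) - GS(b)| (assumed statistic)."""
    rng = random.Random(seed)
    observed = abs(gini_simpson(group_a) - gini_simpson(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        # Reassign the pooled observations to the two groups at random.
        rng.shuffle(pooled)
        stat = abs(gini_simpson(pooled[:n_a]) - gini_simpson(pooled[n_a:]))
        if stat >= observed - 1e-12:  # small tolerance for floating-point ties
            hits += 1
    # Count the observed arrangement itself so the p-value is never zero.
    return (hits + 1) / (n_perm + 1)
```

For example, `gini_simpson(["A", "A", "C", "C"])` is 0.5 because both alleles have proportion 1/2.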

Tests of equality of several variances with the likelihood ratio principle

  • Park, Hyo-Il
    • Communications for Statistical Applications and Methods / v.25 no.4 / pp.329-339 / 2018
  • In this study, we propose tests for the equality of several variances under the normality assumption. First, we propose a likelihood ratio test that applies the permutation principle. Then, using the p-values of the pairwise tests between variances together with combination functions, we propose combination tests, and we apply the permutation principle to obtain the overall p-values. To complete the discussion, we also review well-known test statistics and modify one of them using the p-values. We illustrate the proposed tests with numerical and simulated data and compare their efficiency with that of the reviewed tests through a simulation study based on empirical p-values. Finally, we discuss some interesting features of resampling methods and of tests for equality among several variances.

Combining Independent Permutation p-Values Associated with Multi-Sample Location Test Data

  • Um, Yonghwan
    • Journal of the Korea Society of Computer and Information / v.25 no.7 / pp.175-182 / 2020
  • Fisher's classical method for combining independent p-values from continuous distributions is widely used, but it is known to be inadequate for combining p-values from discrete probability distributions. The discrete analog of Fisher's classical method is used instead for such p-values. In this paper, we first obtain p-values from discrete probability distributions associated with multi-sample location test data (Fisher-Pitman test and Kruskal-Wallis test data) by the permutation method, and then combine the permutation p-values using the discrete analog of Fisher's classical method. Finally, we compare the combined p-values obtained from the discrete analog with those obtained from Fisher's classical method.
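For orientation, Fisher's classical method can be sketched in a few lines. It refers the statistic X = -2 Σ ln p_i to a chi-square distribution with 2k degrees of freedom, whose upper tail has a closed form when the degrees of freedom are even. The function name is hypothetical, and the discrete analog studied in the paper is not implemented here.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Fisher's classical combined p-value for k independent p-values in (0, 1]."""
    k = len(pvalues)
    # half = X / 2, where X = -2 * sum(ln p_i) is Fisher's statistic.
    half = -sum(math.log(p) for p in pvalues)
    # Upper tail of chi-square with 2k d.f.: exp(-X/2) * sum_{j<k} (X/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return math.exp(-half) * total
```

With two p-values the formula reduces to p1·p2·(1 − ln(p1·p2)), which gives a quick hand check of the output.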

Non-parametric approach for the grouped dissimilarities using the multidimensional scaling and analysis of distance (다차원척도법과 거리분석을 활용한 그룹화된 비유사성에 대한 비모수적 접근법)

  • Nam, Seungchan;Choi, Yong-Seok
    • The Korean Journal of Applied Statistics / v.30 no.4 / pp.567-578 / 2017
  • Grouped multivariate data can be tested for differences between two or more groups using multivariate analysis of variance (MANOVA). However, this method cannot be used if several assumptions of MANOVA are violated. In this case, multidimensional scaling (MDS) and analysis of distance (AOD) can be applied to grouped dissimilarities based on various distances. A permutation test is a non-parametric method that can also be used to test differences between groups. MDS is used to calculate the coordinates of observations from dissimilarities, and AOD is useful for finding group structure from those coordinates. In particular, AOD is mathematically related to MANOVA when the Euclidean distance is used to compute the dissimilarities. In this paper, we study the between-group and within-group structure by applying MDS and AOD to the grouped dissimilarities. In addition, we propose a new test statistic for the permutation test that uses the group structure. Finally, we investigate the relationship between AOD and MANOVA for dissimilarities based on the Euclidean distance.

Major DNA Marker Mining of Hanwoo Chromosome 6 by Bootstrap Method

  • Lee, Jea-Young;Lee, Yong-Won
    • Communications for Statistical Applications and Methods / v.11 no.3 / pp.657-668 / 2004
  • A permutation test was applied to the quantitative trait loci (QTL) analysis, and a major locus was selected. K-means clustering analysis for mining the major DNA markers of the ILSTS035 microsatellite loci on Hanwoo chromosome 6 is described. Finally, a bootstrap testing method was used to calculate confidence intervals and to find the major DNA markers.

Test procedures for the mean and variance simultaneously under normality

  • Park, Hyo-Il
    • Communications for Statistical Applications and Methods / v.23 no.6 / pp.563-574 / 2016
  • In this study, we propose several simultaneous tests to detect differences in means and variances for the two-sample problem when the underlying distribution is normal. To this end, we apply the likelihood ratio principle and propose a likelihood ratio test. We then consider a union-intersection test after identifying the likelihood ratio statistic as a product of two individual likelihood ratio statistics, each testing an individual sub-null hypothesis. Noting that the union-intersection test can be regarded as a simultaneous test with a combination function, we also propose simultaneous tests that use combination functions to combine the individual tests for each sub-null hypothesis. We apply the permutation principle to obtain the null distributions. We then provide an example to illustrate the proposed procedure and compare the efficiency of the proposed tests through a simulation study. We discuss some interesting features of the simultaneous tests as concluding remarks. Finally, we show that the likelihood ratio statistic can be expressed as a product of two individual likelihood ratio statistics.

Permutation test for a post selection inference of the FLSA (순열검정을 이용한 FLSA의 사후추론)

  • Choi, Jieun;Son, Won
    • The Korean Journal of Applied Statistics / v.34 no.6 / pp.863-874 / 2021
  • In this paper, we propose a post-selection inference procedure for the fused lasso signal approximator (FLSA). The FLSA finds an underlying sparse, piecewise constant mean structure by applying the total variation (TV) semi-norm as a penalty term. However, it is widely known that this convex relaxation can cause asymptotic inconsistency in change-point detection. As a result, false change points can remain even when we try to find the best subset of change points via a tuning procedure. To remove these false change points, we propose a post-selection inference for the FLSA. The proposed procedure applies a permutation test based on the CUSUM statistic. Our procedure extends the permutation test of Antoch and Hušková (2001), which deals with single change-point problems, to multiple change-point detection problems in combination with the FLSA. Numerical results show that the proposed procedure performs better than naïve z-tests and tests based on the limiting distribution of CUSUM statistics.
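The single change-point building block mentioned here, a permutation test on a CUSUM statistic, can be sketched as below. This is a simplified illustration, not the procedure of the paper: the statistic is an unnormalized maximum of absolute centered partial sums, the FLSA step is omitted, and the function names are hypothetical.

```python
import random


def cusum_stat(x):
    """Maximum absolute centered partial sum over k = 1, ..., n-1."""
    n = len(x)
    mean = sum(x) / n
    running, best = 0.0, 0.0
    for value in x[:-1]:
        running += value - mean
        best = max(best, abs(running))
    return best


def cusum_permutation_pvalue(x, n_perm=2000, seed=0):
    """Monte Carlo permutation p-value for 'no change in mean'."""
    rng = random.Random(seed)
    observed = cusum_stat(x)
    shuffled = list(x)
    hits = 0
    for _ in range(n_perm):
        # Under the null, the order of the observations carries no information.
        rng.shuffle(shuffled)
        if cusum_stat(shuffled) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

A series with a clear level shift, such as ten zeros followed by ten fives, attains the largest possible CUSUM value among its rearrangements, so its permutation p-value is small.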

Determination of Significance Threshold for Detecting QTL in Pigs (돼지의 QTL 검색을 위한 유의적 임계수준(Threshold) 결정)

  • Lee, H.K.;Jeon, G.J.
    • Journal of Animal Science and Technology / v.44 no.1 / pp.31-38 / 2002
  • Interval mapping using microsatellite markers was employed to detect quantitative trait loci (QTL) in an experimental cross between Berkshire and Yorkshire pigs. To derive critical values (CV) of the test statistics for declaring QTL significance, the permutation test (PT) of Churchill and Doerge (1994) and the analytical method (LK) of Lander and Kruglyak (1995) were applied to each trait and chromosome. Phenotypes of 525 $F_2$ progeny for five traits (carcass weight, loin eye area, marbling score, cholesterol content, last back fat thickness) and genotypes of 125 markers covering the genome were used. Data were analyzed by line-cross regression interval mapping with an F-test at every 1 cM. The PT CV were based on 10,000 permutations. The genome-wise CV was 10.5 for LK and ranged from 8.1 to 8.3 for PT, depending on the trait. The CV differed substantially between the methods, which led to different numbers of QTL being detected. At the same percentage level, PT gave the less stringent CV.
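A minimal sketch of a permutation threshold in the spirit of the PT described above: trait values are shuffled against fixed genotypes, the largest statistic over all markers is recorded for each shuffle, and an upper quantile of those maxima serves as the genome-wise critical value. The squared correlation in `marker_stat`, the function names, and the default values are assumptions for illustration; the study itself used an F-test from regression interval mapping with 10,000 permutations.

```python
import random


def marker_stat(genotypes, phenotypes):
    """Squared Pearson correlation between one marker and the trait (stand-in statistic)."""
    n = len(phenotypes)
    mg = sum(genotypes) / n
    mp = sum(phenotypes) / n
    sgg = sum((g - mg) ** 2 for g in genotypes)
    spp = sum((p - mp) ** 2 for p in phenotypes)
    if sgg == 0 or spp == 0:
        return 0.0
    sgp = sum((g - mg) * (p - mp) for g, p in zip(genotypes, phenotypes))
    return sgp * sgp / (sgg * spp)


def genomewise_threshold(markers, phenotypes, alpha=0.05, n_perm=1000, seed=0):
    """Upper-alpha critical value of the maximum statistic under permutation."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        # Shuffling phenotypes breaks any marker-trait association
        # while keeping the correlation structure among markers.
        rng.shuffle(shuffled)
        maxima.append(max(marker_stat(m, shuffled) for m in markers))
    maxima.sort()
    # Order statistic at (or just above) the empirical (1 - alpha) quantile.
    index = min(n_perm - 1, int((1 - alpha) * n_perm))
    return maxima[index]
```

Because the maximum is taken over all markers within each shuffle, the resulting threshold accounts for the multiple testing across the genome without an analytical correction.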

Independence tests using coin package in R (coin 패키지를 이용한 독립성 검정)

  • Kim, Jinheum;Lee, Jung-Dong
    • Journal of the Korean Data and Information Science Society / v.25 no.5 / pp.1039-1055 / 2014
  • The distribution of a test statistic under a null hypothesis depends on the unknown distribution of the data and is therefore unknown as well. Conditional tests replace the unknown null distribution with the conditional null distribution, that is, the distribution of the test statistic given the observed data. This approach, known as the permutation test, was developed by Fisher (1935). A theoretical framework for permutation tests was given by Strasser and Weber (1999). The coin package developed by Hothorn et al. (2006, 2008) implements a unified approach to conditional inference via the generic independence test. Because convenient functions are available for the most prominent problems, users do not have to call the extremely flexible generic procedure directly. In this article, we briefly review the underlying theory of Strasser and Weber (1999) and explain how to transform the data in order to apply the generic independence test function. Finally, we illustrate the approach with a few real data sets.

Fast Combinatorial Programs Generating Total Data (전수데이터를 생성하는 빠른 콤비나토리얼 프로그램)

  • Jang, Jae-Soo;Won, Shin-Jae;Cheon, Hong-Sik;Suh, Chang-Jin
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.3 / pp.1451-1458 / 2013
  • This paper deals with programs and algorithms that generate the full data set satisfying the basic combinatorial requirements of combination, permutation, and partial permutation (r-permutation for short), which are used for total data testing or as simulation input. We searched for programs that produce permutations, combinations, and r-permutations and selected the fastest program in each category. With further study, we developed new programs that reduce the processing time. Our research involved the following preliminary work. First, hundreds of algorithms and programs on the internet were collected and corrected so that they could be executed. Second, we measured the running time of all completed programs and selected a few fast ones. Third, the fast programs were analyzed in depth, and their pseudo-code is provided. We succeeded in developing two programs that run faster. The combination program saves running time by removing the recursive function, and the r-permutation program becomes faster by combining the best combination program with the best permutation program. According to our performance test, the former and the latter improve the running speed by 22% to 34% and 62% to 226%, respectively, compared with the fastest collected program. Based on the pseudo-code, the programs suggested in this study can easily be applied to particular cases; they can also be used to predict the execution time spent on data processing, to determine the validity of the processing, and to generate total data with minimal programming.
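The two ideas highlighted in this abstract, a combination generator without recursion and an r-permutation generator built from a combination step followed by a permutation step, can be sketched as follows. This is not the authors' program: the index-array scheme and the function names are assumptions, and the permutation step simply relies on the standard library.

```python
from itertools import permutations


def iter_combinations(items, r):
    """Yield all r-combinations in lexicographic index order, without recursion."""
    pool = tuple(items)
    n = len(pool)
    if r < 0 or r > n:
        return
    idx = list(range(r))
    yield tuple(pool[i] for i in idx)
    while True:
        # Find the rightmost index that has not reached its maximum value.
        for i in reversed(range(r)):
            if idx[i] != i + n - r:
                break
        else:
            return  # every index is at its maximum: enumeration is complete
        idx[i] += 1
        # Reset all indices to the right to the smallest valid values.
        for j in range(i + 1, r):
            idx[j] = idx[j - 1] + 1
        yield tuple(pool[i] for i in idx)


def r_permutations(items, r):
    """Yield all r-permutations: choose r items, then order them in every way."""
    for combo in iter_combinations(items, r):
        yield from permutations(combo)
```

For four items and r = 2 this yields 6 combinations and 6 × 2! = 12 r-permutations, matching n!/(n − r)!.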