• Title/Summary/Keyword: Wald test (왈드 검정)


Detecting survival related gene sets in microarray analysis (마이크로어레이 자료에서 생존과 유의한 관련이 있는 유전자집단 검색)

  • Lee, Sun-Ho; Lee, Kwang-Hyun
    • Journal of the Korean Data and Information Science Society, v.23 no.1, pp.1-11, 2012
  • When microarray experiments were first developed, the main interest was limited to detecting differentially expressed genes associated with a phenotype of interest. However, because human diseases are thought to arise through the interactions of multiple genes within the same functional category, the unit of analysis in microarray experiments has expanded to sets of genes. For the phenotype of censored survival time, Gene Set Enrichment Analysis (GSEA), the Global test, and the Wald-type test are widely used. In this paper, we modify the Wald-type test by adopting a normal score transformation of gene expression values and develop a parametric test that requires much less computation than the others. The proposed method is compared with the other methods using a real ovarian cancer data set and a simulated data set.
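The abstract does not say which normal score transformation is used, so the following is only a minimal sketch of one common rank-based variant: each value is replaced by the standard normal quantile of (rank - 0.5) / n, with tied values sharing their average rank. The function name `normal_scores` and the 0.5 offset are assumptions made for illustration, not the authors' definition.

```python
from statistics import NormalDist


def normal_scores(values):
    """Rank-based normal scores: Phi^{-1}((rank - 0.5) / n).

    Assumption: ties receive the average of their 1-based ranks.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((r - 0.5) / n) for r in ranks]
```

Applied to the expression values of one gene across samples, this yields scores that are approximately standard normal whatever the original distribution, which is the usual motivation for using such a transformation ahead of a parametric test.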

Effect of complex sample design on Pearson test statistic for homogeneity (복합표본자료에서 동질성검정을 위한 피어슨 검정통계량의 효과)

  • Heo, Sun-Yeong; Chung, Young-Ae
    • Journal of the Korean Data and Information Science Society, v.23 no.4, pp.757-764, 2012
  • This research compares test statistics for homogeneity when the data are collected under a complex sample design. Survey data from a complex sample design do not satisfy the independence condition required by the standard Pearson multinomial-based chi-squared test. Today, many data sets are collected using complex sample designs, yet tests for categorical data are still conducted with the standard Pearson chi-squared test. In this study, we compared the performance of three test statistics for homogeneity between two populations, using data from the 2009 customer satisfaction evaluation survey on the services of the Gyeongsangnam-do regional offices of education: the standard Pearson test, the unbiased Wald test, and the Pearson-type test with survey-based point estimates. Through empirical analyses, we first showed that the standard Pearson test greatly inflates the value of the test statistic, so its results are not reliable. Second, in comparing the Wald test and the Pearson-type test, we found that the test results are affected by the number of categories and by the mean and standard deviation of the eigenvalues of the design matrix.
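For reference, the standard Pearson statistic that this abstract takes as its baseline can be sketched as follows. It treats every observation as an independent draw, which is the assumption the abstract says complex survey data violate. The function name `pearson_homogeneity` and the counts in the usage note are hypothetical, not taken from the paper.

```python
def pearson_homogeneity(table):
    """Standard Pearson chi-squared statistic for an r x c table of counts.

    Rows are populations and columns are response categories.
    Returns (statistic, degrees of freedom). Assumes independent
    observations, i.e. no clustering, stratification, or weights.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if all populations share the same proportions.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df
```

For example, `pearson_homogeneity([[10, 20], [20, 10]])` has an expected count of 15 in every cell and one degree of freedom. Under a clustered design the effective sample size is typically smaller than the raw counts, which is one reason this statistic can be too large.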

Error cause analysis of Pearson test statistics for k-population homogeneity test (k-모집단 동질성검정에서 피어슨검정의 오차성분 분석에 관한 연구)

  • Heo, Sunyeong
    • Journal of the Korean Data and Information Science Society, v.24 no.4, pp.815-824, 2013
  • The traditional Pearson chi-squared test is not appropriate for data collected under a complex sample design. When the traditional Pearson chi-squared test is applied to categorical data from a complex sample, it may give wrong test results, and the error may arise not only from biased variance estimators but also from biased point estimators of the cell proportions. In this study, a design-based consistent Wald test statistic was derived for the k-population homogeneity test, and the traditional Pearson chi-squared test statistic was partitioned into three parts according to the cause of error: the error due to the bias of the variance estimator, the error due to the bias of the cell proportion estimator, and the unseparated error due to both biases together. The relative size of each error component with respect to the Pearson chi-squared test statistic was then analyzed empirically, using the second-year data from the fourth Korean National Health and Nutrition Examination Survey (KNHANES IV-2). The empirical results show that the error from the bias of the variance estimator was relatively larger than the error from the bias of the cell proportion estimator, but the degree differed from variable to variable.
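The k-population statistic derived in the paper is not reproduced in the abstract. As a rough illustration of the variance-bias effect it describes, the sketch below covers only the simplest special case: two populations and one binary category, where a Wald statistic reduces to the squared difference in proportions divided by the sum of the variance estimates. The function names, the proportions, and the design effect of 2 are all hypothetical.

```python
def srs_variance(p, n):
    """Variance of a sample proportion under simple random sampling."""
    return p * (1.0 - p) / n


def wald_two_population(p1, v1, p2, v2):
    """Wald statistic for H0: p1 == p2, given variance estimates v1 and v2.

    Simplified case: two independent populations, one binary category.
    Under H0 it is approximately chi-squared with 1 degree of freedom.
    """
    return (p1 - p2) ** 2 / (v1 + v2)


# Hypothetical estimates: proportions 0.5 and 0.4, 100 respondents each.
p1, p2, n = 0.5, 0.4, 100
w_srs = wald_two_population(p1, srs_variance(p1, n), p2, srs_variance(p2, n))

# Assumed design effect of 2: the design-based variance is twice the
# simple-random-sampling variance, so the statistic is halved.
deff = 2.0
w_design = wald_two_population(
    p1, deff * srs_variance(p1, n), p2, deff * srs_variance(p2, n)
)
```

In this toy case, ignoring the design doubles the statistic, which mirrors the abstract's point that a biased variance estimator is a major source of error. The bias in the cell proportion estimates is a separate component that this sketch does not model.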