• Title/Summary/Keyword: homogeneity test. Pearson test


Test of Homogeneity Based on Complex Survey Data : Discussion Based on Power of Test

  • Heo, Sun-Yeong;Yi, Su-Cheol
    • Journal of the Korean Data and Information Science Society / v.16 no.3 / pp.609-620 / 2005
  • In secondary analysis of categorical data, situations often arise in which the estimated cell variances are available but the full variance-covariance matrix is not. In this case researchers are often inclined to use Pearson-type test statistics for homogeneity. However, for a complex sample the observed cell proportions do not follow a multinomial distribution, and the Pearson-type test statistic is generally not asymptotically chi-square distributed. This paper evaluates the power of the Wald test, the Pearson-type test, and the first-order corrected Pearson-type test for homogeneity. The resulting power curves indicate that as the misspecification effect increases, the inflation of the significance level and the loss of power of the Pearson-type test become more severe.


Effect of complex sample design on Pearson test statistic for homogeneity (복합표본자료에서 동질성검정을 위한 피어슨 검정통계량의 효과)

  • Heo, Sun-Yeong;Chung, Young-Ae
    • Journal of the Korean Data and Information Science Society / v.23 no.4 / pp.757-764 / 2012
  • This research compares test statistics for homogeneity when the data are collected under a complex sample design. Survey data from a complex sample design do not satisfy the independence condition required for the standard Pearson multinomial-based chi-squared test. Today many data sets are collected by complex sample designs, yet tests for categorical data are still conducted with the standard Pearson chi-squared test. In this study, we compared the performance of three test statistics for homogeneity between two populations, using data from the 2009 customer satisfaction evaluation survey on the service of the Gyeongsangnam-do regional offices of education: the standard Pearson test, the unbiased Wald test, and the Pearson-type test with survey-based point estimates. Through empirical analyses, we first showed that the standard Pearson test greatly inflates the values of the test statistic and that its results are not reliable. Second, in the comparison of the Wald test and the Pearson-type test, we found that the test results are affected by the number of categories and by the mean and standard deviation of the eigenvalues of the design matrix.

Effect of Bias on the Pearson Chi-squared Test for Two Population Homogeneity Test

  • Heo, Sunyeong
    • Journal of Integrative Natural Science / v.5 no.4 / pp.241-245 / 2012
  • Categorical data collected under a complex sample design are not suitable for the standard Pearson multinomial-based chi-squared test because the observations are not independent and identically distributed. This study investigates how bias in the point estimator of a population proportion and in its variance estimator affects the standard Pearson chi-squared test statistic when the sample is collected under a complex sampling scheme. The effect is examined for the two-population homogeneity test. The standard Pearson test statistic can be partitioned into two parts: the first is a weighted sum of ${\chi}^2_1$ variables with the eigenvalues of the design matrix as weights, and the second is an additional term arising from the biases of the point estimator and its variance estimator. Our empirical analysis shows that even when the bias of the point estimator is small, the Pearson test statistic is greatly inflated because the variance of the point estimator is underestimated. Regarding the relation between the design-based variance estimator and its design matrix, the larger the average of the eigenvalues of the design matrix, the larger the share of the Pearson test statistic accounted for by the first component.
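
The first component named in the abstract above, a weighted sum of independent $\chi^2_1$ variables, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `rejection_rate`, the example eigenvalues, and the simulation settings are hypothetical and not taken from the paper, and the second (bias) component is ignored. The code estimates how often the weighted sum exceeds the nominal 5% chi-square critical value.

```python
import random

# Standard 5% critical values of the chi-square distribution, keyed by degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def rejection_rate(eigenvalues, n_sim=20000, seed=0):
    """Estimate P(sum_i d_i * Z_i^2 > nominal 5% critical value) by simulation.

    eigenvalues: hypothetical design-matrix eigenvalues d_i (assumed inputs).
    The degrees of freedom are taken to be len(eigenvalues).
    """
    rng = random.Random(seed)
    crit = CHI2_CRIT_05[len(eigenvalues)]
    rejections = 0
    for _ in range(n_sim):
        # Each Z_i^2 is a chi-square variable with one degree of freedom.
        stat = sum(d * rng.gauss(0.0, 1.0) ** 2 for d in eigenvalues)
        if stat > crit:
            rejections += 1
    return rejections / n_sim


if __name__ == "__main__":
    print(rejection_rate([1.0, 1.0, 1.0]))  # multinomial-like case, close to the nominal level
    print(rejection_rate([2.0, 2.0, 2.0]))  # design effects of 2, well above the nominal level
```

With all eigenvalues equal to 1 the simulated rate stays near the nominal 5%, while eigenvalues above 1 push it well above that level, which is the inflation the abstract describes.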

Tests for homogeneity of proportions in clustered binomial data

  • Jeong, Kwang Mo
    • Communications for Statistical Applications and Methods / v.23 no.5 / pp.433-444 / 2016
  • When we observe binary responses in a cluster (such as rat lab-subjects), they are usually correlated with each other. In clustered binomial counts, the independence assumption is violated and we encounter extra variation. In the presence of extra variation, the ordinary statistical analyses of binomial data are inappropriate. In testing the homogeneity of proportions between several treatment groups, the classical Pearson chi-squared test has a severe flaw in the control of Type I error rates. We focus on modifying the chi-squared statistic by incorporating variance inflation factors. We suggest a method to adjust the data by a dispersion estimate based on a quasi-likelihood model. We explain the testing procedure via an illustrative example and compare the performance of the modified chi-squared test with competing statistics through a Monte Carlo study.
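
One simple way to picture a dispersion-adjusted chi-squared test for clustered binomial data is sketched below. It is an assumption-laden illustration, not necessarily the estimator used in the paper: the function name `dispersion_adjusted_chi2` is hypothetical, a single common dispersion is estimated from the within-group cluster Pearson statistic divided by its degrees of freedom, and the ordinary Pearson homogeneity statistic is divided by that estimate.

```python
def dispersion_adjusted_chi2(groups):
    """groups: list of treatment groups, each a list of (successes, cluster_size) pairs.

    Returns (pearson, phi, adjusted), where adjusted = pearson / phi.
    The adjusted statistic would be referred to chi-square with len(groups) - 1 df.
    """
    totals = [(sum(x for x, _ in g), sum(n for _, n in g)) for g in groups]
    pooled = sum(x for x, _ in totals) / sum(n for _, n in totals)
    if not 0.0 < pooled < 1.0:
        raise ValueError("pooled proportion must lie strictly between 0 and 1")

    # Ordinary Pearson statistic for homogeneity of group proportions.
    pearson = sum(n * (x / n - pooled) ** 2 for x, n in totals) / (pooled * (1 - pooled))

    # Cluster-level Pearson statistic within groups, used to estimate dispersion.
    cluster_stat = 0.0
    for g, (x_g, n_g) in zip(groups, totals):
        p_g = x_g / n_g
        if not 0.0 < p_g < 1.0:
            raise ValueError("each group proportion must lie strictly between 0 and 1")
        cluster_stat += sum((x - n * p_g) ** 2 / (n * p_g * (1 - p_g)) for x, n in g)

    n_clusters = sum(len(g) for g in groups)
    phi = cluster_stat / (n_clusters - len(groups))
    phi = max(phi, 1.0)  # assumption: never deflate below the binomial variance
    return pearson, phi, pearson / phi


if __name__ == "__main__":
    data = [[(1, 4), (3, 4)], [(2, 4), (4, 4)]]  # made-up clusters for two groups
    print(dispersion_adjusted_chi2(data))
```

Flooring the dispersion at 1 is a design choice made here so that the adjustment can only make the test more conservative.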

Error cause analysis of Pearson test statistics for k-population homogeneity test (k-모집단 동질성검정에서 피어슨검정의 오차성분 분석에 관한 연구)

  • Heo, Sunyeong
    • Journal of the Korean Data and Information Science Society / v.24 no.4 / pp.815-824 / 2013
  • The traditional Pearson chi-squared test is not appropriate for data collected under a complex sample design. When the traditional Pearson chi-squared test is applied to complex-sample categorical data, it may give wrong test results, and the error may arise not only from biased variance estimators but also from biased point estimators of cell proportions. In this study, a design-based consistent Wald test statistic was derived for the k-population homogeneity test, and the traditional Pearson chi-squared test statistic was partitioned into three parts according to the cause of error: the error due to the bias of the variance estimator, the error due to the bias of the cell proportion estimator, and the inseparable error due to both biases jointly. The relative size of each error component within the Pearson chi-squared test statistic was analyzed empirically, using the second-year data from the fourth Korean National Health and Nutrition Examination Survey (KNHANES IV-2). The empirical results show that the error from the bias of the variance estimator was larger than the error from the bias of the cell proportion estimator, although the degree differed from variable to variable.

Empirical Analysis on Rao-Scott First Order Adjustment for Two Population Homogeneity test Based on Stratified Three-Stage Cluster Sampling with PPS

  • Heo, Sunyeong
    • Journal of Integrative Natural Science / v.7 no.3 / pp.208-213 / 2014
  • Nationwide and/or large-scale sample surveys generally use a complex sample design. The traditional Pearson chi-square test is not appropriate for categorical data from a complex sample. Rao and Scott suggested an adjustment to the Pearson chi-square test that uses the average of the eigenvalues of the design matrix of cell probabilities. This study compares the efficiency of the Rao-Scott first-order adjusted test with that of the Wald test for homogeneity between two populations, using the 2009 Gyeongnam regional education offices' customer satisfaction survey (2009 GREOCSS) data. The 2009 GREOCSS data were collected by stratified three-stage cluster sampling with probability proportional to size. The empirical results show that, for the 2009 GREOCSS data, the Rao-Scott adjusted test statistic, which uses only the variances of cell probabilities, is very close to the Wald test statistic, which uses the covariance matrix of cell probabilities. However, caution is needed when using the Rao-Scott first-order adjusted test statistic in place of the Wald test, because its efficiency decreases as the relative variance of the eigenvalues of the design matrix of cell probabilities increases, especially when the number of degrees of freedom is small.
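
The first-order adjustment described in the abstract above divides the Pearson statistic by the average eigenvalue of the design matrix. The sketch below shows that arithmetic and also reports the relative variance of the eigenvalues, the quantity the abstract flags as a warning sign. The function name `rao_scott_first_order` and the input values are hypothetical; in practice the eigenvalues (or their mean) would have to be estimated from the survey data.

```python
def rao_scott_first_order(pearson_stat, eigenvalues):
    """Return (adjusted_stat, mean_eigenvalue, relative_variance).

    pearson_stat: ordinary Pearson chi-square statistic (assumed already computed).
    eigenvalues:  hypothetical eigenvalues of the design matrix, one per degree of freedom.
    """
    k = len(eigenvalues)
    mean_eig = sum(eigenvalues) / k
    # Squared coefficient of variation of the eigenvalues.
    rel_var = sum((d - mean_eig) ** 2 for d in eigenvalues) / (k * mean_eig ** 2)
    return pearson_stat / mean_eig, mean_eig, rel_var


if __name__ == "__main__":
    adjusted, mean_eig, rel_var = rao_scott_first_order(12.0, [1.0, 2.0, 3.0])
    print(adjusted, mean_eig, rel_var)
```

The adjusted statistic is compared with the same chi-square reference distribution as the unadjusted one; a large relative variance suggests the first-order adjustment alone may be inadequate.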

Estimation of Design Rainfall Using 3 Parameter Probability Distributions (3변수 확률분포에 의한 설계강우량 추정)

  • Lee, Soon Hyuk;Maeng, Sung Jin;Ryoo, Kyong Sik
    • Proceedings of the Korea Water Resources Association Conference / 2004.05b / pp.595-598 / 2004
  • This research seeks to derive design rainfalls through the L-moment method, with tests of homogeneity, independence, and outliers applied to annual maximum daily rainfall data at 38 rainfall stations in Korea. To select the appropriate distribution of annual maximum daily rainfall for each station, the Generalized Extreme Value (GEV), Generalized Logistic (GLO), Generalized Pareto (GPA), Generalized Normal (GNO), and Pearson Type 3 (PT3) probability distributions were applied, and their suitability was judged using an L-moment ratio diagram and the Kolmogorov-Smirnov (K-S) test. Parameters of the appropriate distributions were estimated from the observed annual maximum daily rainfall and from data simulated using Monte Carlo techniques. Design rainfalls were finally derived from the GEV distribution, which proved more appropriate than the other distributions.


Frequency Analysis of Extreme Rainfall Using 3 Parameter Probability Distributions (3변수 확률분포형에 의한 극치강우의 빈도분석)

  • Kim, Byeong-Jun;Maeng, Sung-Jin;Ryoo, Kyong-Sik;Lee, Soon-Hyuk
    • Journal of The Korean Society of Agricultural Engineers / v.46 no.3 / pp.31-42 / 2004
  • This research seeks to derive design rainfalls through the L-moment method, with tests of homogeneity, independence, and outliers applied to annual maximum daily rainfall data at 38 rainfall stations in Korea. To select the appropriate distribution of annual maximum daily rainfall for each station, the Generalized Extreme Value (GEV), Generalized Logistic (GLO), Generalized Pareto (GPA), Generalized Normal (GNO), and Pearson Type 3 (PT3) probability distributions were applied, and their suitability was judged using an L-moment ratio diagram and the Kolmogorov-Smirnov (K-S) test. Parameters of the appropriate distributions were estimated from the observed annual maximum daily rainfall and from data simulated using Monte Carlo techniques. Design rainfalls were finally derived from the GEV distribution, which proved more appropriate than the other distributions.
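
The L-moment ratio diagram mentioned in the abstract above plots sample L-skewness against L-kurtosis. A minimal sketch of how those sample ratios can be computed from unbiased probability-weighted moments follows. The function name `sample_l_moments` and the example data are assumptions for illustration; the abstract does not give the authors' computational details.

```python
def sample_l_moments(data):
    """Return (l1, l2, t3, t4): L-location, L-scale, L-skewness, L-kurtosis.

    Uses the unbiased probability-weighted moments b0..b3 of the ordered sample.
    """
    x = sorted(data)
    n = len(x)
    if n < 4:
        raise ValueError("need at least four observations")

    # With 0-based index i, the weight for b_r is i(i-1)...(i-r+1) / ((n-1)...(n-r)).
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    b3 = sum(i * (i - 1) * (i - 2) * x[i] for i in range(n)) / (
        n * (n - 1) * (n - 2) * (n - 3)
    )

    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    if l2 == 0:
        raise ValueError("L-scale is zero; ratios are undefined")
    return l1, l2, l3 / l2, l4 / l2


if __name__ == "__main__":
    # Made-up annual maxima (mm), for illustration only.
    print(sample_l_moments([82.0, 95.5, 110.2, 74.3, 150.8, 99.1, 121.4]))
```

Each station's (t3, t4) pair would then be compared with the theoretical curves of the candidate distributions on the ratio diagram.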

Pearson-type Chi-square Test on the Joint Orientations from Different Depths in Boreholes (시추공 영상자료와 카이제곱 검정을 이용한 절리 방향성의 수직적 변화양상에 관한 정량적 평가)

  • Kim, Ki-Seog;Park, Young-Do;Park, Yeon-Jun
    • Tunnel and Underground Space / v.18 no.3 / pp.185-193 / 2008
  • We carried out Pearson-type chi-square tests on joint orientation data from different depths in order to assess the homogeneity of joint orientations obtained from a borehole. The orientation data were collected from two non-foliated massive granitic gneisses in South Korea, because joint orientations in folded metamorphic rocks, for example, are controlled by foliation and change as the foliation orientation changes with folding. Borehole images were used to analyze the orientations of individual joints. The orientation data were subdivided into upper-level and lower-level data. The data from these two levels were plotted on a patch net consisting of 21 orientation patches, and the two patterns on the patch net were then analyzed using a contingency table. From the chi-square test on the data collected from two sites, we found that some data sets show statistically meaningful differences in joint orientations. Since joints are one of the important parameters determining the physical properties of rock masses, in situ investigation of joints is desirable in geotechnical investigation and in the design of subsurface structures (e.g. tunnels and underground storage facilities).
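
The contingency-table step described in the abstract above (two depth levels by orientation patches) corresponds to the ordinary Pearson chi-square test of homogeneity. The sketch below is a generic version under stated assumptions: the function name `pearson_homogeneity` and the counts are made up, and dropping patches with no observations at either level is a choice made here, not something the abstract specifies.

```python
def pearson_homogeneity(table):
    """Pearson chi-square statistic and degrees of freedom for an R x C count table.

    table: list of rows (e.g. depth levels), each a list of counts per category
    (e.g. orientation patches). Columns with zero total are dropped.
    """
    n_cols = len(table[0])
    if any(len(row) != n_cols for row in table):
        raise ValueError("all rows must have the same number of columns")

    col_totals = [sum(row[j] for row in table) for j in range(n_cols)]
    keep = [j for j in range(n_cols) if col_totals[j] > 0]
    row_totals = [sum(row[j] for j in keep) for row in table]
    grand = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j in keep:
            expected = row_totals[i] * col_totals[j] / grand
            stat += (row[j] - expected) ** 2 / expected

    df = (len(table) - 1) * (len(keep) - 1)
    return stat, df


if __name__ == "__main__":
    upper = [10, 0, 20]  # made-up joint counts per patch, upper level
    lower = [20, 0, 10]  # made-up joint counts per patch, lower level
    print(pearson_homogeneity([upper, lower]))
```

The statistic is then compared with a chi-square distribution on the returned degrees of freedom; with many sparse patches the chi-square approximation can be poor, so small expected counts deserve a check.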

The Effect of the Structured Education on the Early Rehabilitation Knowledge and Activity Performance of the C.V.A. Patients (구조화된 환자교육이 뇌졸중 환자의 조기재활에 관한 지식과 활동수행에 미치는 영향)

  • 이혜진;이향련
    • Journal of Korean Academy of Nursing / v.27 no.1 / pp.109-119 / 1997
  • This study was undertaken to establish nursing strategies that promote activity performance in early rehabilitation by examining the effect of structured patient education on the early rehabilitation knowledge and activity performance of C.V.A. patients. The experimental group and the control group were surveyed in advance through questionnaires, interviews, and observation; the subjects were 65 patients hospitalized at the oriental medicine hospital of K Medical Center from July 1, 1995 to the end of September 1995. For the homogeneity tests, the general characteristics of the experimental and control groups were compared by the χ² test and ADL by the t-test. To test the hypotheses, the t-test was used for the difference in early rehabilitation knowledge and activity performance between the two groups, and the correlation between early rehabilitation knowledge and activity performance was tested by Pearson's correlation coefficient. The results of the hypothesis tests are as follows. 1. The 1st hypothesis, “The experimental group which received the structured education should be higher in early rehabilitation knowledge than the control group,” was supported (t=4.45, p=.000). 2. The 2nd hypothesis, “The experimental group which received the structured education should be higher in early rehabilitation activity performance than the control group,” was supported (t=2.11, p=.036). 3. The 3rd hypothesis, “The higher the early rehabilitation knowledge of the patient, the higher the activity performance degree,” was rejected (r=.1546, p=.219).
In conclusion, the patients who received the structured education showed an increase in early rehabilitation knowledge and activity performance, so education is judged to be a prerequisite for increasing knowledge and activity performance in early rehabilitation.
