• Title/Abstract/Keywords: chi square statistics

639 search results (processing time: 0.033 seconds)

Likelihood ratio in estimating Chi-square parameter

  • Rahman, Mezbahur
    • Journal of the Korean Data and Information Science Society / Vol. 20, No. 3 / pp. 587-592 / 2009
  • The most frequent use of the chi-square distribution is in goodness-of-fit testing of a distribution. In statistical inference, the likelihood ratio test statistic is as commonly used as the maximum likelihood estimate. Recently revised versions of the likelihood ratio test statistic are used to estimate the parameter of the chi-square distribution. The resulting estimates are compared with the commonly used method of moments and maximum likelihood estimates.

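The two baseline estimators named in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's likelihood-ratio-based estimator: it assumes an i.i.d. chi-square sample with unknown degrees of freedom `k`, uses E[X] = k for the method of moments, and solves the likelihood equation digamma(k/2) = mean(log(x/2)) by bisection. The helper names, the simulated sample, and the numerical digamma (a finite difference of `math.lgamma`) are all assumptions made for the example.

```python
import math
import random


def digamma(x, h=1e-5):
    """Approximate digamma as a central difference of log-gamma."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2.0 * h)


def moment_estimate(sample):
    """Method of moments: E[X] = k for X ~ chi-square(k), so use the sample mean."""
    return sum(sample) / len(sample)


def ml_estimate(sample, lo=0.01, hi=1000.0, tol=1e-8):
    """Maximum likelihood: solve digamma(k/2) = mean(log(x/2)) by bisection."""
    target = sum(math.log(x / 2.0) for x in sample) / len(sample)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if digamma(mid / 2.0) < target:
            lo = mid  # digamma is increasing, so the root lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


random.seed(0)
true_k = 4.0  # hypothetical true degrees of freedom
# chi-square(k) is a Gamma distribution with shape k/2 and scale 2
sample = [random.gammavariate(true_k / 2.0, 2.0) for _ in range(5000)]

k_mom = moment_estimate(sample)
k_mle = ml_estimate(sample)
print(f"method of moments: {k_mom:.3f}, maximum likelihood: {k_mle:.3f}")
```

Both estimates should land close to the true value for a sample of this size; comparing their spread over repeated samples would be the natural next step.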

Empirical Comparisons of Disparity Measures for Three Dimensional Log-Linear Models

  • Park, Y.S.;Hong, C.S.;Jeong, D.B.
    • Journal of the Korean Data and Information Science Society / Vol. 17, No. 2 / pp. 543-557 / 2006
  • This paper is concerned with the applicability of the chi-square approximation to six disparity statistics: the Pearson chi-square, the generalized likelihood ratio, the power divergence, the blended weight chi-square, the blended weight Hellinger distance, and the negative exponential disparity statistic. Three-dimensional contingency tables of small and moderate sample sizes are generated and fitted to all possible hierarchical log-linear models: the completely independent model, the conditionally independent model, the partial association models, and the model with one variable independent of the other two. For models with direct solutions for the expected cell counts, point estimates and confidence intervals of the 90 and 95 percentage points of the six statistics are explored. For the model without direct solutions, the empirical significance levels and empirical powers of the six statistics for testing the significance of the three-factor interaction are computed and compared.

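Three of the six statistics listed in the abstract above have simple closed forms and can be sketched as follows. The example assumes the standard Cressie-Read parameterization of the power divergence family, in which lambda = 1 reproduces the Pearson statistic and lambda approaching 0 gives the likelihood ratio statistic. The observed and expected counts are hypothetical; in the paper's setting the expected counts would come from a fitted log-linear model, which is not reproduced here.

```python
import math


def pearson_chi2(observed, expected):
    """Pearson chi-square: sum of (O - E)^2 / E over cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def likelihood_ratio_g2(observed, expected):
    """Likelihood ratio statistic G^2 = 2 * sum O * log(O / E); empty cells add 0."""
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


def power_divergence(observed, expected, lam):
    """Cressie-Read statistic for lam not equal to 0 or -1."""
    total = sum(o * ((o / e) ** lam - 1.0) for o, e in zip(observed, expected))
    return 2.0 * total / (lam * (lam + 1.0))


# Hypothetical counts with equal totals (100 each)
observed = [18, 22, 31, 29]
expected = [25.0, 25.0, 25.0, 25.0]

x2 = pearson_chi2(observed, expected)
g2 = likelihood_ratio_g2(observed, expected)
cr = power_divergence(observed, expected, 2.0 / 3.0)
print(f"Pearson: {x2:.4f}  G^2: {g2:.4f}  Cressie-Read (2/3): {cr:.4f}")
```

The equivalence at lambda = 1 relies on the observed and expected totals being equal, which holds for fitted log-linear models that preserve the grand total.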

Distribution of a Sum of Weighted Noncentral Chi-Square Variables

  • Heo, Sun-Yeong;Chang, Duk-Joon
    • Communications for Statistical Applications and Methods / Vol. 13, No. 2 / pp. 429-440 / 2006
  • In statistical computing, researchers often need the distribution of a weighted sum of noncentral chi-square variables, but its exact distribution is rarely available. Many works address this topic, e.g. Imhof (1961) and Solomon-Stephens (1977). Imhof's method gives a good approximation to the true distribution, but it is not easy to apply, even allowing for advances in computer technology. Solomon-Stephens's three-moment chi-square approximation is relatively easy to apply and accurate. However, they skipped many details, and their simulation is limited to a weighted sum of central chi-square random variables. This paper gives details of Solomon-Stephens's method. We also extend their simulation to weighted sums of noncentral chi-square variables. We evaluated approximate powers for a homogeneity test and compared them with the true powers. Solomon-Stephens's method gives a very good approximation in this case.
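
The idea of approximating a weighted sum of noncentral chi-square variables by matching moments can be illustrated with a simpler two-moment (Satterthwaite-type) fit. This is explicitly not the three-moment method of Solomon and Stephens described in the abstract above; it only shows the general moment-matching approach. The weights, degrees of freedom, and noncentrality parameters are hypothetical, and the comparison is made by simulation rather than against an exact distribution.

```python
import math
import random


def exact_moments(weights, dfs, ncps):
    """Mean and variance of Q = sum_i w_i * chi2(df_i, ncp_i), independent terms."""
    mean = sum(w * (v + d) for w, v, d in zip(weights, dfs, ncps))
    var = sum(2.0 * w * w * (v + 2.0 * d) for w, v, d in zip(weights, dfs, ncps))
    return mean, var


def two_moment_fit(mean, var):
    """Scaled central chi-square g * chi2(h) with the same mean and variance."""
    return var / (2.0 * mean), 2.0 * mean * mean / var


def draw_noncentral_chi2(df, ncp, rng):
    """Integer-df noncentral chi-square: squared normals, with the whole
    noncentrality placed on the first component."""
    total = (rng.gauss(0.0, 1.0) + math.sqrt(ncp)) ** 2
    total += sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df - 1))
    return total


rng = random.Random(1)
weights, dfs, ncps = [0.5, 1.0, 2.0], [1, 2, 3], [0.0, 1.0, 2.0]  # hypothetical
mean, var = exact_moments(weights, dfs, ncps)
g, h = two_moment_fit(mean, var)

n_draws = 20000
q_draws = [
    sum(w * draw_noncentral_chi2(v, d, rng) for w, v, d in zip(weights, dfs, ncps))
    for _ in range(n_draws)
]
# g * chi2(h) is a Gamma distribution with shape h/2 and scale 2g
fit_draws = [rng.gammavariate(h / 2.0, 2.0 * g) for _ in range(n_draws)]

sim_mean = sum(q_draws) / n_draws
sim_var = sum((q - sim_mean) ** 2 for q in q_draws) / (n_draws - 1)
cutoff = mean + 2.0 * math.sqrt(var)
p_sim = sum(q > cutoff for q in q_draws) / n_draws
p_fit = sum(q > cutoff for q in fit_draws) / n_draws
print(f"mean {mean:.2f} (simulated {sim_mean:.2f}), variance {var:.2f} (simulated {sim_var:.2f})")
print(f"upper-tail probability: simulated {p_sim:.4f}, two-moment fit {p_fit:.4f}")
```

A three-moment method adds a further parameter so that the third moment is matched as well, which is what makes it attractive for skewed sums.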

Two-sample chi-square test for randomly censored data

  • 김주한;김정란
    • The Korean Journal of Applied Statistics / Vol. 8, No. 2 / pp. 109-119 / 1995
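  • A chi-square test is proposed for testing the hypothesis that two populations have the same distribution when a randomly censored sample is obtained from each. The proposed statistic can be used when the alternative is the two-sided hypothesis that the two distributions are not equal. Under the null hypothesis, the limiting distribution of the proposed statistic is chi-square. Two forms of the chi-square test statistic are proposed: one is expressed as a quadratic form in the vector of differences between the observed cell probabilities obtained from the product-limit estimates, and the other is expressed as a simple sum. Both forms of the test statistic are used to analyze data from a chemotherapy trial for cancer treatment.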


A Rao-Robson Chi-Square Test for Multivariate Normality Based on the Mahalanobis Distances

  • Park, Cheolyong
    • Communications for Statistical Applications and Methods / Vol. 7, No. 2 / pp. 385-392 / 2000
  • Many tests for multivariate normality are based on the spherical coordinates of the scaled residuals of multivariate observations. Moore and Stubblebine's (1981) Pearson chi-square test is based on the radii of the scaled residuals, or equivalently the sample Mahalanobis distances of the observations from the sample mean vector. The chi-square statistic does not have a limiting chi-square distribution since the unknown parameters are estimated from ungrouped data. We derive a simple closed form of the Rao-Robson chi-square test statistic and provide a self-contained proof that it has a limiting chi-square distribution. We then provide an illustrative application to a real data set, together with a simulation study showing the finite-sample accuracy of the limiting distribution.


On the Robustness of Chi-square Test Procedure for a Compounded Multivariate Normal Mean

  • Kim, Hea-Jung
    • Communications for Statistical Applications and Methods / Vol. 2, No. 2 / pp. 330-335 / 1995
  • The robustness of the one-sample chi-square test for a multivariate normal mean vector is investigated when the multivariate normal population is mixed with another multivariate normal population that differs in its mean vector. Explicit expressions for the level of significance and the power of the test are derived. Some numerical results indicate that the chi-square test procedure is quite robust against slight mixtures of multivariate normal populations differing in location parameters.


On an Approximation to the Distribution of Product of Independent Beta Variates

  • Hea Jung Kim
    • Communications for Statistical Applications and Methods / Vol. 1, No. 1 / pp. 81-86 / 1994
  • A chi-square approximation to the distribution of a product of independent Beta variates, denoted by U, is developed. The distribution is commonly used as a test criterion for the general linear hypothesis in multivariate linear models. The approximation is obtained by fitting a logarithmic function of U to a chi-square variate in terms of the first three moments. It is compared with the well-known approximations due to Box (1949), Rao (1948), and Mudholkar and Trivedi (1980). It is found that the chi-square approximation compares favorably with the other three approximations.


A Note on the Chi-Square Test for Multivariate Normality Based on the Sample Mahalanobis Distances

  • Park, Cheolyong
    • Journal of the Korean Statistical Society / Vol. 28, No. 4 / pp. 479-488 / 1999
  • Moore and Stubblebine (1981) suggested a chi-square test for multivariate normality based on cell counts calculated from the sample Mahalanobis distances. They derived the limiting distribution of the test statistic only when equiprobable cells are employed. Using conditional limit theorems, we derive the limiting distribution of the statistic as well as the asymptotic normality of the cell counts. These distributions are valid even when equiprobable cells are not employed. Finally, we apply this method to a real data set.

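The cell-count construction described in the abstract above can be sketched for the bivariate case as follows. The sketch assumes the cells are equiprobable intervals of the chi-square distribution with 2 degrees of freedom (whose CDF is 1 - exp(-x/2)), that the covariance matrix is estimated with divisor n, and that the data are simulated; the function names are hypothetical. It computes only the Pearson-type statistic. Its limiting distribution, which is the subject of the paper, is not evaluated here.

```python
import math
import random


def squared_mahalanobis_2d(points):
    """Squared Mahalanobis distances from the sample mean (2-D, divisor n)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    det = sxx * syy - sxy * sxy
    # Quadratic form with the explicit inverse of the 2x2 covariance matrix
    return [
        (syy * (x - mx) ** 2 - 2.0 * sxy * (x - mx) * (y - my) + sxx * (y - my) ** 2) / det
        for x, y in points
    ]


def pearson_cell_statistic(d2, n_cells):
    """Cell counts and Pearson statistic on equiprobable chi-square(2) cells."""
    n = len(d2)
    # The j/M quantile of chi-square(2) is -2 * ln(1 - j/M)
    cuts = [-2.0 * math.log(1.0 - j / n_cells) for j in range(1, n_cells)]
    counts = [0] * n_cells
    for d in d2:
        counts[sum(d > c for c in cuts)] += 1  # index of the cell containing d
    expected = n / n_cells
    stat = sum((c - expected) ** 2 / expected for c in counts)
    return counts, stat


rng = random.Random(2)
points = []
for _ in range(500):
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    points.append((z1, 0.5 * z1 + z2))  # simulated correlated bivariate normal data

d2 = squared_mahalanobis_2d(points)
counts, stat = pearson_cell_statistic(d2, n_cells=5)
print("cell counts:", counts, "statistic:", round(stat, 3))
```

With the divisor-n covariance estimate, the squared distances average exactly to the dimension (here 2), which is a convenient sanity check on the computation.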

Criteria of Association Rule based on Chi-Square for Nominal Database

  • 박희창;이호순
    • Korean Data and Information Science Society: Conference Proceedings / Korean Data and Information Science Society 2004 Spring Conference / pp. 25-38 / 2004
  • Association rule mining searches for interesting relationships among items in a given database. Association rules are frequently used by retail stores to assist in marketing, advertising, floor placement, and inventory control. There are three primary quality measures for association rules: support, confidence, and lift. In this paper we present the relation between the measure of association based on the chi-square statistic and the criteria of association rules for nominal databases, and propose objective criteria for association.

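The three rule measures named in the abstract above, together with the chi-square statistic of the corresponding 2x2 table, can be computed as in the sketch below. The transactions and item names are hypothetical. The last lines check one standard identity linking chi-square to lift for a 2x2 table; it is shown only as an illustration and is not necessarily the relation derived in the paper.

```python
def rule_measures(transactions, antecedent, consequent):
    """Support, confidence, lift, and 2x2 chi-square for the rule A -> B."""
    n = len(transactions)
    n_a = sum(antecedent in t for t in transactions)
    n_b = sum(consequent in t for t in transactions)
    n_ab = sum(antecedent in t and consequent in t for t in transactions)

    support = n_ab / n               # P(A and B)
    confidence = n_ab / n_a          # P(B | A)
    lift = confidence / (n_b / n)    # P(B | A) / P(B)

    # 2x2 table: a = both, b = A only, c = B only, d = neither
    a, b, c, d = n_ab, n_a - n_ab, n_b - n_ab, n - n_a - n_b + n_ab
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return support, confidence, lift, chi2


# Hypothetical market-basket data: 10 transactions
transactions = (
    [{"bread", "milk"}] * 4
    + [{"bread"}] * 2
    + [{"milk"}] * 1
    + [{"eggs"}] * 3
)
support, confidence, lift, chi2 = rule_measures(transactions, "bread", "milk")
print(f"support={support:.3f} confidence={confidence:.3f} lift={lift:.3f} chi2={chi2:.3f}")

# Identity for a 2x2 table: chi2 = n (lift - 1)^2 pA pB / ((1 - pA)(1 - pB))
n, p_a, p_b = 10, 0.6, 0.5
chi2_from_lift = n * (lift - 1.0) ** 2 * p_a * p_b / ((1.0 - p_a) * (1.0 - p_b))
```

Unlike lift, the chi-square value grows with the number of transactions, so it can be compared against a chi-square critical value with 1 degree of freedom.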

Properties of chi-square statistic and information gain for feature selection of imbalanced text data

  • 문혜인;손원
    • The Korean Journal of Applied Statistics / Vol. 35, No. 4 / pp. 469-484 / 2022
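  • Text data generally consist of many words and are therefore high-dimensional data with a very large number of variables. For such high-dimensional data, a procedure for selecting the important variables is often applied to improve computational efficiency and the accuracy of statistical analysis. Various methods are likewise used on text data to select important words from among many. In this study, we examine the similarities and differences between the chi-square statistic and information gain, two representative filtering methods for word selection, and check the properties of these word selection methods on real text data. We confirm that, although the chi-square statistic and information gain share properties such as non-negativity and convexity, on imbalanced text data the chi-square statistic tends to select mainly positive features, whereas information gain tends to select relatively many negative features as well.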
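
The two filter scores compared in the abstract above can be computed from a 2x2 table of term presence against class label, as in the sketch below. It assumes information gain is taken as the mutual information between term presence and class (in bits), and it reads "positive feature" as a word whose presence signals the minority class and "negative feature" as a word whose absence does; both readings are assumptions. The document counts are hypothetical and are not taken from the paper.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Chi-square score. a: term present, positive class; b: present, negative;
    c: absent, positive; d: absent, negative."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def information_gain(a, b, c, d):
    """Mutual information (bits) between term presence and class label."""
    n = a + b + c + d
    rows = (a + b, c + d)  # term present, term absent
    cols = (a + c, b + d)  # positive class, negative class
    cells = ((a, 0, 0), (b, 0, 1), (c, 1, 0), (d, 1, 1))
    return sum(
        (x / n) * math.log2(x * n / (rows[i] * cols[j]))
        for x, i, j in cells
        if x > 0  # empty cells contribute nothing
    )


# Hypothetical imbalanced corpus: 100 positive and 900 negative documents
positive_feature = (60, 90, 40, 810)   # word common in the minority class
negative_feature = (5, 500, 95, 400)   # word rare in the minority class
independent = (10, 90, 90, 810)        # word unrelated to the class

for name, table in [("positive", positive_feature),
                    ("negative", negative_feature),
                    ("independent", independent)]:
    print(f"{name}: chi2={chi_square_2x2(*table):.2f} IG={information_gain(*table):.4f}")
```

Ranking a vocabulary by each score in turn and comparing the top words is one way to observe the different selection tendencies the abstract describes.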