• Title/Summary/Keyword: Goodness-of-fit tests

Does Breast Cancer Drive the Building of Survival Probability Models among States? An Assessment of Goodness of Fit for Patient Data from SEER Registries

  • Khan, Hafiz;Saxena, Anshul;Perisetti, Abhilash;Rafiq, Aamrin;Gabbidon, Kemesha;Mende, Sarah;Lyuksyutova, Maria;Quesada, Kandi;Blakely, Summre;Torres, Tiffany;Afesse, Mahlet
    • Asian Pacific Journal of Cancer Prevention / v.17 no.12 / pp.5287-5294 / 2016
  • Background: Breast cancer is a worldwide public health concern and is the most prevalent type of cancer in women in the United States. This study sought the best-fitting statistical probability models for survival times in nine state cancer registries: California, Connecticut, Georgia, Hawaii, Iowa, Michigan, New Mexico, Utah, and Washington. Materials and Methods: A probability random sampling method was applied to select and extract records of 2,000 breast cancer patients from the Surveillance, Epidemiology, and End Results (SEER) database for each of the nine state cancer registries. EasyFit software was used to identify the best probability models by means of goodness-of-fit tests and to estimate parameters for the statistical probability distributions that fit the survival data. Results: Summary statistics are reported for each state for the years 1973 to 2012. Kolmogorov-Smirnov, Anderson-Darling, and chi-squared goodness-of-fit test values were computed for the survival data, with the highest values of the goodness-of-fit statistics considered indicative of the best-fitting survival model for each state. Conclusions: California, Connecticut, Georgia, Iowa, New Mexico, and Washington followed the Burr probability distribution, the Dagum probability distribution gave the best fit for Michigan and Utah, and Hawaii followed the Gamma probability distribution. These findings highlight differences between states in selected sociodemographic variables and also demonstrate differences in the probability models of breast cancer survival times. The results can guide healthcare providers and researchers in further investigations of social and environmental factors, with the aim of reducing the occurrence of and mortality due to breast cancer.

Power comparison of distribution-free two sample goodness-of-fit tests (이표본 분포 동일성에 대한 분포무관 검정법 간 검정력 비교 연구)

  • Kim, Seon Bin;Lee, Jae Won
    • The Korean Journal of Applied Statistics / v.30 no.4 / pp.513-528 / 2017
  • Statistical tests are often used to determine whether two samples have been drawn from the same underlying distribution. In this paper, we introduce several well-known distribution-free tests for comparing distributions and conduct an extensive Monte Carlo simulation to characterize their behavior. We consider cases in which the two distributions differ in (1) location, (2) scale, (3) symmetry, (4) kurtosis, and (5) tail weight. A practical guideline for two-sample goodness-of-fit testing is presented based on the simulation results.
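
The kind of power study described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' simulation design: it uses only the two-sample Kolmogorov-Smirnov test, normal samples that differ in location, and the standard asymptotic 5% critical constant 1.358. The function names, sample size, and replication count are assumptions chosen for the example.

```python
import math
import random
from bisect import bisect_right


def ks_two_sample(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: max |F_x(t) - F_y(t)|."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    d = 0.0
    # Both empirical CDFs are step functions, so the maximum gap
    # is attained at one of the observed data points.
    for t in xs + ys:
        fx = bisect_right(xs, t) / n
        fy = bisect_right(ys, t) / m
        d = max(d, abs(fx - fy))
    return d


def ks_reject(x, y, c_alpha=1.358):
    """Reject equality of distributions at roughly the 5% level,
    using the asymptotic critical value c_alpha * sqrt((n + m) / (n * m))."""
    n, m = len(x), len(y)
    return ks_two_sample(x, y) > c_alpha * math.sqrt((n + m) / (n * m))


def estimated_power(shift, n=50, reps=500, seed=0):
    """Monte Carlo rejection rate when the second sample is shifted in location.

    Assumption for illustration: both samples are normal with unit variance.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rng.gauss(shift, 1.0) for _ in range(n)]
        rejections += ks_reject(x, y)
    return rejections / reps


if __name__ == "__main__":
    for shift in (0.0, 0.5, 1.0):
        print(f"shift={shift}: estimated power={estimated_power(shift):.3f}")
```

With `shift=0.0` the rejection rate estimates the test's size, and larger shifts trace out a power curve. Differences in scale, symmetry, kurtosis, or tail weight could be studied the same way by changing how the second sample is generated.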

Goodness-of-fit tests for randomly censored Weibull distributions with estimated parameters

  • Kim, Namhyun
    • Communications for Statistical Applications and Methods / v.24 no.5 / pp.519-531 / 2017
  • We consider goodness-of-fit test statistics for Weibull distributions when data are randomly censored and the parameters are unknown. Koziol and Green (Biometrika, 63, 465-474, 1976) proposed a randomly censored version of the Cramér-von Mises statistic for a simple hypothesis, based on the Kaplan-Meier product-limit estimate of the distribution function. We apply their idea to other statistics based on the empirical distribution function, such as the Kolmogorov-Smirnov statistic and the statistic of Liao and Shimokawa (Journal of Statistical Computation and Simulation, 64, 23-48, 1999). The latter is a hybrid of the Kolmogorov-Smirnov, Cramér-von Mises, and Anderson-Darling statistics. These statistics, as well as the Koziol-Green statistic, are considered as test statistics for randomly censored Weibull distributions with estimated parameters. The null distributions depend on the estimation method, since the test statistics are not distribution-free when the parameters are estimated. Maximum likelihood estimation and the graphical plotting method with least squares are considered for parameter estimation. A simulation study shows that the Liao-Shimokawa statistic has relatively high power against many alternatives; however, its null distribution depends heavily on the parameter estimation method. Meanwhile, the Koziol-Green statistic provides moderate power, and its null distribution does not change significantly with the parameter estimation method.

Asymptotic Relative Efficiency of Chi-squared Type Tests Based on the Empirical Process

  • Lee, Sang-Yeol
    • Journal of the Korean Statistical Society / v.25 no.3 / pp.337-346 / 1996
  • The chi-squared type statistic generated from the empirical process can be used for testing the goodness-of-fit hypothesis for an i.i.d. random sample. Lee (1995) showed that, under some conditions, the chi-squared type statistic is asymptotically maximin in the sense of Strasser (1985). Since the chi-squared type statistic depends on the choice of points in the unit interval, it is worth investigating which points yield more efficient tests. Motivated by this viewpoint, we study the asymptotic relative efficiency of chi-squared type tests in the same setting as Lee (1995). Some examples are given for illustration.

Test of Exponentiality in Step Stress Accelerated Life Test Model based on Kullback-Leibler Information Function (쿨백-라이블러 정보함수 이용한 단계 스트레스 가속수명모형의 지수성 검정)

  • 박병구;윤상철
    • Journal of Korean Society for Quality Management / v.31 no.4 / pp.194-202 / 2003
  • In this paper, we propose goodness-of-fit test statistics for exponentiality in accelerated life test data, based on Kullback-Leibler information functions. The acceleration model is assumed to be a tampered random variable model. The procedure is applicable whether or not the exponential parameter is specified under the null hypothesis. We compare the power of the proposed test statistics with that of the Kolmogorov-Smirnov, Cramér-von Mises, and Anderson-Darling statistics in small samples.
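
As a point of reference for the comparison statistics named in this abstract, the sketch below tests exponentiality with the Kolmogorov-Smirnov statistic when the exponential parameter is not specified. It is not the Kullback-Leibler test proposed in the paper and it ignores the step-stress structure: it assumes a complete, uncensored sample, estimates the rate by maximum likelihood, and obtains the critical value by Monte Carlo simulation because the tabulated Kolmogorov-Smirnov critical values do not apply once a parameter is estimated. All function names are hypothetical.

```python
import math
import random


def ks_exponential(sample):
    """KS distance between the empirical CDF and an exponential CDF
    whose rate is estimated by maximum likelihood (1 / sample mean)."""
    xs = sorted(sample)
    n = len(xs)
    rate = n / sum(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = 1.0 - math.exp(-rate * x)
        # Compare the fitted CDF with the empirical CDF just after
        # and just before its jump at x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def critical_value(n, alpha=0.05, reps=2000, seed=0):
    """Approximate upper-alpha critical value of ks_exponential under the null.

    The exponential family is a scale family, so the null distribution of
    the statistic does not depend on the true rate; rate 1 is used here.
    """
    rng = random.Random(seed)
    stats = sorted(
        ks_exponential([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    )
    return stats[math.ceil((1 - alpha) * reps) - 1]


if __name__ == "__main__":
    rng = random.Random(42)
    data = [rng.expovariate(0.5) for _ in range(20)]  # synthetic sample
    d = ks_exponential(data)
    cv = critical_value(len(data))
    print(f"D = {d:.3f}, simulated 5% critical value = {cv:.3f}, reject = {d > cv}")
```

Simulating the null distribution this way is a small-sample device: the same loop, run with samples drawn from an alternative distribution, yields a power estimate for the comparison.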

The Choice of an Optimal Growth Function Considering Environmental Factors and Production Style (생산방식과 환경요인들을 고려한 최적성장함수의 선택에 관한 연구)

  • Choi, Jong Du
    • Environmental and Resource Economics Review / v.13 no.4 / pp.717-734 / 2004
  • This paper examines statistical goodness-of-fit tests for biological growth models in bioeconomic analysis. Growth functions have often been estimated for fish, but few studies have estimated growth equations for bivalve species. This paper therefore studies common functional forms for fitting growth equations for cham scallops, taking environmental factors and production styles into account. The following functional forms are considered: linear, log-reciprocal, double log, polynomial, and linear with interactions. The results of fitting these functional forms to real data are compared and evaluated using standard statistical goodness-of-fit tests. The results indicate that the log-reciprocal function is statistically the best fit to the real data. The log-reciprocal function is therefore selected as the best description of cham scallop biological growth and might be useful for economic evaluation (i.e., optimal harvesting time).
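
A comparison of functional forms of this kind can be sketched as below. This is an illustration under stated assumptions, not the paper's model: it uses a single explanatory variable (here called age), omits the environmental and production-style terms, takes the log-reciprocal form to be ln(size) = a + b / age, and scores every form by R-squared on the original size scale so that forms with different dependent variables are comparable. The data in the usage example are synthetic.

```python
import math


def ols(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(ys, preds):
    """Coefficient of determination of predictions against observations."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def fit_forms(ages, sizes):
    """Fit three candidate forms and score each on the original size scale."""
    log_sizes = [math.log(s) for s in sizes]
    fits = {}

    a, b = ols(ages, sizes)  # linear: size = a + b * age
    fits["linear"] = [a + b * t for t in ages]

    a, b = ols([1.0 / t for t in ages], log_sizes)  # ln(size) = a + b / age
    fits["log-reciprocal"] = [math.exp(a + b / t) for t in ages]

    a, b = ols([math.log(t) for t in ages], log_sizes)  # ln(size) = a + b * ln(age)
    fits["double log"] = [math.exp(a + b * math.log(t)) for t in ages]

    return {name: r_squared(sizes, preds) for name, preds in fits.items()}


if __name__ == "__main__":
    # Synthetic, noise-free data generated from a log-reciprocal curve,
    # so that form should rank first by construction.
    ages = [float(t) for t in range(1, 13)]
    sizes = [math.exp(4.5 - 3.0 / t) for t in ages]
    for name, score in sorted(fit_forms(ages, sizes).items(), key=lambda kv: -kv[1]):
        print(f"{name}: R^2 = {score:.4f}")
```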

Goodness-of-fit test for normal distribution based on parametric and nonparametric entropy estimators (모수적 엔트로피 추정량과 비모수적 엔트로피 추정량에 기초한 정규분포에 대한 적합도 검정)

  • Choi, Byungjin
    • Journal of the Korean Data and Information Science Society / v.24 no.4 / pp.847-856 / 2013
  • In this paper, we consider goodness-of-fit tests for the normal distribution based on parametric and nonparametric entropy estimators. The minimum variance unbiased estimator of the entropy of the normal distribution is derived as a parametric entropy estimator and used to construct a test statistic. As nonparametric entropy estimators of the data-generating distribution under the alternative hypothesis, the sample entropy and its modifications are used. The critical values of the proposed tests are estimated by Monte Carlo simulation and presented in tabular form. The performance of the proposed tests under selected alternatives is investigated by means of simulations. The results show that the proposed tests have better power than the earlier entropy-based test of Vasicek (1976). In applications, the new tests are expected to serve as a competitive tool for testing normality.
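
For orientation, the baseline that this abstract compares against, the sample-entropy test of Vasicek (1976), can be sketched as follows. The sketch does not implement the paper's minimum variance unbiased estimator or its modified statistics. It computes the sample entropy with window size m, treating order statistics beyond the ends of the sample as the minimum and maximum, and forms the ratio exp(H) / s. Among distributions with a given variance the normal has the largest entropy, so small values of the ratio count against normality. Function names and the window size in the usage example are assumptions.

```python
import math
import random


def vasicek_entropy(sample, m):
    """Sample entropy: mean of log((n / 2m) * (X_(i+m) - X_(i-m))).

    Assumes a continuous sample; tied values m positions apart would
    give log(0) and raise an error.
    """
    xs = sorted(sample)
    n = len(xs)
    if not 1 <= m < n / 2:
        raise ValueError("window size m must satisfy 1 <= m < n/2")
    total = 0.0
    for i in range(n):
        upper = xs[min(i + m, n - 1)]  # X_(i+m), capped at the maximum
        lower = xs[max(i - m, 0)]      # X_(i-m), capped at the minimum
        total += math.log(n / (2 * m) * (upper - lower))
    return total / n


def vasicek_statistic(sample, m):
    """exp(sample entropy) divided by the sample standard deviation (divisor n).

    The ratio is location and scale invariant; for normal data it approaches
    sqrt(2 * pi * e), about 4.13, as the sample grows.
    """
    n = len(sample)
    mean = sum(sample) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / n)
    return math.exp(vasicek_entropy(sample, m)) / s


if __name__ == "__main__":
    rng = random.Random(1)
    normal = [rng.gauss(0.0, 1.0) for _ in range(300)]
    skewed = [rng.expovariate(1.0) for _ in range(300)]
    print("normal sample:     ", round(vasicek_statistic(normal, 5), 3))
    print("exponential sample:", round(vasicek_statistic(skewed, 5), 3))
```

In practice the rejection threshold for a given sample size and window size is obtained from simulated critical values, as the abstract describes for the proposed tests.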

Goodness-of-fit tests for the inverse Weibull or extreme value distribution based on multiply type-II censored samples

  • Kang, Suk-Bok;Han, Jun-Tae;Seo, Yeon-Ju;Jeong, Jina
    • Journal of the Korean Data and Information Science Society / v.25 no.4 / pp.903-914 / 2014
  • The inverse Weibull distribution has been proposed as a model in the analysis of life testing data. It has also recently been derived as a suitable model for describing the degradation of mechanical components, such as the dynamic components (pistons, crankshaft, etc.) of diesel engines. In this paper, we derive approximate maximum likelihood estimators of the scale and shape parameters of the inverse Weibull distribution under multiply type-II censoring. We also develop four modified empirical distribution function (EDF) type tests for the inverse Weibull or extreme value distribution based on multiply type-II censored samples, and we propose a modified normalized sample Lorenz curve plot and a new test statistic.

A study on the goodness-of-fit tests for proportional hazards model (비례위험모형의 적합도 검정법에 관한 연구)

  • 장애방;이재원
    • The Korean Journal of Applied Statistics / v.10 no.1 / pp.85-104 / 1997
  • The proportional hazards model has been widely used for analyzing survival data. This article reviews some well-known goodness-of-fit tests for the proportional hazards model. Simulation studies provide insight into the properties of these test statistics across several types of survival distributions and degrees of censoring.

Kullback-Leibler Information-Based Tests of Fit for Inverse Gaussian Distribution (역가우스분포에 대한 쿨백-라이블러 정보 기반 적합도 검정)

  • Choi, Byung-Jin
    • The Korean Journal of Applied Statistics / v.24 no.6 / pp.1271-1284 / 2011
  • The entropy-based test of fit for the inverse Gaussian distribution presented by Mudholkar and Tian (2002) can only be applied to the composite hypothesis that a sample is drawn from an inverse Gaussian distribution with both the location and scale parameters unknown. In applications, however, a researcher may want a test of fit for an inverse Gaussian distribution with one parameter known or with both parameters known. In this paper, we introduce tests of fit for the inverse Gaussian distribution based on the Kullback-Leibler information as an extension of the entropy-based test. A window size must be chosen to implement the proposed tests. By means of Monte Carlo simulations, window sizes are determined for a wide range of sample sizes and the corresponding critical values of the test statistics are estimated. A power analysis against various alternatives shows that the Kullback-Leibler information-based goodness-of-fit tests have good power.