• Title/Summary/Keyword: Data normality

Goodness-of-Fit-Test from Censored Samples

  • Cho, Young-Suk
    • Journal of the Korean Data and Information Science Society / v.17 no.1 / pp.41-52 / 2006
  • Because normality is the most common assumption in statistical analysis, testing normality is very important. The Q-Q plot, available in most statistical packages, is a powerful tool for checking normality with full samples, but it cannot be applied directly to type-II censored samples. This paper proposes a modified Q-Q plot and a modified normalized sample Lorenz curve (NSLC) for testing normality in type-II censored samples. Using two Hodgkin's disease data sets and type-II censored samples, we draw the modified Q-Q plot and the modified normalized sample Lorenz curve.
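
As a rough illustration of why type-II censoring needs special handling, the sketch below computes Q-Q plot coordinates when only the r smallest of n observations are available. It is not the modified plot proposed in the paper: the function name `censored_qq_points`, the Blom plotting positions, and the data are all assumptions made for this example.

```python
from statistics import NormalDist

def censored_qq_points(observed, n):
    """Q-Q coordinates for a type-II right-censored sample.

    observed: the r smallest values out of n (the larger ones are censored).
    Returns (theoretical normal quantile, observed value) pairs.
    """
    r = len(observed)
    if not 0 < r <= n:
        raise ValueError("need 0 < len(observed) <= n")
    xs = sorted(observed)
    std_normal = NormalDist()
    points = []
    for i, x in enumerate(xs, start=1):
        # Blom plotting position based on the full sample size n, not r,
        # so the censored upper tail is left empty rather than compressed.
        p = (i - 0.375) / (n + 0.25)
        points.append((std_normal.inv_cdf(p), x))
    return points

# Hypothetical data: the 6 smallest of n = 10 observations.
pts = censored_qq_points([2.1, 2.9, 3.4, 3.8, 4.4, 4.9], n=10)
for q, x in pts:
    print(f"{q:7.3f}  {x:5.2f}")
```

If the points fall close to a straight line over the observed range, the uncensored part of the sample is at least compatible with normality; nothing can be read off for the censored tail.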

The Validation Study of Normality Distribution of Aquatic Toxicity Data for Statistical Analysis (수생태 독성자료의 정규성 분포 특성 확인을 통해 통계분석 시 분포 특성 적용에 대한 타당성 확인 연구)

  • OK, Seung-yeop;Moon, Hyo-Bang;Ra, Jin-Sung
    • Journal of Environmental Health Sciences / v.45 no.2 / pp.192-202 / 2019
  • Objectives: According to the central limit theorem, samples from a population may be considered to follow a normal distribution if a large number of samples is available. Once we assume that a toxicity dataset follows a normal distribution, we can process the data statistically to calculate genus or species mean values with standard deviations. However, only limited studies have investigated whether toxicity datasets actually follow a normal distribution. Therefore, the purpose of this study is to evaluate the generally accepted normality hypothesis for aquatic toxicity datasets. Methods: We selected 8 chemicals, consisting of 4 organic and 4 inorganic compounds, considering data availability for the development of species sensitivity distributions. Toxicity data were collected from the US EPA ECOTOX Knowledgebase by a simple search for the target chemicals. The data were rearranged into a proper format based on endpoint and test duration, and normality was then tested with the Shapiro-Wilk test. We also investigated the degree of normality after a simple log transformation of the toxicity data. Results: Despite the central limit theorem, only one of the 25 large datasets (n>25) followed a normal distribution. After log transformation, 7 more large datasets showed normality. In the normality tests on small datasets (n<25), log transformation of the toxicity values generally increased normality. Organic and inorganic chemicals showed increased normality for 26 species and 30 species, respectively. These 56 species, whose normality increased after log transformation, belong to the taxonomic groups amphibian (1), crustacean (21), fish (22), insect (5), rotifer (2), and worm (5). In contrast, for mollusca, normality decreased in 1 species out of the 23 that originally showed normality. Conclusions: The normality of large toxicity datasets did not always agree with the central limit theorem. Normality of those data could be improved through log transformation. Therefore, care should be taken when using toxicity data to derive, for example, mean values for risk assessment.
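
The effect of a log transformation on right-skewed data can be sketched with a simple normality index. The study itself used the Shapiro-Wilk test; the example below swaps in the squared correlation between the ordered data and approximate normal scores (the idea behind the Shapiro-Francia statistic), which needs only the standard library and gives no p-value. The function name `normal_score_r2` and the data are assumptions; the values are synthetic and are not taken from the ECOTOX Knowledgebase.

```python
import math
from statistics import NormalDist

def normal_score_r2(values):
    """Squared correlation between ordered data and Blom normal scores."""
    xs = sorted(values)
    n = len(xs)
    nd = NormalDist()
    scores = [nd.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    mx = sum(xs) / n
    ms = sum(scores) / n
    sxy = sum((x - mx) * (s - ms) for x, s in zip(xs, scores))
    sxx = sum((x - mx) ** 2 for x in xs)
    sss = sum((s - ms) ** 2 for s in scores)
    return sxy * sxy / (sxx * sss)

# Synthetic, right-skewed concentrations (NOT real ECOTOX data).
toxicity = [0.8, 1.1, 1.5, 2.0, 2.7, 3.9, 5.6, 8.4, 13.0, 21.5, 40.2, 95.0]
raw_r2 = normal_score_r2(toxicity)
log_r2 = normal_score_r2([math.log10(v) for v in toxicity])
print(f"raw: {raw_r2:.3f}  log10: {log_r2:.3f}")
```

Values close to 1 indicate an approximately straight normal probability plot. For this skewed example the index is clearly higher after the log transformation, which mirrors the direction of the effect reported in the abstract.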

A Comparison on the Empirical Power of Some Normality Tests

  • Kim, Dae-Hak;Eom, Jun-Hyeok;Jeong, Heong-Chul
    • Journal of the Korean Data and Information Science Society / v.17 no.1 / pp.31-39 / 2006
  • In many cases, we obtain the desired information through appropriate statistical analysis of collected data sets. Much statistical theory relies on the assumption that the data are normal. In this paper, we compare the empirical power of several normality tests, including one based on sample entropy. Monte Carlo simulation is used to calculate the empirical power of the tests considered, varying the sample size across various distributions.
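
A Monte Carlo estimate of empirical power can be sketched as follows. The abstract does not say which statistics or settings were used beyond a sample entropy quantity, so this example assumes a Lilliefors-type Kolmogorov-Smirnov statistic, a sample size of 30, an exponential alternative, and a critical value that is itself estimated by simulation; the function names are hypothetical.

```python
import random
from statistics import NormalDist, fmean, stdev

def lilliefors_stat(sample):
    """KS distance between the sample and a normal with fitted mean and sd."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(fmean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs):
        f = fitted.cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def rejection_rate(draw, n, crit, reps, rng):
    """Fraction of simulated samples whose statistic exceeds crit."""
    hits = sum(lilliefors_stat([draw(rng) for _ in range(n)]) > crit
               for _ in range(reps))
    return hits / reps

rng = random.Random(2006)  # fixed seed so the run is reproducible
n, alpha = 30, 0.05

# Critical value estimated by simulation under the null (normal data).
null_stats = sorted(lilliefors_stat([rng.gauss(0, 1) for _ in range(n)])
                    for _ in range(2000))
crit = null_stats[int((1 - alpha) * len(null_stats))]

size = rejection_rate(lambda r: r.gauss(0, 1), n, crit, 1000, rng)
power = rejection_rate(lambda r: r.expovariate(1.0), n, crit, 1000, rng)
print(f"critical value {crit:.3f}, size {size:.3f}, power {power:.3f}")
```

The rejection rate under normal data estimates the size of the test and should stay near the nominal level, while the rejection rate under the alternative is the empirical power. Repeating the last step over several sample sizes and distributions gives the kind of comparison table the abstract describes.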

A Simultaneous Test for Multivariate Normality and Independence with Application to Univariate Residuals

  • Park, Cheol-Yong
    • Journal of the Korean Data and Information Science Society / v.17 no.1 / pp.115-122 / 2006
  • A test is suggested for detecting deviations from both multivariate normality and independence. This test can be used for assessing the normality and independence of univariate time series residuals. We derive the limiting distribution of the test statistic and conduct a simulation study to assess the accuracy of the limiting distribution in finite samples. Finally, we apply our method to a real time series data set.

A Test for Multivariate Normality Focused on Elliptical Symmetry Using Mahalanobis Distances

  • Park, Cheol-Yong
    • Journal of the Korean Data and Information Science Society / v.17 no.4 / pp.1191-1200 / 2006
  • A chi-squared test of multivariate normality is suggested which is mainly focused on detecting deviations from elliptical symmetry. This test uses Mahalanobis distances of observations to have some power for deviations from multivariate normality. We derive the limiting distribution of the test statistic by a conditional limit theorem. A simulation study is conducted to study the accuracy of the limiting distribution in finite samples. Finally, we compare the power of our method with those of other popular tests of multivariate normality under two non-normal distributions.
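
The Mahalanobis distances that such a test builds on can be computed directly. The sketch below handles only the bivariate case with a closed-form 2x2 inverse; it does not reproduce the paper's chi-squared statistic, and the function name `mahalanobis_sq_2d` and the data are assumptions.

```python
def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distances of 2-D points from their sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix entries (divisor n - 1).
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    out = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # (dx, dy) S^{-1} (dx, dy)^T using the closed-form 2x2 inverse.
        out.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return out

# Hypothetical bivariate observations.
data = [(1.0, 2.1), (2.0, 2.9), (3.0, 4.2), (4.0, 4.8), (5.0, 6.5), (6.0, 6.1)]
d2 = mahalanobis_sq_2d(data)
print([round(v, 3) for v in d2])
print(sum(d2))  # equals (n - 1) * 2 up to rounding, a handy sanity check
```

Under bivariate normality with a large sample, these squared distances behave approximately like chi-squared variables with 2 degrees of freedom, which is what makes them a natural ingredient for a test of multivariate normality.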

A Note on the Simple Chi-Squared Test of Multivariate Normality

  • Park, Cheol-Yong
    • Journal of the Korean Data and Information Science Society / v.15 no.2 / pp.423-430 / 2004
  • We provide the exact form of a Rao-Robson version of the chi-squared test of multivariate normality suggested by Park (2001). This test is easy to apply in practice since it is easily computed and has a limiting chi-squared distribution under multivariate normality. A self-contained formal argument is given to show that it has this limiting chi-squared distribution. A simulation study is provided to assess the accuracy of the limiting distribution in finite samples. Finally, a simulation study under a nonnormal distribution is conducted to compare the power of our test with those of other popular normality tests.

Testing the domestic financial data for the normality of the innovation based on the GARCH(1,1) model

  • Lee, Tae-Wook;Ha, Jeong-Cheol
    • Journal of the Korean Data and Information Science Society / v.18 no.3 / pp.809-815 / 2007
  • Since Bollerslev (1986), the GARCH model has been popular for analysing the volatility of financial time series. In real data analysis, practitioners conventionally assume that the innovation random variables of the GARCH model are normal, an assumption that is often violated. In this paper, we analyse domestic financial data based on the GARCH(1,1) model and, among the existing normality tests, apply the Jarque-Bera test to the residuals. It is shown that the innovations based on the GARCH(1,1) model do not follow the normality assumption.
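
The procedure in this abstract can be sketched in two steps: standardize the returns by the GARCH(1,1) conditional volatility, then apply the Jarque-Bera statistic to the standardized residuals. The returns and the parameters `omega`, `alpha`, and `beta` below are hypothetical; in practice the parameters would be estimated from the data, for example by maximum likelihood, which this sketch does not do.

```python
import math

def garch11_std_residuals(returns, omega, alpha, beta):
    """Standardize returns by the GARCH(1,1) conditional volatility."""
    # Start the recursion at the unconditional variance (needs alpha + beta < 1).
    sigma2 = omega / (1.0 - alpha - beta)
    resid = []
    for r in returns:
        resid.append(r / math.sqrt(sigma2))
        # sigma2_t = omega + alpha * r_{t-1}^2 + beta * sigma2_{t-1}
        sigma2 = omega + alpha * r * r + beta * sigma2
    return resid

def jarque_bera(x):
    """Return the JB statistic and its asymptotic chi-squared(2) p-value."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return jb, math.exp(-jb / 2.0)  # chi-squared(2) survival function

# Hypothetical daily returns and hypothetical (not estimated) parameters.
returns = [0.004, -0.012, 0.007, 0.031, -0.045, 0.010, -0.003, 0.002,
           0.015, -0.008, 0.001, -0.060, 0.022, 0.005, -0.004, 0.009]
z = garch11_std_residuals(returns, omega=1e-5, alpha=0.1, beta=0.85)
jb, p = jarque_bera(z)
print(f"JB = {jb:.2f}, p = {p:.4f}")
```

A small p-value suggests that the standardized residuals are not normal. The chi-squared approximation is asymptotic, so with a series as short as this toy example the p-value should be read only as an illustration.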

Logistic Model for Normality by Neural Networks

  • Lee, Jea-Young;Rhee, Seong-Won
    • Journal of the Korean Data and Information Science Society / v.14 no.1 / pp.119-129 / 2003
  • We propose a new logistic regression model of normality curves for normal (diseased) and abnormal (nondiseased) classifications by neural networks in data mining. The fitted logistic regression lines are estimated, interpreted and plotted using the neural network technique. A few goodness-of-fit test statistics for normality are discussed, and the performance of the fitted logistic regression lines is evaluated.

Goodness-of-Fit Test for the Normality based on the Generalized Lorenz Curve

  • Cho, Youngseuk;Lee, Kyeongjun
    • Communications for Statistical Applications and Methods / v.21 no.4 / pp.309-316 / 2014
  • Testing normality is very important because normality is the most common assumption in statistical analysis. We propose a new plot and a new test statistic for goodness-of-fit testing of normality based on the generalized Lorenz curve. We compare the new plot with the Q-Q plot. We also compare the power of the new test statistic with that of the Kolmogorov-Smirnov (KS), Cramer-von Mises (CVM), Anderson-Darling (AD), Shapiro-Francia (SF), and Shapiro-Wilk (W) test statistics using the Monte Carlo method. As a result, the new plot distinguishes normality from non-normality more clearly than the Q-Q plot; in addition, the new test statistic is more powerful than the other test statistics for asymmetric distributions. We check the proposed test statistic and plot using Hodgkin's disease data.

Normality of the MPLE of a Proportional Hazard Model for Informative Censored Data (정보적 중도절단을 고려한 최대 편우도 추정량의 정규성)

  • 정대현;원동유
    • Journal of Applied Reliability / v.1 no.2 / pp.149-163 / 2001
  • We study the normality of the maximum partial likelihood estimators (MPLEs) for the proportional hazard model with informatively censored data. The proposed models cover the cases in which the times to a primary event may be informatively or randomly censored and the times to a secondary event may be randomly censored. To estimate the parameters and to check the normality of the parameter estimators in the model, we adopt the partial likelihood and a counting process formulation so that the martingale central limit theorem can be applied. Simulation studies are performed to examine the normality of the MPLEs in five cases that differ in the proportions of randomly censored and informatively censored data.
