http://dx.doi.org/10.9765/KSCOE.2021.33.2.53

Normality Test of the Water Quality Monitoring Data in Harbour  

Cho, Hong-Yeon (Marine Big-data Center, Korea Institute of Ocean Science and Technology, University of Science and Technology)
Publication Information
Journal of Korean Society of Coastal and Ocean Engineers / v.33, no.2, 2021, pp. 53-64
Abstract
The normality test (hereafter NT) is a highly recommended step in statistical estimation because the assumption that the data are normally distributed is basic and essential. NT was carried out using the KOEM harbor water quality monitoring data, which comprise a total of 3,000 data sets (50 stations, 30 water quality parameters including surface and bottom layers, and two seasons, summer and winter). A comparative analysis of normality was carried out using a total of 18 methods supported by R program packages. In addition, the Shapiro-Wilk test was selected as the reference method in this study for a detailed analysis of the effects of data transformation and outliers. The numbers of normality assumption rejections (NAR) were estimated and compared for these cases, before and after application of the Box-Cox (BC) transformation and Rosner's outlier test. Without outlier exclusion, the NAR numbers fell from 24-28 before the BC transformation to 3-4 after it. With outlier exclusion, the NAR numbers fell sharply from 6-9 to below one for the same comparison. Thus, for coastal water quality monitoring data that do not come from a normal distribution, the Box-Cox transformation, applied together with the outlier test, is highly recommended for suitable statistical estimation and inference.
Keywords
normality test; outlier test; Box-Cox transformation; harbor WQ monitoring data; statistical estimation
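The "before and after" comparison of rejection counts described in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: the study used R packages with the Shapiro-Wilk test as the reference method, whereas this sketch swaps in the asymptotic Jarque-Bera test so that it runs on the Python standard library alone, and it omits Rosner's outlier test entirely. The data are synthetic log-normal samples, and all function names (`jarque_bera_pvalue`, `box_cox`, `box_cox_lambda`, `count_rejections`), the lambda grid, and the 0.05 significance level are assumptions made for illustration.

```python
import math
import random


def jarque_bera_pvalue(x):
    """Asymptotic Jarque-Bera p-value (chi-square with 2 degrees of freedom)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # The chi-square(2) survival function has the closed form exp(-x / 2).
    return math.exp(-jb / 2.0)


def box_cox(x, lam):
    """Box-Cox transform of strictly positive data for a given lambda."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in x]
    return [(v ** lam - 1.0) / lam for v in x]


def box_cox_lambda(x, grid=None):
    """Pick lambda on a coarse grid by maximising the profile log-likelihood."""
    if grid is None:
        grid = [i / 10.0 for i in range(-20, 21)]  # assumed search range
    n = len(x)
    sum_log = sum(math.log(v) for v in x)

    def loglik(lam):
        y = box_cox(x, lam)
        mean = sum(y) / n
        var = sum((v - mean) ** 2 for v in y) / n
        return -0.5 * n * math.log(var) + (lam - 1.0) * sum_log

    return max(grid, key=loglik)


def count_rejections(datasets, alpha=0.05, transform=False):
    """Count data sets whose normality assumption is rejected (NAR)."""
    rejected = 0
    for x in datasets:
        if transform:
            x = box_cox(x, box_cox_lambda(x))
        if jarque_bera_pvalue(x) < alpha:
            rejected += 1
    return rejected


# Synthetic, right-skewed (log-normal) stand-ins for monitoring records.
rng = random.Random(0)
datasets = [[rng.lognormvariate(0.0, 0.8) for _ in range(50)] for _ in range(30)]

nar_before = count_rejections(datasets)
nar_after = count_rejections(datasets, transform=True)
print(nar_before, nar_after)
```

Because the synthetic samples are strongly right-skewed, the rejection count would be expected to drop after the transformation, which mirrors the direction of the effect reported in the abstract but says nothing about its size on the real monitoring data.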