• Title/Summary/Keyword: Outliers test

Search Result 114

Outlier detection using Grubb test and Cochran test in clinical data (그럽 및 코크란 검정을 이용한 임상자료의 이상치 판단)

  • Sohn, Ki-Cheul;Shin, Im-Hee
    • Journal of the Korean Data and Information Science Society / v.23 no.4 / pp.657-663 / 2012
  • Survey data in many fields contain very small and/or very large values that fall outside the normal range. Such outliers arise for two reasons: errors made during data entry and unusual responses from respondents. When data contain outliers, summary statistics such as the mean and the variance give misleading information, so researchers should take care to detect them. This is especially important in clinical fields, where experiments are very costly. This article introduces the Grubbs test and the Cochran test for detecting outliers and applies them to clinical data.
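
As a hedged illustration of the two tests named in this abstract (not the authors' code), the sketch below computes the Grubbs statistic for a single sample and Cochran's C statistic for several groups. The function names and the sample input are hypothetical. The critical-value helper assumes the caller supplies the upper Student t quantile with n - 2 degrees of freedom at level alpha/(2n), taken from a table or a statistics library.

```python
import math
import statistics


def grubbs_statistic(values):
    """Return (G, index): G = max |x_i - mean| / s, with s the sample SD."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    deviations = [abs(v - mean) for v in values]
    idx = max(range(len(values)), key=deviations.__getitem__)
    return deviations[idx] / sd, idx


def grubbs_critical(n, t_quantile):
    """Two-sided Grubbs critical value.

    t_quantile is assumed to be the upper alpha/(2n) quantile of the
    Student t distribution with n - 2 degrees of freedom.
    """
    t2 = t_quantile ** 2
    return (n - 1) / math.sqrt(n) * math.sqrt(t2 / (n - 2 + t2))


def cochran_statistic(groups):
    """Return (C, index): C = largest group variance / sum of group variances."""
    variances = [statistics.variance(g) for g in groups]
    idx = max(range(len(variances)), key=variances.__getitem__)
    return variances[idx] / sum(variances), idx


if __name__ == "__main__":
    # Hypothetical measurements: the last value is the suspected outlier.
    g, where = grubbs_statistic([10.0, 10.0, 10.0, 10.0, 20.0])
    print(f"Grubbs G = {g:.4f} at index {where}")
```

A value is flagged when G exceeds the critical value. Cochran's C is compared against its own tabulated critical value, which this sketch does not compute.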

CASE-DELETION DIAGNOSTICS FOR TESTING A LINEAR HYPOTHESIS ABOUT REGRESSION COEFFICIENTS

  • Kim, Myung-Geun
    • Journal of applied mathematics & informatics / v.10 no.1_2 / pp.111-118 / 2002
  • We study the influence of observations on testing a linear hypothesis using single and multiple case-deletions. The change in the F-test statistic due to case-deletions is shown to be completely determined by two externally Studentized residuals. These residuals are used for investigating outlyingness whether or not there are linear constraints. An illustrative example shows the usefulness of case-deletions.

Robust Unit Root Tests for a Panel TAR Model

  • Shin, Dong-Wan
    • The Korean Journal of Applied Statistics / v.24 no.1 / pp.11-23 / 2011
  • Robust unit root tests are developed for dynamic panels consisting of TAR processes. The test statistics are all based on diverse combinations of individual t-type tests for significance of TAR coefficients. Limiting null distributions are established. A Monte-Carlo experiment compares the proposed tests. The tests are applied to a panel data set of Canadian unemployment rates, which shows asymmetric features and contains outliers.

ROBUST UNIT ROOT TESTS FOR SEASONAL AUTOREGRESSIVE PROCESS

  • Oh, Yu-Jin;So, Beong-Soo
    • Journal of the Korean Statistical Society / v.33 no.2 / pp.149-157 / 2004
  • Stationarity is one of the most important properties of a time series. We propose robust sign tests for seasonal autoregressive processes to determine whether or not a time series is stationary. The proposed tests are robust to outliers and heteroscedastic errors, and they have an exact binomial null distribution regardless of the period of seasonality and the type of median adjustment. A Monte-Carlo simulation shows that the sign test is locally more powerful than tests based on the ordinary least squares estimator (OLSE) for heavy-tailed and/or heteroscedastic error distributions.

ROBUST UNIT ROOT TESTS FOR SEASONAL AUTOREGRESSIVE PROCESS

  • Oh, Yu-Jin;So, Beong-Soo
    • Proceedings of the Korean Statistical Society Conference / 2003.05a / pp.281-286 / 2003
  • Stationarity is one of the most important properties of a time series. We propose robust sign tests for seasonal autoregressive processes to determine whether or not a time series is stationary. The tests have an exact binomial null distribution and are robust to outliers and heteroscedastic errors. A Monte-Carlo simulation shows that the sign test is locally more powerful than the OLSE-based tests for heavy-tailed and/or heteroscedastic error distributions.

Accuracy of Multiple Outlier Tests in Nonlinear Regression

  • Kahng, Myung-Wook
    • Communications for Statistical Applications and Methods / v.18 no.1 / pp.131-136 / 2011
  • The original Bates-Watts framework applies only to the complete parameter vector. Thus, guidelines developed in that framework can be misleading when the adequacy of the linear approximation is very different for different subsets. The subset curvature measures appear to be reliable indicators of the adequacy of linear approximation for an arbitrary subset of parameters in nonlinear models. Given the specific mean shift outlier model, the standard approaches to obtaining test statistics for outliers are discussed. The accuracy of outlier tests is investigated using subset curvatures.

Derivation of Optimal Design Flood by L-Moments and LH-Moments ( I ) - On the method of L-Moments - (L-모멘트 및 LH-모멘트 기법에 의한 적정 설계홍수량의 유도( I ) - L-모멘트법을 중심으로 -)

  • 이순혁;박명근;맹승진;정연수;김동주;류경식
    • Magazine of the Korean Society of Agricultural Engineers / v.40 no.4 / pp.45-57 / 1998
  • This study derives optimal design floods using the Generalized Extreme Value (GEV) distribution for the annual maximum series at ten watersheds along the Han, Nagdong, Geum, Yeongsan and Seomjin river systems. The adequacy of the flood data for analysis was established by tests of independence and homogeneity and by detection of outliers. The L-coefficient of variation, L-skewness and L-kurtosis were calculated as L-moment ratios. Parameters were estimated by the method of moments and the method of L-moments. Design floods obtained by the two methods, using different plotting position formulas in the GEV distribution, were compared by Relative Mean Errors (RME) and Relative Absolute Errors (RAE). The results are summarized as follows. 1. The adequacy of the flood data for analysis was confirmed by the tests of independence and homogeneity and by the detection of outliers. 2. The GEV distribution was found to be more suitable than the Pearson type 3 distribution for the applied watersheds, according to the Kolmogorov-Smirnov goodness-of-fit test and the L-moment ratio diagram. 3. Parameters of the GEV distribution were estimated using the method of moments and the method of L-moments. 4. Design floods were calculated by the method of moments and the method of L-moments in the GEV distribution. 5. In terms of RME and RAE, design floods derived by the method of L-moments with the Weibull plotting position formula were much closer to the observed data than those obtained by the method of moments with different plotting position formulas.
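
The L-moment ratios mentioned in this abstract can be illustrated with a short sketch. It is an assumption-laden example rather than the authors' procedure: it uses the standard unbiased probability-weighted-moment estimators for the first four sample L-moments, and the function name and input values are hypothetical.

```python
def sample_l_moments(data):
    """Return the first four sample L-moments and the L-moment ratios.

    Uses unbiased probability-weighted moments b0..b3 of the sorted sample.
    Assumes at least four observations and nonzero l1 and l2.
    """
    x = sorted(data)
    n = len(x)
    if n < 4:
        raise ValueError("need at least four observations")

    # i is the zero-based rank, so i equals (rank - 1) in the usual formulas.
    b0 = sum(x) / n
    b1 = sum(i / (n - 1) * x[i] for i in range(n)) / n
    b2 = sum(i * (i - 1) / ((n - 1) * (n - 2)) * x[i] for i in range(n)) / n
    b3 = sum(
        i * (i - 1) * (i - 2) / ((n - 1) * (n - 2) * (n - 3)) * x[i]
        for i in range(n)
    ) / n

    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0

    return {
        "l1": l1,
        "l2": l2,
        "l3": l3,
        "l4": l4,
        "l_cv": l2 / l1,        # L-coefficient of variation
        "l_skewness": l3 / l2,  # L-skewness
        "l_kurtosis": l4 / l2,  # L-kurtosis
    }


if __name__ == "__main__":
    # Hypothetical annual maximum series (arbitrary units).
    for name, value in sample_l_moments([1.0, 2.0, 3.0, 4.0, 5.0]).items():
        print(f"{name}: {value:.4f}")
```

For an evenly spaced sample such as the one above, the L-skewness is zero, which is a quick sanity check on the estimators.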

A Robust Test for Location Parameters in Multivariate Data (다변량 자료에서 위치모수에 대한 로버스트 검정)

  • So, Sun-Ha;Lee, Dong-Hee;Jung, Byoung-Cheo
    • The Korean Journal of Applied Statistics / v.22 no.6 / pp.1355-1364 / 2009
  • This work proposes a robust test for location parameters in multivariate data based on the MVE and MCD estimators, which have the affine equivariance and high-breakdown properties. High-breakdown estimators generally suffer from low efficiency and, as a result, are usually used only in exploratory analysis. To obtain a test with both high efficiency and high power, we apply a one-step reweighting procedure to these estimators. A Monte Carlo study shows that the suggested method retains nominal significance levels and has higher power than Hotelling's $T^2$ test across various population distributions. In an example, a data set containing known outliers does not affect the proposed test, while it renders Hotelling's $T^2$ test useless.

A Study on Forest Fire Detection from MODIS Data Using Local Spatial Association Analysis (국지적 공간상관분석을 이용한 MODIS영상에서의 산불탐지에 관한 연구)

  • Byun, Young-Gi;Huh, Yong;Kim, Yong-Min;Yu, Ki-Yun
    • Journal of Korean Society for Geospatial Information Science / v.15 no.1 s.39 / pp.23-29 / 2007
  • Spatial outliers in remotely sensed imagery are observed quantities with unusual values compared to their neighboring pixel values. Various methods based on spatial autocorrelation have been developed in statistics and data mining to detect spatial outliers. These methods can be applied to detecting forest fire pixels in MODIS imagery from NASA's AQUA satellite, because forest fire detection amounts to finding spatial outliers in the spatial variation of brightness temperature. In this paper, we propose a new forest fire detection algorithm based on local spatial association analysis and test it to evaluate its applicability. The results were compared with the MODIS fire product provided by the NASA MODIS Science Team, and the comparison indicated the potential of the proposed algorithm for detecting fire pixels.

Natural Background Level Analysis of Heavy Metal Concentration in Korean Coastal Sediments (한국 연안 퇴적물 내 중금속 원소의 자연적 배경농도 연구)

  • Lim, Dhong-Il;Choi, Jin-Yong;Jung, Hoi-Soo;Choi, Hyun-Woo;Kim, Young-Ok
    • Ocean and Polar Research / v.29 no.4 / pp.379-389 / 2007
  • This paper presents an attempt to determine natural background levels of heavy metals that could be used for assessing heavy metal contamination. For this study, a large archive dataset of heavy metal concentrations (Cu, Cr, Ni, Pb, Zn) for more than 900 surface sediment samples from various Korean coastal environments was newly compiled. These data were normalized to the aluminum concentration (a grain-size normalizer) to isolate natural factors from anthropogenic ones. The normalization was based on the hypothesis that heavy metal concentrations vary consistently with the concentration of aluminum unless the metals are of anthropogenic origin. Samples suspected of receiving any anthropogenic input (outliers) were therefore removed from the regression to ascertain the "background" relationship between the metals and aluminum. These outliers were identified using a model of predicted limits at 95%. The process of testing for normality (Kolmogorov-Smirnov test) and selecting outliers was iterated until a normal distribution was achieved. On the basis of the linear regression analysis of the large archive dataset, background levels applicable to heavy metal assessment of Korean coastal sediments were successfully developed for Cu, Cr, Ni and Zn. As an example, we tested the applicability of this baseline level for metal pollution assessment of Masan Bay sediments.
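
The iterative regression screening described in this abstract might be sketched as follows. This is a simplified illustration, not the authors' method: the 95% limits are approximated by a band of ±1.96 residual standard deviations (ignoring the leverage term of a true prediction interval), and the loop stops when no further points fall outside the band instead of running a Kolmogorov-Smirnov normality check. All names and data are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def background_fit(al, metal, z=1.96, max_iter=50):
    """Iteratively drop samples outside the approximate 95% band.

    Returns (slope, intercept, removed_indices). Assumes at least three
    samples and some scatter around the line (nonzero residual SD).
    """
    kept = list(range(len(al)))
    for _ in range(max_iter):
        xs = [al[i] for i in kept]
        ys = [metal[i] for i in kept]
        slope, intercept = fit_line(xs, ys)
        resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
        s = math.sqrt(sum(r * r for r in resid) / (len(xs) - 2))
        inside = [i for i, r in zip(kept, resid) if abs(r) <= z * s]
        # Stop when nothing was removed or too few samples would remain.
        if len(inside) == len(kept) or len(inside) < 3:
            break
        kept = inside
    # Refit on the final retained set so the returned line matches it.
    slope, intercept = fit_line([al[i] for i in kept], [metal[i] for i in kept])
    kept_set = set(kept)
    removed = [i for i in range(len(al)) if i not in kept_set]
    return slope, intercept, removed


if __name__ == "__main__":
    # Hypothetical data: metal = 2 * Al + 1 with small noise, one enriched sample.
    al = [float(x) for x in range(1, 11)]
    metal = [2 * x + 1 + (0.1 if i % 2 == 0 else -0.1) for i, x in enumerate(al)]
    metal[5] += 10.0  # simulated anthropogenic enrichment
    print(background_fit(al, metal))
```

The retained samples define the background line, and the removed indices mark the samples treated as suspected anthropogenic input under these simplified assumptions.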