Outlier tests on potential outliers

  • Seo, Han Son (Department of Applied Statistics, Konkuk University)
  • 서한손 (건국대학교 응용통계학과)
  • Received : 2016.12.14
  • Accepted : 2017.02.01
  • Published : 2017.02.28

Abstract

Observations identified as potential outliers are usually tested formally to decide whether they are real outliers; however, some outlier detection methods skip the formal test or rely on simulated p-values. To avoid masking and swamping effects, we introduce procedures that test subsets of the potential outliers rather than individual observations. We present examples that illustrate the methods and a Monte Carlo study that compares the power of the various methods.

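The idea of testing a candidate subset jointly, rather than one observation at a time, can be illustrated with the classical mean-shift outlier model F test. The sketch below is not the procedure proposed in the paper. It is a minimal illustration that assumes a simple linear regression with one predictor (so p = 2 parameters), and the function names and toy data are hypothetical. For a subset of size k that is fixed in advance, the statistic is compared with an F(k, n - p - k) distribution under normal errors. When the subset is chosen by inspecting the data, that nominal reference distribution no longer applies.

```python
def rss_simple_regression(x, y):
    """Residual sum of squares of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def subset_outlier_f_statistic(x, y, subset):
    """Mean-shift F statistic for jointly testing the observations in `subset`.

    Returns (F, numerator df, denominator df). Hypothetical helper for
    illustration only; assumes one predictor plus an intercept (p = 2).
    """
    subset = set(subset)
    n, k, p = len(x), len(subset), 2
    keep = [i for i in range(n) if i not in subset]
    rss_all = rss_simple_regression(x, y)
    # Fitting one shift parameter per candidate is equivalent to deleting them.
    rss_kept = rss_simple_regression([x[i] for i in keep], [y[i] for i in keep])
    df_den = n - p - k
    f_stat = ((rss_all - rss_kept) / k) / (rss_kept / df_den)
    return f_stat, k, df_den


# Toy data (made up): a straight line with small alternating noise,
# and the last two responses shifted upward by 5.
x = [float(i) for i in range(10)]
y = [1.0 + 2.0 * xi + (0.1 if i % 2 == 0 else -0.1) for i, xi in enumerate(x)]
y[8] += 5.0
y[9] += 5.0

f_stat, df_num, df_den = subset_outlier_f_statistic(x, y, [8, 9])
print(f_stat, df_num, df_den)
```

Testing the pair {8, 9} jointly avoids the masking that can occur when each of the two shifted points is tested separately while the other one still distorts the fit.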
