[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.5351/KJAS.2016.29.4.699

A sequential outlier detecting method using a clustering algorithm

Seo, Han Son (Department of Applied Statistics, Konkuk University)
Yoon, Min (Department of Statistics, Pukyong National University)

Publication Information

The Korean Journal of Applied Statistics / v.29, no.4, 2016, pp. 699-706

Abstract

Outlier detection methods that do not perform a formal test often fail to detect multiple outliers because they are structurally vulnerable to masking or swamping effects. This paper considers testing procedures that supplement a clustering-based method, which identifies the group containing a minority of the observations as outliers. A common step in such procedures is to perform some form of t-test on individual outlier candidates. This paper proposes a sequential procedure that searches for outliers by changing the cutoff value on a cluster tree and performing a test on a set of outlier candidates. The proposed method is illustrated and compared with existing methods using an example and Monte Carlo studies.
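
The abstract gives no algorithmic detail, so the following is only a rough Python sketch of the general idea (cut a cluster tree at successively finer levels, treat the minority observations as candidates, and test each candidate set jointly), not the authors' procedure. The choices here are assumptions made for illustration: single-linkage clustering of standardized fitted values and residuals, an F-test under a mean-shift outlier model, the stopping rule, and the function names subset_outlier_pvalue and sequential_cluster_outliers.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.stats import f as f_dist


def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)


def subset_outlier_pvalue(X, y, candidates):
    """Joint F-test that the candidate rows are outliers (mean-shift model).

    Adding one indicator column per candidate gives the same RSS as
    deleting the candidates, so the reduced fit is used directly.
    """
    n, p = X.shape
    k = len(candidates)
    keep = np.setdiff1d(np.arange(n), candidates)
    rss_all = rss(X, y)
    rss_clean = rss(X[keep], y[keep])
    df2 = n - p - k
    F = ((rss_all - rss_clean) / k) / (rss_clean / df2)
    return float(f_dist.sf(F, k, df2))


def sequential_cluster_outliers(X, y, alpha=0.05, max_clusters=10):
    """Illustrative sequential search; the rules below are assumptions."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    # Cluster the standardized (fitted value, residual) pairs.
    Z = np.column_stack([fitted, y - fitted])
    Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    tree = linkage(Z, method="single")

    flagged = np.array([], dtype=int)
    # Lowering the cutoff on the tree is emulated by asking for more clusters.
    for k in range(2, max_clusters + 1):
        labels = fcluster(tree, t=k, criterion="maxclust")
        majority = np.bincount(labels).argmax()
        minority = np.flatnonzero(labels != majority)
        if len(minority) >= n / 2:
            break  # candidates must remain a minority of the data
        new = np.setdiff1d(minority, flagged)
        if len(new) == 0:
            continue
        # Test the new candidates on the data without confirmed outliers.
        keep = np.setdiff1d(np.arange(n), flagged)
        pos = np.flatnonzero(np.isin(keep, new))
        if len(keep) - p - len(pos) <= 0:
            break  # no degrees of freedom left for the test
        if subset_outlier_pvalue(X[keep], y[keep], pos) >= alpha:
            break  # stop at the first non-significant candidate set
        flagged = np.union1d(flagged, new)
    return flagged
```

Here X is assumed to be an n-by-p design matrix that already includes an intercept column, and the return value is an array of row indices flagged as outliers. Because each candidate set is tested jointly, a tight group of outliers that would mask one another in individual t-tests can still be detected as a whole.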

Keywords

clustering; linear regression model; outlier test; sequential procedure

A sequential outlier detecting method using a clustering algorithm 군집 알고리즘을 이용한 순차적 이상치 탐지법
