
MULTIPLE OUTLIER DETECTION IN LOGISTIC REGRESSION BY USING INFLUENCE MATRIX  

Lee, Gwi-Hyun (Department of Statistics, Seoul National University)
Park, Sung-Hyun (Department of Statistics, Seoul National University)
Publication Information
Journal of the Korean Statistical Society, v.36, no.4, 2007, pp. 457-469
Abstract
Many procedures are available for identifying a single outlier or an isolated influential point in linear and logistic regression. Detecting multiple outliers or groups of influential points is more difficult because of masking and swamping. Direct procedures for detecting multiple outliers in logistic regression have not yet been studied. In this paper we consider direct methods for logistic regression that extend the influence matrix algorithm of Peña and Yohai (1995). We define the influence matrix for logistic regression in terms of Cook's distance for logistic regression, and we test for multiple outliers by using the mean shift model. To assess the accuracy of the proposed multiple outlier detection algorithm, we simulate artificial data containing multiple outliers subject to masking and swamping.
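
The abstract does not give the paper's exact definitions, so the following is only a minimal sketch of one plausible reading. It assumes Pregibon's (1981) one-step approximation to case deletion and builds a matrix whose diagonal entries are Cook-type distances and whose off-diagonal entries measure how similarly two cases shift the fitted coefficients, by analogy with Peña and Yohai (1995). The function names `fit_logistic` and `influence_matrix`, and the scaling by the number of parameters, are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def fit_logistic(X, y, n_iter=50, tol=1e-10):
    """Fit a logistic regression by Newton-Raphson.

    X is an (n, k) design matrix that already contains the intercept column.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        # Newton step: solve (X' W X) step = X' (y - p)
        step = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (y - p))
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def influence_matrix(X, y, beta):
    """Return (M, h): an assumed influence matrix and the leverages.

    M[i, j] = d_i' (X' W X) d_j / k, where d_i is the one-step approximation
    to the change in beta when case i is deleted. The diagonal M[i, i] is then
    a Cook-type distance for case i (assumed scaling by k parameters).
    """
    k = X.shape[1]
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    w = p * (1.0 - p)
    info = X.T @ (X * w[:, None])          # Fisher information X' W X
    info_inv = np.linalg.inv(info)
    # Leverages h_i = w_i * x_i' (X' W X)^{-1} x_i
    h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
    # Row i holds the one-step deletion change d_i in the coefficients
    delta = (X @ info_inv) * ((y - p) / (1.0 - h))[:, None]
    M = delta @ info @ delta.T / k
    return M, h


# Small deterministic data set with overlapping classes (hypothetical data)
x = np.linspace(-2.0, 2.0, 40)
X_demo = np.column_stack([np.ones_like(x), x])
y_demo = (np.arange(40) % 3 == 0).astype(float)

beta_demo = fit_logistic(X_demo, y_demo)
M_demo, h_demo = influence_matrix(X_demo, y_demo, beta_demo)
```

Large off-diagonal entries of `M_demo` relative to the diagonal flag pairs of cases that pull the fit in the same direction, which is the kind of joint effect that single-case diagnostics can mask. How the paper selects candidate sets from its influence matrix and tests them under the mean shift model is not specified in the abstract and is not reproduced here.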
Keywords
Influence matrix; logistic regression; multiple outliers
References
1 BIANCO, A. M. AND YOHAI, V. J. (1996). 'Robust estimation in the logistic regression model', In Robust Statistics, Data Analysis, and Computer Intensive Methods; Lecture Notes in Statistics 109 (Rieder, H. ed.), 17-34, Springer-Verlag, New York
2 CHATTERJEE, S. AND HADI, A. S. (1986). 'Influential observations, high leverage points, and outliers in linear regression', Statistical Science, 1, 379-416
3 BECKMAN, R. J. AND COOK, R. D. (1983). 'Outlier ... s', Technometrics, 25, 119-163
4 COOK, R. D. (1979). 'Influential observations in linear regression', Journal of the American Statistical Association, 74, 169-174
5 WILLIAMS, D. A. (1987). 'Generalized linear model diagnostics using the deviance and single case deletions', Journal of the Royal Statistical Society, Ser. C, 36, 181-191
6 PEÑA, D. AND YOHAI, V. J. (1995). 'The detection of influential subsets in linear regression by using an influence matrix', Journal of the Royal Statistical Society, Ser. B, 57, 145-156
7 CANTONI, E. AND RONCHETTI, E. (2001). 'Robust inference for generalized linear models', Journal of the American Statistical Association, 96, 1022-1030
8 PREGIBON, D. (1981). 'Logistic regression diagnostics', The Annals of Statistics, 9, 705-724
9 CROUX, C. AND HAESBROECK, G. (2003). 'Implementing the Bianco and Yohai estimator for logistic regression', Computational Statistics & Data Analysis, 44, 273-295
10 RYAN, T. P. (1996). Modern Regression Methods, John Wiley & Sons, New York