• Title/Summary/Keyword: High leverage points

Simultaneous Identification of Multiple Outliers and High Leverage Points in Linear Regression

  • Rahmatullah Imon, A.H.M.;Ali, M. Masoom
    • Journal of the Korean Data and Information Science Society / v.16 no.2 / pp.429-444 / 2005
  • The identification of unusual observations such as outliers and high leverage points has drawn a great deal of attention for many years. Most of these identification techniques are based on case deletion, which focuses more on outliers than on high leverage points. But residuals together with leverage values may cause masking and swamping, so that a good number of unusual observations remain undetected in the presence of multiple outliers and multiple high leverage points. In this paper we propose a new procedure to identify outliers and high leverage points simultaneously. We suggest an additive form of the residuals and the leverages that gives almost equal focus to outliers and leverages. We analyze several well-known data sets and discover a few outliers and high leverage points that were undetected by the existing diagnostic techniques.
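
    For orientation, the classical single-case diagnostics that this abstract builds on (leverage values, studentized residuals, Cook's distance, DFFITS) can be sketched as follows. This is a minimal illustration under textbook ordinary least squares assumptions, not the additive measure the authors propose, whose exact form the abstract does not give. The function name `ols_diagnostics`, the toy data, and the 2p/n leverage cutoff are assumptions chosen for illustration.

```python
import numpy as np

def ols_diagnostics(x, y):
    """Classical case diagnostics for an OLS fit with an intercept (illustrative)."""
    X = np.column_stack([np.ones(len(y)), x])      # design matrix with intercept
    n, p = X.shape
    Q, _ = np.linalg.qr(X)                         # thin QR of the design matrix
    h = np.sum(Q**2, axis=1)                       # leverages: diagonal of the hat matrix
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # least squares coefficients
    e = y - X @ beta                               # raw residuals
    s2 = e @ e / (n - p)                           # residual variance estimate
    r = e / np.sqrt(s2 * (1 - h))                  # internally studentized residuals
    # Variance estimate with case i deleted, then deletion (external) residuals
    s2_del = ((n - p) * s2 - e**2 / (1 - h)) / (n - p - 1)
    t = e / np.sqrt(s2_del * (1 - h))
    cooks = r**2 * h / (p * (1 - h))               # Cook's distance
    dffits = t * np.sqrt(h / (1 - h))              # DFFITS
    return {"leverage": h, "student": r, "deletion": t,
            "cooks": cooks, "dffits": dffits, "n": n, "p": p}

# Toy data (hypothetical): ten points near y = 2 + 0.5 x, plus one point
# far out in x that does not follow the line.
x = np.append(np.arange(10.0), 30.0)
y = 2.0 + 0.5 * x + 0.1 * (-1.0) ** np.arange(11)
y[-1] = 2.0

diag = ols_diagnostics(x, y)
cutoff = 2 * diag["p"] / diag["n"]                 # common rule-of-thumb leverage cutoff
flagged = np.where(diag["leverage"] > cutoff)[0]   # indices of high leverage cases
print("high leverage cases:", flagged)
print("largest Cook's distance at case:", int(np.argmax(diag["cooks"])))
```

    These measures delete one case at a time, which is why groups of unusual observations can mask one another, the problem the abstract addresses.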

On Sensitivity Analysis in Principal Component Regression

  • Kim, Soon-Kwi;Park, Sung H.
    • Journal of the Korean Statistical Society / v.20 no.2 / pp.177-190 / 1991
  • In this paper, we discuss and review various measures that have been presented for studying outliers, high-leverage points, and influential observations when principal component regression is adopted. We suggest several diagnostic measures for use with principal component regression. A numerical example is given, in which some individual data points are flagged as outliers, high-leverage points, or influential points.

A Generalized M-Estimator in Linear Regression

  • Song, Moon-Sup;Park, Chang-Soon;Nam, Ho-Soo
    • Communications for Statistical Applications and Methods / v.1 no.1 / pp.27-32 / 1994
  • We propose a robust regression estimator which has both a high breakdown point and a bounded influence function. The main contribution of this article is to present a weight function in the generalized M (GM)-estimator. Weighting schemes that control leverage points only, without considering residuals, cannot be efficient, since these schemes inevitably downweight some good leverage points. In this paper we propose a weight function which depends on both design points and residuals, so as not to downweight good leverage points. Some motivating illustrations are also given.

A Confirmation of Identified Multiple Outliers and Leverage Points in Linear Model (다중 선형 모형에서 식별된 다중 이상점과 다중 지렛점의 재확인 방법에 대한 연구)

  • 유종영;안기수
    • The Korean Journal of Applied Statistics / v.15 no.2 / pp.269-279 / 2002
  • We consider the problem of confirming multiple outliers and leverage points. Identification of multiple outliers and leverage points is difficult because of the masking and swamping effects. Rousseeuw and van Zomeren (1990) identified multiple outliers and leverage points by using the Least Median of Squares and the Minimum Volume Ellipsoid, which are high-breakdown robust estimators. But their methods tend to declare too many observations as extreme. Atkinson (1987) suggested a method for confirming outliers, and Fung (1993) pointed out the limitations of Atkinson's method and proposed another method using the add-back model. But we found that Fung's method is affected by the adjacent effect. In this thesis, we propose a procedure for confirming outliers and leverage points and compare it with Fung's method on three examples.

An Efficient Mallows-Type One-Step GM-Estimator in Linear Models

  • Song, Moon-Sup;Park, Changsoon;Nam, Ho-Soo
    • Journal of the Korean Statistical Society / v.27 no.3 / pp.369-383 / 1998
  • This paper deals with a robust regression estimator. We propose an efficient one-step GM-estimator, which has a bounded influence function and a high breakdown point. The main idea of this paper is to use the Mallows-type weights which depend on both the predictor variables and the residuals from a high breakdown initial estimator. The proposed weighting scheme severely downweights the bad leverage points and slightly downweights the good leverage points. Under some regularity conditions, we compute the finite-sample breakdown point and prove the asymptotic normality. Some simulation results and a numerical example are also presented.

Algorithm for the L1-Regression Estimation with High Breakdown Point (L1-회귀추정량의 붕괴점 향상을 위한 알고리즘)

  • Kim, Bu-Yong
    • Communications for Statistical Applications and Methods / v.17 no.4 / pp.541-550 / 2010
  • The $L_1$-regression estimator is susceptible to leverage points, even though it is highly robust to vertical outliers. This article is concerned with improving the robustness of the $L_1$-estimator. To improve its robustness in terms of the breakdown point, we attempt to dampen the influence of the leverage points by reducing the weights corresponding to them. In addition, the algorithm employs the linear scaling transformation technique to solve the linear programming problem of $L_1$-estimation, for higher computational efficiency with large data sets. Monte Carlo simulation results indicate that the proposed algorithm yields $L_1$-estimates which are robust to leverage points as well as vertical outliers.

A High Breakdown and Efficient GM-Estimator in Linear Models

  • Song, Moon-Sup;Park, Changsoon;Nam, Ho-Soo
    • Journal of the Korean Statistical Society / v.25 no.4 / pp.471-487 / 1996
  • In this paper we propose an efficient scoring-type one-step GM-estimator, which has a bounded influence function and a high breakdown point. The main point of the estimator is its weighting scheme. The weight function we use depends on both leverage points and residuals, so we construct an estimator which does not downweight good leverage points. Under some regularity conditions, we compute the finite-sample breakdown point and prove asymptotic normality. Some simulation results are also presented.

Influence Diagnostic Measure for Spline Estimator

  • Lee, In-Suk;Cho, Gyo-Young;Jung, Won-Tae
    • Journal of Korean Society for Quality Management / v.23 no.4 / pp.58-63 / 1995
  • To assess the quality of a fit to a set of data, it is always useful to conduct an a posteriori analysis involving the examination of residuals, detection of influential data values, etc. Smoothing splines are a type of nonparametric regression estimator for which this diagnostic problem arises. Leverage values, Cook's distance, and DFFITS are used for detecting influential data. Since high leverage points tend to have small residuals, new diagnostic measures that combine properties of leverage and residuals are needed. In this paper, we propose a version of FVARATIO as a diagnostic measure in nonparametric regression. We also consider a rough bound by analogy with the linear regression case.

Bootstrapping Regression Residuals

  • Imon, A.H.M. Rahmatullah;Ali, M. Masoom
    • Journal of the Korean Data and Information Science Society / v.16 no.3 / pp.665-682 / 2005
  • The sample-reuse bootstrap technique has attracted both applied and theoretical statisticians since its origination. In recent years a good deal of attention has been focused on the applications of bootstrap methods in regression analysis. The bootstrap is easier to apply, but the more accurate computational methods depend heavily on high-speed computers and require rigorous mathematical justification of their validity. It is now evident that the presence of multiple unusual observations can do a great deal of damage to the inferential procedure. We suspect that bootstrap methods may not be free from this problem. We first present a few examples in support of our suspicion and then propose a new diagnostic-before-bootstrap method for regression. The usefulness of the proposed method is investigated through a few well-known examples and a Monte Carlo simulation under a variety of error and leverage structures.

A Study on Sensitivity Analysis in Ridge Regression (능형 회귀에서의 민감도 분석에 관한 연구)

  • Kim, Soon-Kwi
    • Journal of Korean Society for Quality Management / v.19 no.1 / pp.1-15 / 1991
  • In this paper, we discuss and review various measures that have been presented for studying outliers, high-leverage points, and influential observations when ridge regression estimation is adopted. We derive the influence function for $\hat{\underline{\beta}}_R$, the ridge regression estimator, and discuss its various finite-sample approximations when ridge regression is postulated. We also study several diagnostic measures such as the Welsch-Kuh distance, Cook's distance, etc.
