• Title/Abstract/Keywords: High leverage points


Simultaneous Identification of Multiple Outliers and High Leverage Points in Linear Regression

  • Rahmatullah Imon, A.H.M.;Ali, M. Masoom
    • Journal of the Korean Data and Information Science Society / Vol. 16, No. 2 / pp. 429-444 / 2005
  • The identification of unusual observations such as outliers and high leverage points has drawn a great deal of attention for many years. Most of these identification techniques are based on case deletion, which focuses more on outliers than on high leverage points. However, residuals together with leverage values may cause masking and swamping, so that a good number of unusual observations remain undetected in the presence of multiple outliers and multiple high leverage points. In this paper we propose a new procedure to identify outliers and high leverage points simultaneously. We suggest an additive form of the residuals and the leverages that gives almost equal focus to outliers and leverages. We analyzed several well-referred data sets and discovered a few outliers and high leverage points that were undetected by the existing diagnostic techniques.

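The two ingredients this abstract combines, leverage values and residuals, can be sketched for ordinary least squares as follows. This is only an illustration of the standard quantities: the paper's additive measure and its cutoffs are not given in the abstract and are not reproduced here. The function names, the toy data, and the `2p/n` and `|r| > 2` rules of thumb are assumptions made for the example.

```python
import numpy as np

def hat_values(X):
    """Diagonal of the hat matrix H = X (X'X)^{-1} X', computed via a thin QR."""
    Q, _ = np.linalg.qr(X)            # X = QR, hence H = Q Q'
    return np.sum(Q**2, axis=1)

def studentized_residuals(X, y):
    """Internally studentized residuals r_i = e_i / (s * sqrt(1 - h_ii))."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    h = hat_values(X)
    s2 = e @ e / (n - p)              # residual variance estimate
    return e / np.sqrt(s2 * (1.0 - h)), h

# Hypothetical toy data: 20 points, the last one far out in x and off the line.
rng = np.random.default_rng(0)
x = np.append(rng.normal(size=19), 8.0)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=20)
y[-1] -= 10.0
X = np.column_stack([np.ones_like(x), x])

r, h = studentized_residuals(X, y)
n, p = X.shape
high_leverage = np.flatnonzero(h > 2 * p / n)    # rule of thumb, not the paper's cutoff
large_residual = np.flatnonzero(np.abs(r) > 2)   # rule of thumb, not the paper's cutoff
```

Because the far-out point pulls the fitted line toward itself, its residual can look unremarkable even though it dominates the fit, which is the masking problem the abstract refers to.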

On Sensitivity Analysis in Principal Component Regression

  • Kim, Soon-Kwi;Park, Sung H.
    • Journal of the Korean Statistical Society / Vol. 20, No. 2 / pp. 177-190 / 1991
  • In this paper, we discuss and review various measures which have been presented for studying outliers, high-leverage points, and influential observations when principal component regression is adopted. We suggest several diagnostic measures for use with principal component regression. A numerical example is given, in which some individual data points are flagged as outliers, high-leverage points, or influential points.


A Generalized M-Estimator in Linear Regression

  • Song, Moon-Sup;Park, Chang-Soon;Nam, Ho-Soo
    • Communications for Statistical Applications and Methods / Vol. 1, No. 1 / pp. 27-32 / 1994
  • We propose a robust regression estimator which has both a high breakdown point and a bounded influence function. The main contribution of this article is to present a weight function in the generalized M (GM)-estimator. Weighting schemes that control leverage points only, without considering residuals, cannot be efficient, since they inevitably downweight some good leverage points. In this paper we propose a weight function which depends on both design points and residuals, so as not to downweight good leverage points. Some motivating illustrations are also given.


다중 선형 모형에서 식별된 다중 이상점과 다중 지렛점의 재확인 방법에 대한 연구 (A Confirmation of Identified Multiple Outliers and Leverage Points in Linear Model)

  • 유종영;안기수
    • 응용통계연구 / Vol. 15, No. 2 / pp. 269-279 / 2002
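  • Identifying multiple outliers and multiple leverage points is difficult because it is affected by the masking effect and the swamping effect. Rousseeuw and van Zomeren (1990) identified multiple outliers and multiple leverage points using LMS (Least Median of Squares) regression and the MVE (Minimum Volume Ellipsoid) statistic. However, owing to the strong robustness of LMS and MVE, their method tends to flag points that are neither outliers nor leverage points. Fung (1993) proposed a confirmation method for the identified outliers and leverage points, but that method was found to be affected by the adjacent effect, which causes problems in identifying outliers and leverage points. This paper points out these problems and proposes a new method for confirming the identified outliers and leverage points.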

An Efficient Mallows-Type One-Step GM-Estimator in Linear Models

  • Song, Moon-Sup;Park, Changsoon;Nam, Ho-Soo
    • Journal of the Korean Statistical Society / Vol. 27, No. 3 / pp. 369-383 / 1998
  • This paper deals with a robust regression estimator. We propose an efficient one-step GM-estimator, which has a bounded influence function and a high breakdown point. The main idea of this paper is to use the Mallows-type weights which depend on both the predictor variables and the residuals from a high breakdown initial estimator. The proposed weighting scheme severely downweights the bad leverage points and slightly downweights the good leverage points. Under some regularity conditions, we compute the finite-sample breakdown point and prove the asymptotic normality. Some simulation results and a numerical example are also presented.


L1-회귀추정량의 붕괴점 향상을 위한 알고리즘 (Algorithm for the L1-Regression Estimation with High Breakdown Point)

  • 김부용
    • Communications for Statistical Applications and Methods / Vol. 17, No. 4 / pp. 541-550 / 2010
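  • It is well known that the $L_1$-regression estimator is very robust to vertical outliers but not at all robust to leverage points. This paper proposes an algorithm for $L_1$-regression estimation that is robust to leverage points as well as vertical outliers. Leverage points are identified on the basis of robust distances derived from the MCD or MVE estimator, and weights are determined to appropriately reduce the influence of the identified leverage points. The breakdown point of the $L_1$-regression estimator is then improved by applying a linear programming algorithm, based on a linear scaling transformation technique, to the data transformed by the weights. Simulation experiments on data of various types and sizes show that the proposed algorithm greatly improves the breakdown point of the $L_1$-regression estimator.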

A High Breakdown and Efficient GM-Estimator in Linear Models

  • Song, Moon-Sup;Park, Changsoon;Nam, Ho-Soo
    • Journal of the Korean Statistical Society / Vol. 25, No. 4 / pp. 471-487 / 1996
  • In this paper we propose an efficient scoring-type one-step GM-estimator, which has a bounded influence function and a high breakdown point. The main point of the estimator is the weighting scheme of the GM-estimator. The weight function we use depends on both leverage points and residuals, so we construct an estimator which does not downweight good leverage points. Under some regularity conditions, we compute the finite-sample breakdown point and prove asymptotic normality. Some simulation results are also presented.


Influence Diagnostic Measure for Spline Estimator

  • Lee, In-Suk;Cho, Gyo-Young;Jung, Won-Tae
    • 품질경영학회지 / Vol. 23, No. 4 / pp. 58-63 / 1995
  • To assess the quality of a fit to a set of data, it is always useful to conduct an a posteriori analysis involving the examination of residuals, detection of influential data values, etc. Smoothing splines are a type of nonparametric regression estimator to which this diagnostic problem applies. Leverage values, Cook's distance, and DFFITS are used for detecting influential data. Since high leverage points tend to have small residuals, new diagnostic measures that incorporate properties of both leverage and residuals are needed. In this paper, we propose a version of FVARATIO as a diagnostic measure in nonparametric regression. We also consider a rough bound by analogy with the linear regression case.

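For the linear regression case that this abstract uses as its point of comparison, Cook's distance and DFFITS can be computed from a single fit, without refitting the model once per deleted observation. The sketch below covers only that linear case. It does not implement the spline diagnostics or the FVARATIO measure from the paper, and the function name `influence_measures` is an assumption made for the example.

```python
import numpy as np

def influence_measures(X, y):
    """Leverage, Cook's distance and DFFITS for an OLS fit of y on X."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta                       # ordinary residuals
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q**2, axis=1)               # hat-matrix diagonal
    s2 = e @ e / (n - p)                   # full-sample variance estimate
    # Leave-one-out variance estimate s_(i)^2, obtained without refitting.
    s2_loo = ((n - p) * s2 - e**2 / (1.0 - h)) / (n - p - 1)
    r = e / np.sqrt(s2 * (1.0 - h))        # internally studentized residuals
    t = e / np.sqrt(s2_loo * (1.0 - h))    # externally studentized residuals
    cooks_d = r**2 * h / (p * (1.0 - h))
    dffits = t * np.sqrt(h / (1.0 - h))
    return h, cooks_d, dffits
```

Both measures multiply a residual term by a function of `h / (1 - h)`, so a point with high leverage but a small residual can still score low, which is the motivation the abstract gives for measures that combine the two differently.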

Bootstrapping Regression Residuals

  • Imon, A.H.M. Rahmatullah;Ali, M. Masoom
    • Journal of the Korean Data and Information Science Society / Vol. 16, No. 3 / pp. 665-682 / 2005
  • The sample reuse bootstrap technique has succeeded in attracting both applied and theoretical statisticians since its origination. In recent years a good deal of attention has been focused on the applications of bootstrap methods in regression analysis. These easier but more accurate computation methods depend heavily on high-speed computers and warrant rigorous mathematical justification of their validity. It is now evident that the presence of multiple unusual observations could do a great deal of damage to the inferential procedure. We suspect that bootstrap methods may not be free from this problem. We first present a few examples in support of our suspicion and then propose a new diagnostic-before-bootstrap method for regression. The usefulness of the proposed method is investigated through a few well-known examples and a Monte Carlo simulation under a variety of error and leverage structures.

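As background for this entry, a plain residual bootstrap for OLS can be sketched as below. This is the textbook scheme only, under the assumption of a fixed design with exchangeable errors. It is not the diagnostic-before-bootstrap method proposed in the paper, whose details the abstract does not give. The function name, defaults, and toy data are assumptions made for the example.

```python
import numpy as np

def residual_bootstrap(X, y, n_boot=500, seed=0):
    """Resample centered OLS residuals and refit; returns (beta_hat, bootstrap draws)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta_hat
    resid = y - fitted
    resid = resid - resid.mean()           # center (a no-op when X has an intercept)
    pinv = np.linalg.pinv(X)               # equals (X'X)^{-1} X' for full-rank X
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        e_star = rng.choice(resid, size=n, replace=True)
        draws[b] = pinv @ (fitted + e_star)  # refit on y* = fitted + e*
    return beta_hat, draws

# Hypothetical toy data for illustration.
rng = np.random.default_rng(42)
x = rng.normal(size=50)
X = np.column_stack([np.ones(50), x])
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=50)

beta_hat, draws = residual_bootstrap(X, y)
boot_se = draws.std(axis=0, ddof=1)        # bootstrap standard errors
```

Every resampled data set inherits the original residuals, so a few gross outliers can be drawn repeatedly and distort the bootstrap distribution, which is the concern the abstract raises.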

능형 회귀에서의 민감도 분석에 관한 연구 (A Study on Sensitivity Analysis in Ridge Regression)

  • Kim, Soon-Kwi
    • 품질경영학회지 / Vol. 19, No. 1 / pp. 1-15 / 1991
  • In this paper, we discuss and review various measures which have been presented for studying outliers, high-leverage points, and influential observations when ridge regression estimation is adopted. We derive the influence function for $\underline{\hat{\beta}}_R$, the ridge regression estimator, and discuss its various finite-sample approximations when ridge regression is postulated. We also study several diagnostic measures such as the Welsch-Kuh distance, Cook's distance, etc.

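For orientation, the ridge estimator and the corresponding leverage values can be sketched as follows, assuming the usual form $\hat{\beta}_R = (X'X + kI)^{-1}X'y$ with ridge hat matrix $X(X'X + kI)^{-1}X'$. The abstract does not state the paper's parameterization, its scaling of the predictors, or its influence-function approximations, so none of those are reproduced. The function name `ridge_fit` and the toy data are assumptions made for the example.

```python
import numpy as np

def ridge_fit(X, y, k):
    """Ridge coefficients and the diagonal of the ridge hat matrix for penalty k >= 0."""
    p = X.shape[1]
    A = X.T @ X + k * np.eye(p)
    beta = np.linalg.solve(A, X.T @ y)
    # diag(X A^{-1} X') without forming the full n x n matrix
    h = np.einsum("ij,ji->i", X, np.linalg.solve(A, X.T))
    return beta, h

# Hypothetical toy data; predictors are used as-is, with no intercept or standardization.
rng = np.random.default_rng(3)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.0, 0.0, -1.0]) + rng.normal(size=30)

beta0, h0 = ridge_fit(X, y, 0.0)   # k = 0 reduces to ordinary least squares
beta5, h5 = ridge_fit(X, y, 5.0)   # k > 0 shrinks both coefficients and leverages
```

With `k > 0` the leverages sum to less than the number of predictors, so leverage cutoffs carried over from ordinary least squares need adjusting before they are applied to a ridge fit.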