• Title/Summary/Keyword: Influential observations

Detecting Influential Observations on the Smoothing Parameter in Nonparametric Regression

  • Kim, Choong-Rak;Jeon, Jong-Woo
    • Journal of the Korean Statistical Society / v.24 no.2 / pp.495-506 / 1995
  • We present formulas for detecting observations that are influential on the smoothing parameter in smoothing splines. We express these formulas as functions of basic building blocks such as residuals and leverages, and compare them with the local influence approach of Thomas (1991). An example based on a real data set is given.

Firework plot for evaluating the impact of influential observations in multi-response surface methodology (다반응 반응표면분석에서 특이값의 영향을 평가하기 위한 불꽃그림)

  • Kim, Sang Ik;Jang, Dae-Heung
    • The Korean Journal of Applied Statistics / v.31 no.1 / pp.97-108 / 2018
  • It has been routine practice in regression analysis to check the validity of the assumed model using regression diagnostic tools. Outliers and influential observations often distort the regression output in undesirable ways. Jang and Anderson-Cook (Quality and Reliability Engineering International, 30, 1409-1425, 2014) proposed a graphical method, called a firework plot, for exploratory visualization of the trace of the impact of possible outliers and influential observations on individual regression coefficients and on the overall residual sum of squares. This paper extends that graphical approach to a multi-response surface methodology problem.
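
The abstract does not say how the trace is computed. A minimal sketch, assuming the trace comes from refitting least squares while the weight of a single case is lowered from 1 (fully included) to 0 (deleted); the function name `weight_trace` and the toy data are hypothetical and not taken from the paper:

```python
import numpy as np

def weight_trace(X, y, i, n_steps=11):
    """Refit least squares while the weight of case i goes from 1 to 0.

    Returns the weights, the coefficient vector at each weight, and the
    weighted residual sum of squares at each weight.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    weights = np.linspace(1.0, 0.0, n_steps)
    coefs, rss = [], []
    for w in weights:
        # Weighted least squares via row scaling by the square root of the weight.
        scale = np.ones(len(y))
        scale[i] = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(X * scale[:, None], y * scale, rcond=None)
        resid = (y - X @ beta) * scale
        coefs.append(beta)
        rss.append(resid @ resid)
    return weights, np.array(coefs), np.array(rss)

# Toy data (made up): an exact straight line with one planted outlier at index 9.
x = np.arange(10.0)
y = 1.0 + 2.0 * x
y[9] += 15.0
X = np.column_stack([np.ones_like(x), x])
weights, coefs, rss = weight_trace(X, y, i=9)
```

Plotting each column of `coefs` and `rss` against `weights` gives one trace per case; a case whose traces move sharply as its weight shrinks is a candidate influential observation.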

Local Influence Assessment of the Misclassification Probability in Multiple Discriminant Analysis

  • Jung, Kang-Mo
    • Journal of the Korean Statistical Society / v.27 no.4 / pp.471-483 / 1998
  • The influence of observations on the misclassification probability in multiple discriminant analysis under the equal covariance assumption is investigated by the local influence method. Under an appropriate perturbation we can get information about influential observations and outliers by studying the curvatures and the associated direction vectors of the perturbation-formed surface of the misclassification probability. We show that the influence function method gives essentially the same information as the direction vector of the maximum slope. An illustrative example is given to show the effectiveness of the local influence method.

A Bayesian Diagnostic for Influential Observations in LDA

  • Lim, Jae-Hak;Lee, Chong-Hyung;Cho, Byung-Yup
    • Journal of Korean Society for Quality Management / v.28 no.1 / pp.119-131 / 2000
  • This paper suggests a new diagnostic measure for detecting influential observations in linear discriminant analysis (LDA). It is developed from a Bayesian point of view using a default Bayes factor obtained from the imaginary training sample methodology. The Bayes factor is taken as a criterion for testing homogeneity of covariance matrices in the LDA model. The effect of an observation on the criterion is fully explained by the diagnostic measure. We suggest a graphical method that can be used as a tool for interpreting the diagnostic measure and detecting influential observations. Performance of the measure is examined through an illustrative example.

Detecting Influential Observations in Multivariate Statistical Analysis of Incomplete Data by PCA (주성분분석에 의한 결손 자료의 영향값 검출에 대한 연구)

  • 김현정;문승호;신재경
    • The Korean Journal of Applied Statistics / v.13 no.2 / pp.383-392 / 2000
  • Since the late 1970s, methods of influence or sensitivity analysis for detecting influential observations have been studied not only in regression and related methods but also in various multivariate methods. When the results of a multivariate analysis depend heavily on a small number of observations, we should be very careful in drawing conclusions. Similar phenomena may also occur with incomplete data. In this research we study such influential observations in multivariate statistical analysis of incomplete data. The case of principal component analysis is studied with a numerical example.

Influence Analysis of the Likelihood Ratio Test in Multivariate Behrens-Fisher Problem

  • Jung, Kang-Mo;Kim, Myung-Geun
    • Communications for Statistical Applications and Methods / v.6 no.3 / pp.939-946 / 1999
  • We propose methods for detecting influential observations that have a large influence on the likelihood ratio test statistic for the multivariate Behrens-Fisher problem. For this purpose we derive the influence curve and the derivative influence of the likelihood ratio test statistic. An illustrative example is given to show the effectiveness of the proposed methods on the identification of influential observations.

Graphical Methods for Influence Diagnostics

  • Dae Heung Jang
    • Communications for Statistical Applications and Methods / v.4 no.2 / pp.359-365 / 1997
  • Unusual observations can greatly influence the results of least squares estimation. I propose graphical methods that can detect influential observations.

A cautionary note on the use of Cook's distance

  • Kim, Myung Geun
    • Communications for Statistical Applications and Methods / v.24 no.3 / pp.317-324 / 2017
  • An influence measure known as Cook's distance has been used for judging the influence of each observation on the least squares estimate of the parameter vector. The distance does not reflect the distributional property of the change in the least squares estimator of the regression coefficients due to case deletions: the distribution has a covariance matrix of rank one, and thus its support set is determined by a line in the multidimensional Euclidean space. As a result, Cook's distance may fail to correctly provide information about influential observations, and we study some reasons for the failure. Three illustrative examples are provided in which Cook's distance either fails to give the right information about influential observations or correctly identifies the most influential observation, and we examine the reasons in each case.
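
For reference, the standard Cook's distance that this abstract discusses can be written with residuals e_i and leverages h_ii as D_i = e_i^2 h_ii / (p s^2 (1 - h_ii)^2), where p is the number of regression coefficients and s^2 is the residual mean square. A minimal sketch of that textbook formula follows; the function name `cooks_distance` and the simulated data are illustrative only and are not taken from the paper:

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance for every case of an ordinary least squares fit.

    X is the n-by-p design matrix (including the intercept column if one is
    wanted) and is assumed to have full column rank.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    # Leverages are the diagonal of the hat matrix H = Q Q^T (reduced QR of X).
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q**2, axis=1)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta                 # ordinary residuals
    s2 = e @ e / (n - p)             # residual mean square
    return e**2 * h / (p * s2 * (1.0 - h) ** 2)

# Simulated data (made up) for a simple linear regression.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(20), rng.normal(size=20)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=20)
d = cooks_distance(X, y)
```

The closed form agrees with the case-deletion definition, (b - b_(i))' X'X (b - b_(i)) / (p s^2), so no model needs to be refitted to obtain it.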

Firework plot for evaluating the impact of outliers in statistical inference (통계적 추론에서 특이점의 영향을 평가하기 위한 탐색적 자료분석 그림도구로서의 불꽃그림)

  • Moon, Sungho
    • The Korean Journal of Applied Statistics / v.31 no.1 / pp.155-165 / 2018
  • Outliers and influential observations often distort many numerical measures for data analysis. Jang and Anderson-Cook (Quality and Reliability Engineering International, 30, 1409-1425, 2014) proposed a graphical firework plot method for exploratory analysis that visualizes the trace of the impact of possible outlying and influential observations on univariate/bivariate data analysis and regression. They developed a 3-D plot as well as pairwise plots for the appropriate measures of interest. We use firework plots as a graphical exploratory data analysis tool to detect outliers and evaluate the impact of outliers in statistical inference.

Influence Analysis of the Common Mean Problem

  • Kim, Myung Geun
    • Communications for Statistical Applications and Methods / v.20 no.3 / pp.217-223 / 2013
  • Two influence diagnostic methods for the common mean model are proposed. First, the influence of observations under minor perturbations of the common mean model is investigated by adapting the local influence method, which is based on the likelihood displacement. It is well known that maximum likelihood estimates are in general sensitive to influential observations. Case deletion is a candidate for detecting influential observations. However, the maximum likelihood estimators are computed iteratively, so case deletion involves an enormous amount of computation. An approximation by Newton's method to the maximum likelihood estimator obtained after a single observation is deleted can remove much of the computational burden, and it is treated in this work. A numerical example is given for illustration, and it shows that the proposed diagnostic methods can be useful tools.
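
The abstract does not give the form of the Newton approximation. A generic one-step version, assuming independent observations so that the log-likelihood is a sum of per-case terms (the notation here is an assumption and is not taken from the paper), can be sketched as:

```latex
% Full and case-deleted log-likelihoods (assumed additive over cases)
\ell(\theta) = \sum_{j=1}^{n} \ell_j(\theta), \qquad
\ell_{(i)}(\theta) = \ell(\theta) - \ell_i(\theta).

% One Newton step for maximizing \ell_{(i)}, started at the full-data MLE \hat\theta
\hat\theta_{(i)}^{1}
  = \hat\theta - \bigl[\nabla^2 \ell_{(i)}(\hat\theta)\bigr]^{-1} \nabla \ell_{(i)}(\hat\theta).

% Since \nabla \ell(\hat\theta) = 0, we have \nabla \ell_{(i)}(\hat\theta) = -\nabla \ell_i(\hat\theta), so
\hat\theta_{(i)}^{1}
  = \hat\theta + \bigl[\nabla^2 \ell_{(i)}(\hat\theta)\bigr]^{-1} \nabla \ell_i(\hat\theta).
```

Each case then costs one gradient and one Hessian evaluation at the full-data estimate instead of a complete iterative refit.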