• Title/Summary/Keyword: Multivariate Outliers

Statistical Outliers in Florida Counties at the Presidential Election 2000 (Analysis of the Florida Voting Results in the 2000 U.S. Presidential Election)

  • 김현철
    • The Korean Journal of Applied Statistics / v.15 no.1 / pp.21-32 / 2002
  • We searched for outliers in the county-level vote data of the State of Florida from the 2000 presidential election, using multivariate regression analysis. We found several outliers, including Palm Beach County. This implies that, to argue that the "Butterfly Ballot" made Palm Beach an outlier, one should analyze the number of disqualified double-punched ballots as well as the vote counts.

Modified Multivariate $T^2$-Chart based on Robust Estimation

  • 성웅현;박동련
    • Journal of Korean Society for Quality Management / v.29 no.1 / pp.1-10 / 2001
  • We consider the problem of detecting special variations in a multivariate $T^2$-control chart when two or more multivariate outliers are present. Since a multivariate outlier may reflect slippage in mean, variance, or correlation, it can distort the sample mean vector and sample covariance matrix. A distorted sample mean vector and sample covariance matrix make it difficult to identify special variations clearly. An alternative for detecting outliers or special variations is to use robust estimators of the mean vector and covariance matrix that are less sensitive to extreme observations than the standard estimators $\bar{x}$ and $\textbf{S}$. We applied the popular minimum volume ellipsoid (MVE) and minimum covariance determinant (MCD) methods to estimate the mean vector and covariance matrix, and compared the results with the standard $T^2$-control chart using simulated multivariate data with outliers. We found that the modified $T^2$-control charts based on these robust methods were more effective in detecting special variations than the standard $T^2$-control chart.
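
The contrast between a standard and a robust $T^2$ statistic can be illustrated with a minimal sketch. It is not the authors' procedure: the function names (`t2_statistics`, `approx_mcd`), the simulated data, the subset size `h=75`, and the simple concentration-step approximation to MCD are all assumptions made for illustration, and no consistency correction or control limit is applied, so only the relative ordering of the statistics is meaningful.

```python
import numpy as np

def t2_statistics(X, center, cov):
    """Squared Mahalanobis distance (individual-observation T^2) of each row."""
    diff = X - center
    return np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)

def approx_mcd(X, h, n_starts=50, n_steps=20, seed=0):
    """Approximate MCD by concentration steps from random (p+1)-point starts."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    best_det, best_center, best_cov = np.inf, None, None
    for _ in range(n_starts):
        idx = rng.choice(n, size=p + 1, replace=False)
        for _ in range(n_steps):
            center = X[idx].mean(axis=0)
            cov = np.cov(X[idx], rowvar=False)
            # Concentration step: keep the h points closest to the current fit.
            new_idx = np.argsort(t2_statistics(X, center, cov))[:h]
            if np.array_equal(np.sort(new_idx), np.sort(idx)):
                break
            idx = new_idx
        center = X[idx].mean(axis=0)
        cov = np.cov(X[idx], rowvar=False)
        det = np.linalg.det(cov)
        # Keep the h-subset whose covariance determinant is smallest.
        if det < best_det:
            best_det, best_center, best_cov = det, center, cov
    return best_center, best_cov

# Simulated data (hypothetical): 90 in-control points plus 10 shifted outliers.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3))
X[:10] += 8.0

# Standard T^2 uses the sample mean and covariance of all points.
classical_t2 = t2_statistics(X, X.mean(axis=0), np.cov(X, rowvar=False))

# Robust T^2 uses the approximate MCD location and scatter instead.
center, cov = approx_mcd(X, h=75)
robust_t2 = t2_statistics(X, center, cov)

print("classical T^2 of outliers:", np.round(classical_t2[:10], 1))
print("robust T^2 of outliers:   ", np.round(robust_t2[:10], 1))
```

Because the ten shifted points inflate the sample covariance matrix, their standard $T^2$ values are pulled toward those of the in-control points, whereas the robust estimates are computed from a subset that excludes them.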

The System for Checking Multivariate Normality and Outliers

  • 강명래;최용석
    • Proceedings of the Korean Statistical Society Conference / 2000.11a / pp.253-255 / 2000
  • Multivariate analysis techniques require that the data satisfy the normality assumption. In this study, we introduce a system built in a GUI (graphical user interface) environment that supports normality tests for univariate and multivariate data, outlier removal, and variable transformation, so that users can carry out these tasks more conveniently.

Residuals Plots for Repeated Measures Data

  • Park, Taesung
    • Proceedings of the Korean Statistical Society Conference / 2000.11a / pp.187-191 / 2000
  • In the analysis of repeated measurements, multivariate regression models that account for the correlations among the observations from the same subject are widely used. Like the usual univariate regression models, these multivariate regression models also need model diagnostic procedures. In this paper, we propose a simple graphical method to detect outliers and to investigate the goodness of model fit in repeated measures data. The graphical method is based on quantile-quantile (Q-Q) plots of the $\chi^2$ distribution and the standard normal distribution. We also propose diagnostic measures to detect influential observations. The proposed method is illustrated using two examples.
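
A $\chi^2$ Q-Q plot of this kind can be sketched as follows. This is a generic illustration, not the paper's exact procedure: the residual matrix is simulated, the function names are hypothetical, and the $\chi^2$ quantiles use the Wilson-Hilferty approximation so that only NumPy and the standard library are needed.

```python
from statistics import NormalDist

import numpy as np

def chi2_quantile_wh(prob, df):
    """Wilson-Hilferty approximation to the chi-square quantile function."""
    z = NormalDist().inv_cdf(prob)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * np.sqrt(c)) ** 3

def chi2_qq_points(residuals):
    """Return subject order, sorted squared distances, and chi-square quantiles."""
    n, p = residuals.shape
    centered = residuals - residuals.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    # Squared Mahalanobis distance of each subject's residual vector.
    d2 = np.einsum("ij,jk,ik->i", centered, np.linalg.inv(cov), centered)
    order = np.argsort(d2)
    # Plotting positions (i - 0.5) / n for the ordered distances.
    probs = (np.arange(1, n + 1) - 0.5) / n
    theoretical = np.array([chi2_quantile_wh(q, p) for q in probs])
    return order, d2[order], theoretical

# Hypothetical residuals: 60 subjects, 4 repeated measures, one planted outlier.
rng = np.random.default_rng(1)
resid = rng.normal(size=(60, 4))
resid[0] += 6.0

order, observed, theoretical = chi2_qq_points(resid)
print("subject with the largest distance:", order[-1])
print("largest observed vs. theoretical quantile:", observed[-1], theoretical[-1])
```

Plotting `observed` against `theoretical` gives the Q-Q plot; points far above the 45-degree line at the upper end are candidate outliers.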

Test for an Outlier in Multivariate Regression with Linear Constraints

  • Kim, Myung-Geun
    • Communications for Statistical Applications and Methods / v.9 no.2 / pp.473-478 / 2002
  • A test for a single outlier in multivariate regression with linear constraints on regression coefficients is derived using a mean shift model. It is shown that influential observations based on case deletions in testing linear hypotheses are determined by two types of outliers: mean shift outliers with and without linear constraints. An illustrative example is given.

Diagnosis of Observations after Fit of Multivariate Skew t-Distribution: Identification of Outliers and Edge Observations from Asymmetric Data

  • Kim, Seung-Gu
    • The Korean Journal of Applied Statistics / v.25 no.6 / pp.1019-1026 / 2012
  • This paper presents a method for identifying "edge observations", located on a boundary area constructed by a truncation variable, as well as outliers, after fitting a multivariate skew $t$-distribution (MST) to asymmetric data. The detection of edge observations is important in data analysis because it provides information on a certain critical area in the observation space. The proposed method is applied to an Australian Institute of Sport (AIS) dataset that is well known for asymmetry in data space.

Identifying Multiple Leverage Points and Outliers in Multivariate Linear Models

  • Yoo, Jong-Young
    • Communications for Statistical Applications and Methods / v.7 no.3 / pp.667-676 / 2000
  • This paper focuses on the problem of detecting multiple leverage points and outliers in multivariate linear models. It is well known that the identification of these points is affected by masking and swamping effects. To identify them, Rousseeuw (1985) used the robust minimum volume ellipsoid (MVE) estimators, which have a breakdown point of approximately 50%. Rousseeuw and van Zomeren (1990) suggested a robust distance based on MVE, but its computation is extremely difficult when the number of observations n is large. In this study, we propose a new algorithm to reduce the computational difficulty of MVE. The proposed method is powerful in identifying multiple leverage points and outliers, and is also effective in reducing the computational difficulty of MVE.
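
For orientation, the robust distance based on MVE can be sketched with the classical random-resampling approximation. This is not the new algorithm proposed in the paper. The function names, the simulated bivariate data, and the number of subsets are assumptions, and the small-sample correction factor is omitted. For two variables the $\chi^2$ quantile has the closed form $-2\ln(1-q)$, which the example uses.

```python
import numpy as np

def sq_mahalanobis(X, center, cov):
    """Squared Mahalanobis distance of each row of X."""
    diff = X - center
    return np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)

def mve_resampling(X, chi2_median, n_subsets=500, seed=0):
    """Approximate MVE location and scatter from random (p+1)-point subsets."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    h = (n + p + 1) // 2  # number of points the ellipsoid must cover
    best_vol, best_center, best_cov = np.inf, None, None
    for _ in range(n_subsets):
        idx = rng.choice(n, size=p + 1, replace=False)
        center = X[idx].mean(axis=0)
        cov = np.cov(X[idx], rowvar=False)
        det = np.linalg.det(cov)
        if det <= 1e-12:
            continue  # skip degenerate subsets
        # Inflate the trial ellipsoid until it covers h points.
        m2 = np.sort(sq_mahalanobis(X, center, cov))[h - 1]
        vol = np.sqrt(det) * m2 ** (p / 2)  # proportional to the volume
        if vol < best_vol:
            best_vol, best_center = vol, center
            best_cov = cov * m2 / chi2_median  # rescale for normal data
    return best_center, best_cov

# Hypothetical data: 45 regular points and 5 clustered leverage points.
rng = np.random.default_rng(3)
X = rng.normal(size=(50, 2))
X[:5] += 10.0

chi2_median = -2.0 * np.log(1.0 - 0.5)        # exact for 2 degrees of freedom
cutoff = np.sqrt(-2.0 * np.log(1.0 - 0.975))  # sqrt of the 0.975 quantile

center, cov = mve_resampling(X, chi2_median)
rd = np.sqrt(sq_mahalanobis(X, center, cov))
print("flagged indices:", np.flatnonzero(rd > cutoff))
```

The cost of this approach grows with the number of subsets needed to find a clean one, which is the computational difficulty the abstract refers to.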

Multivariate Stratification under Consideration of Outliers

  • Park, Jin-Woo;Yun, Seok-Hoon
    • The Korean Journal of Applied Statistics / v.21 no.3 / pp.377-385 / 2008
  • Most sample surveys conducted by statistical agencies are multipurpose surveys that collect several distinct items through a single sample. In a multipurpose sample design, stratification tends to be very complex, since stratification variables that are both multivariate and heterogeneous must be considered collectively. In this paper we point out an outlier effect in multivariate stratification based on the K-means clustering method and propose to consider outliers prior to the stratification step. We also show an empirical stratification effect under consideration of outliers through a case study of the sample design for The Rural Living Indicators.
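
The idea of handling outliers before the stratification step can be sketched as below. Everything specific here is an assumption for illustration rather than the paper's method: the screening rule (per-variable median and MAD with a threshold of 5), the plain Lloyd-type K-means, the choice to place flagged units in a separate stratum, and the simulated data.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd-type K-means; returns labels and cluster centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def robust_z_outliers(X, threshold=5.0):
    """Flag rows whose robust z-score exceeds the threshold in any variable."""
    med = np.median(X, axis=0)
    mad = 1.4826 * np.median(np.abs(X - med), axis=0)
    return (np.abs(X - med) / mad > threshold).any(axis=1)

# Hypothetical frame: 200 units, 2 stratification variables, 3 extreme units.
rng = np.random.default_rng(7)
X = rng.normal(size=(200, 2))
X[:3] += 25.0
k = 3

# Naive stratification: K-means on all units, outliers included.
naive_labels, _ = kmeans(X, k)

# Outlier-aware stratification: screen first, then cluster the remaining units.
is_outlier = robust_z_outliers(X)
strata = np.full(len(X), k)  # stratum k is reserved for flagged units
labels, _ = kmeans(X[~is_outlier], k)
strata[~is_outlier] = labels

print("naive stratum sizes:        ", np.bincount(naive_labels, minlength=k))
print("outlier-aware stratum sizes:", np.bincount(strata, minlength=k + 1))
```

Comparing the two sets of stratum sizes shows how much the extreme units influence the K-means boundaries when they are left in.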

Outlier Detection Based on a Change of Likelihood

  • Kim, Myung-Geun
    • Journal of Applied Mathematics & Informatics / v.26 no.5_6 / pp.1133-1138 / 2008
  • A general method of detecting outliers based on a change of likelihood by using the influence function is suggested. It can be applied to all kinds of distributions that are specified by parameters. For the multivariate normal case, specific computations are made to get the corresponding conditional influence function. A numerical example is provided for illustration.
