• Title/Summary/Keyword: outliers


Unmasking Multiple Outliers in Multivariate Data

  • Yoo Jong-Young
    • Communications for Statistical Applications and Methods
    • /
    • v.13 no.1
    • /
    • pp.29-38
    • /
    • 2006
  • We propose a procedure for detecting multiple outliers in multivariate data. Rousseeuw and van Zomeren (1990) suggested the robust distance $RD_i$, computed using the resampling algorithm. However, $RD_i$ rests on the assumption that X is in general position (X is said to be in general position when every subsample of size p+1 has rank p). From a practical point of view, this is clearly unrealistic. In this paper, we propose a computing method for approximating the MVE that is not subject to these problems. The procedure is easy to compute and works well even if a subsample is a singular or nearly singular matrix.

A Study on High Breakdown Discriminant Analysis : A Monte Carlo Simulation

  • Moon Sup;Young Joo;Youngjo
    • Communications for Statistical Applications and Methods
    • /
    • v.7 no.1
    • /
    • pp.225-232
    • /
    • 2000
  • The linear and quadratic discriminant functions based on normal theory are widely used to classify an observation into one of several predefined groups, but these discriminant functions are sensitive to outliers. A high breakdown procedure for estimating the location and scatter of multivariate data is the minimum volume ellipsoid (MVE) estimator. To obtain high breakdown classifiers, outliers in multivariate data are detected using the robust Mahalanobis distance based on MVE estimators, and the weighted estimators are inserted into the classification functions. A small-sample Monte Carlo study shows that the high breakdown robust procedures perform better than the classical classifiers.


Calibration by Median Regression

  • Jinsan Yang;Lee, Seung-Ho
    • Journal of the Korean Statistical Society
    • /
    • v.28 no.2
    • /
    • pp.265-277
    • /
    • 1999
  • Classical and inverse estimation are two well-known methods in statistical calibration problems. When there are outliers, both methods have large MSEs and cannot estimate the input value correctly. We suggest median calibration estimation based on the LD-statistics. To investigate its robustness, the influence function of the median calibration estimator is calculated and compared with those of other methods. When there are outliers in the response variables, the influence function is found to be bounded. In simulation studies, the MSEs of the calibration methods are compared. The estimated inputs as well as the performance of the influence functions are calculated.


On the Robustness of $L_1$-estimator in Linear Regression Models

  • Bu-Yong Kim
    • Communications for Statistical Applications and Methods
    • /
    • v.2 no.2
    • /
    • pp.277-287
    • /
    • 1995
  • It is well known that the $L_1$-estimator is robust with respect to vertical outliers in regression data, even though it is susceptible to bad leverage points. This article is concerned with the robustness of the $L_1$-estimator. To investigate its robustness against vertical outliers, we find intervals for the value of the response variable within which the $L_1$-estimates do not change. A procedure for constructing those intervals in multiple linear regression is illustrated in the sensitivity analysis context. The vertical breakdown point of the $L_1$-estimator is then defined on the basis of properties related to those intervals.

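The insensitivity of the $L_1$-estimator to vertical outliers described in the abstract above can be illustrated with a minimal sketch. It is not the article's interval-construction procedure: the helper names `l1_line` and `ols_line` and the toy data are assumptions made for illustration, and the brute-force search relies on the known property that some least-absolute-deviations line in simple regression passes through at least two data points.

```python
from itertools import combinations


def l1_line(xs, ys):
    """Brute-force L1 (least absolute deviations) line for simple regression.

    Tries the line through every pair of points and keeps the one with the
    smallest sum of absolute residuals. Returns (slope, intercept).
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical line is not a regression candidate
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        loss = sum(abs(y - (intercept + slope * x)) for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, slope, intercept)
    return best[1], best[2]


def ols_line(xs, ys):
    """Ordinary least squares line, for comparison. Returns (slope, intercept)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical toy data: points on y = 2x + 1 with one vertical outlier at x = 4.
xs = [0, 1, 2, 3, 4, 5, 6]
ys_mild = [1, 3, 5, 7, 50, 11, 13]
ys_wild = [1, 3, 5, 7, 100, 11, 13]

print("L1, outlier at 50: ", l1_line(xs, ys_mild))
print("L1, outlier at 100:", l1_line(xs, ys_wild))
print("OLS, outlier at 100:", ols_line(xs, ys_wild))
```

In this toy data set the $L_1$ fit is the same for both outlier values, which is the kind of interval of no change the abstract refers to, while the least squares slope is pulled toward the outlier.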

Influence in Testing the Equality of Two Covariance Matrices (두개의 공분산 행렬의 동질성 검정에서의 영향치 분석)

  • Myung Geun Kim
    • The Korean Journal of Applied Statistics
    • /
    • v.7 no.2
    • /
    • pp.213-224
    • /
    • 1994
  • A diagnostic method useful for detecting outliers in testing the equality of two covariance matrices is developed using the influence curve approach. This method is easily generalized to more than two covariance matrices. A sample version of the influence measure for detecting outliers is considered based on the empirical distribution functions. The sample version includes as its component terms the well-known test statistic for detecting one outlier at a time introduced by Wilks and its generalization to the two-group case.


Detection of local structural changes in time series (시계열에서 국소구조변화의 탐지에 관한 연구)

  • Jae June Lee
    • The Korean Journal of Applied Statistics
    • /
    • v.7 no.2
    • /
    • pp.299-311
    • /
    • 1994
  • In time series data, atypical observations are not rare. Several approaches have been proposed to detect a single outlier, but the effectiveness of those procedures is in doubt when patchy outliers are present. In this paper, the atypicality in patchy outliers is interpreted as a local structural change, and a model is introduced to account for its effect on the series. Based on this model, a statistic and a procedure are proposed for identifying those local structural changes. The performance of the proposed procedure is evaluated through a simulation study and the analysis of real data sets.


Outlier Detection from LiDAR Data based on the Relative Density (상대적 밀도를 이용한 LiDAR 데이터의 Outlier 검출)

  • 문지영;이임평;김성준;김경옥
    • Proceedings of the Korean Society of Surveying, Geodesy, Photogrammetry, and Cartography Conference
    • /
    • 2004.11a
    • /
    • pp.507-512
    • /
    • 2004
  • LiDAR data often include outliers, that is, points significantly separated from other points that therefore do not appear to have been measured from physical surfaces. Outliers should be removed before the data are processed further for applications. Many outlier detection methods have been developed as part of data mining processes for data other than LiDAR data, but their straightforward application to LiDAR data did not provide satisfactory results. In this study, we therefore modified one such method by considering the properties of LiDAR data and developed a method based on the relative point density. The proposed method has been applied to simulated and real data. The results confirm its promising performance with respect to processing time and detection accuracy.

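The abstract above does not spell out how the relative point density is computed, so the following is only a generic sketch of the idea, not the authors' method. It assumes a k-nearest-neighbour density (the inverse of the mean distance to the k nearest points) and flags a point when its density is small relative to the mean density of its neighbours. The function names, `k`, `threshold`, and the synthetic points are all hypothetical, and the points are assumed to be distinct.

```python
import math


def k_nearest(points, i, k):
    """Indices of the k points closest to points[i] (excluding i itself)."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]


def relative_density_outliers(points, k=4, threshold=0.5):
    """Flag points whose local density is low relative to their neighbours.

    density[i]  = 1 / (mean distance from point i to its k nearest neighbours)
    relative[i] = density[i] / (mean density of those k neighbours)
    A point is flagged when relative[i] < threshold.
    """
    neighbours = [k_nearest(points, i, k) for i in range(len(points))]
    density = []
    for i, nb in enumerate(neighbours):
        mean_dist = sum(math.dist(points[i], points[j]) for j in nb) / k
        density.append(1.0 / mean_dist)  # assumes no duplicate points
    flagged = []
    for i, nb in enumerate(neighbours):
        reference = sum(density[j] for j in nb) / k
        if density[i] / reference < threshold:
            flagged.append(i)
    return flagged


# Hypothetical data: a flat 5 x 5 patch of "ground" points plus one point
# floating 10 units above it, standing in for a spurious LiDAR return.
ground = [(float(x), float(y), 0.0) for x in range(5) for y in range(5)]
cloud = ground + [(2.0, 2.0, 10.0)]

print(relative_density_outliers(cloud))
```

The all-pairs neighbour search here is quadratic in the number of points. Real LiDAR point clouds would need a spatial index such as a grid or k-d tree, which this sketch leaves out for brevity.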

A procedure for simultaneous variable selection, variable transformation and outlier identification in linear regression (선형회귀에서 변수선택, 변수변환과 이상치 탐지의 동시적 수행을 위한 절차)

  • Seo, Han Son;Yoon, Min
    • The Korean Journal of Applied Statistics
    • /
    • v.33 no.1
    • /
    • pp.1-10
    • /
    • 2020
  • We propose a unified approach to variable selection, variable transformation, and outlier identification in the linear model. The procedure includes a sequential method for outlier detection and a least trimmed squares estimator for variable transformation. It uses all possible subsets regressions for model selection. Real data analyses and simulation results are provided to show the efficiency of the method in terms of correct variable selection and the fit of the estimated model.

Left Ventricular Image Processing and Displays of Cardiac Function

  • Kuwahara, Michiyoshi
    • Journal of Biomedical Engineering Research
    • /
    • v.6 no.1
    • /
    • pp.1-4
    • /
    • 1985
  • Background EEG signals can be represented as the sum of a conventional AR process and an innovation process. It is known that conventional estimation techniques, such as least squares estimates (LSE) or Gaussian maximum likelihood estimates (MLE-G), are optimal when the innovation process follows the Gaussian or presumed distribution. When the data are contaminated by outliers, however, these assumptions are not met and the power spectrum estimated by conventional estimation techniques may be fatally biased. EEG signals may be affected by artifacts, which are outliers in the statistical sense. A robust filtering estimation technique is therefore used against those artifacts, and it performs well for contaminated EEG signals.


Robust CUSUM test for time series of counts and its application to analyzing the polio incidence data

  • Kang, Jiwon
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.6
    • /
    • pp.1565-1572
    • /
    • 2015
  • In this paper, we analyze the polio incidence data based on Poisson autoregressive models, focusing particularly on change-point detection. Since the data include some strongly deviating observations, we employ the robust cumulative sum (CUSUM) test proposed by Kang and Song (2015) to test for parameter change. Contrary to the result of Kang and Lee (2014), our data analysis indicates that there is no significant change under the CUSUM test with strong robustness, and the same result is obtained after removing outliers from the polio data. We additionally compare the forecasting performance. All the results demonstrate that the robust CUSUM test performs adequately in the presence of apparent outliers.
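
For orientation, the basic CUSUM idea behind the abstract above can be sketched as follows. This is the classical CUSUM statistic for a change in the mean of independent observations, not the robust test of Kang and Song (2015) and not a Poisson autoregressive model. The function name and the toy series are assumptions, and 1.358 is the usual asymptotic 5% critical value for the supremum of the absolute value of a Brownian bridge.

```python
import math


def cusum_mean_change(series):
    """Classical CUSUM statistic for a single change in mean.

    Returns (statistic, k), where
        statistic = max_k |S_k - (k / n) * S_n| / (sigma_hat * sqrt(n)),
    S_k is the k-th partial sum, sigma_hat is the full-sample standard
    deviation, and k is the index at which the maximum is attained.
    """
    n = len(series)
    mean = sum(series) / n
    sigma_hat = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    if sigma_hat == 0:
        raise ValueError("constant series: the statistic is undefined")
    best_stat, best_k = 0.0, 0
    centred_sum = 0.0
    for k, value in enumerate(series, start=1):
        centred_sum += value - mean  # equals S_k - (k / n) * S_n
        stat = abs(centred_sum) / (sigma_hat * math.sqrt(n))
        if stat > best_stat:
            best_stat, best_k = stat, k
    return best_stat, best_k


CRITICAL_5_PERCENT = 1.358  # asymptotic value, sup |Brownian bridge|

# Hypothetical toy series: one with a level shift after observation 20,
# one without any shift.
shifted = [0, 1] * 10 + [4, 5] * 10
stable = [0, 1] * 20

for name, data in (("shifted", shifted), ("stable", stable)):
    stat, k = cusum_mean_change(data)
    print(name, round(stat, 3), k, stat > CRITICAL_5_PERCENT)
```

Because both the partial sums and the standard deviation estimate use the raw observations, a few extreme values can distort this classical statistic, which is the motivation for robust variants such as the one used in the paper.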