• Title/Summary/Keyword: regression outlier

First Order Difference-Based Error Variance Estimator in Nonparametric Regression with a Single Outlier

  • Park, Chun-Gun
    • Communications for Statistical Applications and Methods / v.19 no.3 / pp.333-344 / 2012
  • We consider some statistical properties of the first order difference-based error variance estimator in nonparametric regression models with a single outlier. So far, such difference-based estimators have rarely been discussed in the presence of outliers. We propose a first order difference-based estimator that uses the leave-one-out method to detect a single outlier, and we simulate outlier detection in a nonparametric regression model with a single outlier. The outlier detection works well in these simulations. The results are also promising in nonparametric regression models with many outliers when some difference-based estimators are used.
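
The abstract above does not give the estimator's formula, so the following is only a minimal sketch of the general idea. It assumes the classical first order difference estimator, sum of squared successive differences divided by 2(n - 1), with responses already ordered by their design points. The function names and the "smallest leave-one-out estimate" rule are illustrative assumptions, not the paper's procedure.

```python
import math


def first_order_diff_variance(y):
    """First order difference-based estimate of the error variance.

    Assumes y is ordered by the design points, so that successive
    differences mostly cancel the smooth mean function.
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    return sum((y[i + 1] - y[i]) ** 2 for i in range(n - 1)) / (2 * (n - 1))


def leave_one_out_variances(y):
    """Variance estimate recomputed with each observation removed in turn."""
    y = list(y)
    return [first_order_diff_variance(y[:i] + y[i + 1:]) for i in range(len(y))]


def flag_single_outlier(y):
    """Index whose removal gives the smallest variance estimate.

    Hypothetical decision rule: dropping a gross outlier removes the two
    large differences it creates, so the estimate falls the most there.
    """
    loo = leave_one_out_variances(y)
    return min(range(len(loo)), key=loo.__getitem__)


if __name__ == "__main__":
    # Smooth mean function with one shifted observation at index 20.
    x = [i / 49 for i in range(50)]
    y = [math.sin(2 * math.pi * xi) for xi in x]
    y[20] += 3.0
    print("flagged index:", flag_single_outlier(y))
```

A rule like this only ranks candidates. Deciding whether the flagged point really is an outlier would still need a threshold or a test, which the sketch leaves out.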

Asymptotic Properties of Outlier Tests in Nonlinear Regression

  • Kahng, Myung-Wook
    • Journal of the Korean Data and Information Science Society / v.17 no.1 / pp.205-211 / 2006
  • For a linear regression model, the necessary and sufficient condition for the asymptotic consistency of the outlier test statistic is known. An analogous condition for the nonlinear regression model is considered in this paper.

Simultaneous outlier detection and variable selection via difference-based regression model and stochastic search variable selection

  • Park, Jong Suk;Park, Chun Gun;Lee, Kyeong Eun
    • Communications for Statistical Applications and Methods / v.26 no.2 / pp.149-161 / 2019
  • In this article, we suggest the following approach to simultaneous variable selection and outlier detection. First, we determine possible outlier candidates using properties of an intercept estimator in a difference-based regression model, and the information on these outliers is reflected in the multiple regression model by adding mean shift parameters. Second, we select the best model from the model that includes the outlier candidates as predictors, using stochastic search variable selection. Finally, we evaluate our method using simulations and real data analysis, which yield promising results. In addition, we need to develop our method further to make robust estimates. We will also extend it to the nonparametric regression model for simultaneous outlier detection and variable selection.

Multiple Outlier Detection in Logistic Regression by Using Influence Matrix

  • Lee, Gwi-Hyun;Park, Sung-Hyun
    • Journal of the Korean Statistical Society / v.36 no.4 / pp.457-469 / 2007
  • Many procedures are available to identify a single outlier or an isolated influential point in linear regression and logistic regression. But the detection of influential points or multiple outliers is more difficult, owing to masking and swamping problems. Multiple outlier detection methods for logistic regression have not yet been studied from the standpoint of direct procedures. In this paper we consider direct methods for logistic regression by extending the Peña and Yohai (1995) influence matrix algorithm. We define the influence matrix in logistic regression by using Cook's distance in logistic regression, and test for multiple outliers by using the mean shift model. To show the accuracy of the proposed multiple outlier detection algorithm, we simulate artificial data including multiple outliers with masking and swamping.

Test for an Outlier in Multivariate Regression with Linear Constraints

  • Kim, Myung-Geun
    • Communications for Statistical Applications and Methods / v.9 no.2 / pp.473-478 / 2002
  • A test for a single outlier in multivariate regression with linear constraints on regression coefficients using a mean shift model is derived. It is shown that influential observations based on case deletions in testing linear hypotheses are determined by two types of outliers, namely mean shift outliers with or without linear constraints. An illustrative example is given.

A Score test for Detection of Outliers in Nonlinear Regression

  • Kahng, Myung-Wook
    • Journal of the Korean Statistical Society / v.22 no.2 / pp.201-208 / 1993
  • Given the specific mean shift outlier model, the score test for multiple outliers in nonlinear regression is discussed as an alternative to the likelihood ratio test. The geometric interpretation of the score statistic is also presented.

Testing Outliers in Nonlinear Regression

  • Kahng, Myung-Wook
    • Journal of the Korean Statistical Society / v.24 no.2 / pp.419-437 / 1995
  • Given the specific mean shift outlier model, several standard approaches to obtaining test statistics for outliers are discussed. Each of these is developed in detail for the nonlinear regression model, and each leads to an equivalent distribution. The geometric interpretations of the statistics and the accuracy of the linear approximation are also presented.
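
For orientation, the mean shift outlier model is easiest to see in the linear case, which is simpler than the nonlinear setting treated in this paper. There, testing whether observation i has a shifted mean is equivalent to the externally studentized (deleted) residual, which under normal errors is compared with a t distribution on n - p - 1 degrees of freedom. The sketch below assumes simple linear regression with one predictor (p = 2); the function name is illustrative and the code is not taken from the paper.

```python
import math


def studentized_deleted_residuals(x, y):
    """Externally studentized residuals for simple linear regression.

    Each value equals the t statistic for the mean shift parameter of
    that observation in the linear mean shift outlier model.
    """
    n, p = len(x), 2  # p counts the intercept and the slope
    if n <= p + 1:
        raise ValueError("need more than p + 1 observations")
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    rss = sum(e * e for e in resid)

    stats = []
    for xi, e in zip(x, resid):
        h = 1.0 / n + (xi - xbar) ** 2 / sxx  # leverage of this observation
        # Residual variance with the observation deleted, via the usual
        # updating identity (no refitting needed).
        s2_deleted = (rss - e * e / (1.0 - h)) / (n - p - 1)
        stats.append(e / math.sqrt(s2_deleted * (1.0 - h)))
    return stats


if __name__ == "__main__":
    x = list(range(10))
    y = [1 + 2 * xi + (0.1 if xi % 2 else -0.1) for xi in x]
    y[5] += 5.0  # shift the mean of one observation
    t = studentized_deleted_residuals(x, y)
    print("largest |t| at index", max(range(len(t)), key=lambda i: abs(t[i])))
```

When the largest statistic is picked after looking at all n observations, a multiplicity adjustment such as Bonferroni is commonly applied before declaring an outlier.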

Outlier Detection Using Support Vector Machines (서포트벡터 기계를 이용한 이상치 진단)

  • Seo, Han-Son;Yoon, Min
    • Communications for Statistical Applications and Methods / v.18 no.2 / pp.171-177 / 2011
  • In order to construct approximation functions for real data, it is necessary to remove the outliers from the measured raw data before constructing the model. Conventionally, visualization and maximum residual error have been used for outlier detection, but they often fail to detect outliers for nonlinear functions with multidimensional input. Although the standard support vector regression based outlier detection methods for nonlinear functions with multidimensional input have achieved good performance, they have practical issues with computational cost and parameter adjustment. In this paper we propose a practical approach to outlier detection using support vector regression that reduces computational time and defines a suitable outlier threshold. We apply this approach to real data examples to validate it.

Assessing the Accuracy of Outlier Tests in Nonlinear Regression

  • Kahng, Myung-Wook;Kim, Bu-Yang
    • Communications for Statistical Applications and Methods / v.16 no.1 / pp.163-168 / 2009
  • Given the specific mean shift outlier model, the standard approaches to obtaining test statistics for outliers are discussed. The accuracy of outlier tests is investigated using subset curvatures. These subset curvatures appear to be reliable indicators of the adequacy of the linearization-based test. Also, we consider obtaining graphical summaries of uncertainty in estimating parameters through confidence curves. The results are applied to the problem of assessing the accuracy of outlier tests.

Unified methods for variable selection and outlier detection in a linear regression

  • Seo, Han Son
    • Communications for Statistical Applications and Methods / v.26 no.6 / pp.575-582 / 2019
  • The problem of selecting variables in the presence of outliers is considered. Variable selection and outlier detection are not separable problems because each observation affects the fitted regression equation differently and has a different influence on each variable. We suggest a simultaneous method for variable selection and outlier detection in a linear regression model. The suggested procedure uses a sequential method to detect outliers and uses all possible subset regressions for model selection. A simplified version of the procedure is also proposed to reduce the computational burden. The procedures are compared to other variable selection methods using real data sets known to contain outliers. Examples show that the proposed procedures are effective and superior to robust algorithms in selecting the best model.