• Title/Summary/Keyword: outlier detection

An Improved Iterative Procedure for Outlier Detection in Time Series (시계열 이상치 탐지를 위한 개선된 반복적 절차)

  • Bui, Anh Tuan;Jun, Chi-Hyuck
    • Journal of Korean Institute of Industrial Engineers / v.38 no.1 / pp.17-24 / 2012
  • We address some potential problems with the existing procedures for outlier detection in time series. We also propose modifications to the estimation of model parameters and outlier effects in order to reduce the number of tests and increase detection accuracy. Experiments with artificial data sets show that the proposed procedure significantly reduces the number of tests and improves both the accuracy of the estimated parameters and the detection power.

Outlier Detection Using Support Vector Machines (서포트벡터 기계를 이용한 이상치 진단)

  • Seo, Han-Son;Yoon, Min
    • Communications for Statistical Applications and Methods / v.18 no.2 / pp.171-177 / 2011
  • To construct approximation functions for real data, outliers must be removed from the measured raw data before the model is built. Conventionally, visualization and maximum residual error have been used for outlier detection, but they often fail to detect outliers for nonlinear functions with multidimensional input. Although standard outlier detection methods based on support vector regression have achieved good performance for nonlinear functions with multidimensional input, they have practical issues with computational cost and parameter adjustment. In this paper we propose a practical approach to outlier detection using support vector regression that reduces computation time and sets a suitable outlier threshold. We apply this approach to real data examples to validate it.

A Comparative Study of a Robust Estimate Method for Abnormal Traffic Detection (이상 트래픽 탐지를 위한 로버스트 추정 방법 비교 연구)

  • Jung, Jae-Yoon;Kim, Sahm
    • Communications for Statistical Applications and Methods / v.18 no.4 / pp.517-525 / 2011
  • This paper evaluates the performance of a robust estimator based on the GARCH model. We first introduce a robust estimation method and an outlier detection method for the GARCH model. Results on real internet traffic data show that the robust estimator outperforms the outlier detection method. In addition, the robust estimation method is less complex than the outlier detection method.

A Comparison of Methods for the Detection of Outliers in Multivariate Data

  • Hadi, Ali-S.;Joo, Hye-Seon;Son, Mun-S.
    • Communications for Statistical Applications and Methods / v.3 no.2 / pp.53-67 / 1996
  • Numerous classical as well as robust methods have been proposed in the literature for the detection of multiple outliers in multivariate data. The effectiveness and power of each of these methods have not been thoroughly investigated. In this paper we first reduce the vast number of outlier detection methods to a small number of viable ones. This reduction is based on previous work by other researchers and on some theoretical arguments. Then we design and implement a Monte Carlo experiment for comparing these methods. The main goal of our study is to determine which methods are most powerful in detecting multiple outliers and in dealing with the masking and swamping problems. The results of the Monte Carlo study indicate that two of the methods seem to perform better than the others for the detection of multiple outliers in multivariate data.

Outlier detection in time series data (시계열 자료에서의 특이치 발견)

  • Choi, Jeong In;Um, In Ok;Choa, Hyung Jun
    • The Korean Journal of Applied Statistics / v.29 no.5 / pp.907-920 / 2016
  • This study suggests an outlier detection algorithm for time series data that uses a quantile autoregressive model, compares its performance to existing methods, and applies it to actual stock manipulation cases. Studies on outlier detection have traditionally focused on general data, and studies on time series data are insufficient. They have also been limited to parametric models, which are inconvenient because the analysis is complicated and time-consuming. Thus, we suggest a new outlier detection algorithm for time series data and compare it to existing algorithms through various simulations. In particular, an outlier detection algorithm for time series data can be useful in finding stock manipulation: if a stock price that had followed a certain pattern departs from it and generates an outlier, this may be due to intentional intervention and manipulation. We examined how fast the model can detect stock manipulation by applying it to actual stock manipulation cases.

Outlier Detection of Real-Time Reservoir Water Level Data Using Threshold Model and Artificial Neural Network Model (임계치 모형과 인공신경망 모형을 이용한 실시간 저수지 수위자료의 이상치 탐지)

  • Kim, Maga;Choi, Jin-Yong;Bang, Jehong;Lee, Jaeju
    • Journal of The Korean Society of Agricultural Engineers / v.61 no.1 / pp.107-120 / 2019
  • Reservoir water level data indicate the current water storage of a reservoir and are used as primary data for the management and study of agricultural water. For reservoir storage management, the Korea Rural Community Corporation (KRC) installed water level stations at around 1,600 agricultural reservoirs and has been collecting water level data every 10 minutes. However, various kinds of outliers caused by noise and errors appear frequently because of environmental and physical causes. It is therefore necessary to detect outliers and improve the quality of the reservoir water level data so that they can be used for their intended purpose. This study was conducted to detect outliers and classify data as outlier or normal using two different models: a threshold model and an artificial neural network (ANN) model. The results were compared to evaluate the performance of the models. The threshold model identifies outliers by setting upper and lower bounds on the water level data and on the variation data, and by setting a bandwidth of water level data as the threshold beyond which a water level is regarded as erroneous. The ANN model was trained on a prepared training dataset labeled as normal data (T) and outliers (F), and was then used to identify outliers. The models were evaluated against reference data, namely daily reservoir water level data collected by KRC. The threshold model detected outliers better than the ANN model, but the ANN model was better at not misclassifying normal data as outliers.
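
The threshold idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the function name `threshold_flags` and the parameters `lower`, `upper`, and `max_step` are hypothetical, only the level bounds and the variation (step) bound are covered, and the bandwidth rule is omitted because the abstract does not specify it. Comparing each reading with the last accepted reading, rather than the immediately preceding one, is also an assumption.

```python
from typing import List, Optional, Sequence


def threshold_flags(
    levels: Sequence[float],
    lower: float,
    upper: float,
    max_step: float,
) -> List[bool]:
    """Return True for each reading flagged as an outlier.

    A reading is flagged when it lies outside [lower, upper], or when it
    differs from the last accepted reading by more than max_step.
    All parameter names and values are illustrative assumptions.
    """
    flags: List[bool] = []
    last_valid: Optional[float] = None  # last reading accepted as normal
    for level in levels:
        out_of_range = not (lower <= level <= upper)
        too_big_jump = last_valid is not None and abs(level - last_valid) > max_step
        is_outlier = out_of_range or too_big_jump
        flags.append(is_outlier)
        if not is_outlier:
            # Only normal readings update the reference level, so a single
            # spike does not cause the following normal reading to be flagged.
            last_valid = level
    return flags
```

In practice the bounds would come from each reservoir's physical limits and the expected change per 10-minute interval; the values used with this sketch are placeholders.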

Fused Navigation of Unmanned Surface Vehicle and Detection of GPS Abnormality (무인 수상정의 융합 항법 및 GPS 이상 검출)

  • Ko, Nak Yong;Jeong, Seokki
    • Journal of Institute of Control, Robotics and Systems / v.22 no.9 / pp.723-732 / 2016
  • This paper proposes an approach to fused navigation of an unmanned surface vehicle (USV) and to detection of outliers or interference in the global positioning system (GPS). The method fuses the available sensor measurements through an extended Kalman filter (EKF) to find the location and attitude of the USV. The method uses the error covariance of the EKF to detect GPS outliers or interference. When a GPS outlier or interference is detected, the method excludes the GPS data from the navigation process. The measurements fused for navigation are GPS, acceleration, angular rate, magnetic field, linear velocity, and range and bearing to acoustic beacons. The method is tested with simulated data and with measurement data produced through ground navigation. The results show that the method detects GPS outliers or interference as well as GPS recovery, which frees navigation from the problem of GPS abnormality.

Modeling of Strength of High Performance Concrete with Artificial Neural Network and Mahalanobis Distance Outlier Detection Method (신경망 이론과 Mahalanobis Distance 이상치 탐색방법을 이용한 고강도 콘크리트 강도 예측 모델 개발에 관한 연구)

  • Hong, Jung-Eui
    • Journal of Korean Society of Industrial and Systems Engineering / v.33 no.4 / pp.122-129 / 2010
  • High-performance concrete (HPC) is a new term used in the concrete construction industry. Several studies have shown that concrete strength development is determined not only by the water-to-cement ratio but also by the content of other concrete ingredients. HPC is a highly complex material, which makes modeling its behavior a very difficult task. This paper aims to demonstrate the possibility of adapting an artificial neural network (ANN) to predict the compressive strength of HPC. The Mahalanobis Distance (MD) outlier detection method is used to increase the prediction ability of the ANN. The detailed procedure for calculating the MD is described. The effects of outliers are compared before and after artificial neural network training. The MD outlier detection method successfully removed outliers and improved neural network training and prediction performance.
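
As a rough illustration of Mahalanobis-distance outlier screening of the kind this abstract mentions, the sketch below computes the squared distance of each row from the sample mean and flags rows above a cutoff. It is not the paper's procedure: the function name `mahalanobis_outliers`, the use of the plain sample mean and covariance, and the cutoff are all assumptions. A common choice of cutoff, also an assumption here, is a chi-square quantile with as many degrees of freedom as there are variables (about 7.378 for two variables at the 0.975 level).

```python
import numpy as np


def mahalanobis_outliers(X, threshold):
    """Return (squared Mahalanobis distances, boolean outlier flags).

    X is an (n_samples, n_features) array. Distances are measured from the
    sample mean using the sample covariance (both estimated from X itself,
    which is an assumption of this sketch). Rows whose squared distance
    exceeds `threshold` are flagged.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)      # sample covariance, features in columns
    inv_cov = np.linalg.pinv(cov)      # pseudo-inverse tolerates a singular covariance
    diff = X - mean
    # d2[i] = diff[i] @ inv_cov @ diff[i], computed for every row at once
    d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
    return d2, d2 > threshold
```

Because the mean and covariance are estimated from data that still contain the outliers, extreme points can mask one another; robust estimates of location and scatter are a common alternative.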

Fast Outlier Removal for Image Registration based on Modified K-means Clustering

  • Soh, Young-Sung;Qadir, Mudasar;Kim, In-Taek
    • Journal of the Institute of Convergence Signal Processing / v.16 no.1 / pp.9-14 / 2015
  • Outlier detection and removal is a crucial step in various image processing applications such as image registration. Random Sample Consensus (RANSAC) is known to be the best algorithm so far for outlier detection and removal. However, RANSAC requires considerable computation time. To drastically reduce the computation time while preserving comparable quality, an outlier detection and removal method based on modified K-means is proposed. The original K-means is first applied to matching point pairs, and then cluster merging and member exclusion steps are performed in the modification step. We applied the method to various images with highly repetitive patterns under several geometric distortions and obtained successful results. We compared the proposed method with RANSAC and showed that the proposed method runs 3 to 10 times faster than RANSAC.

Simultaneous outlier detection and variable selection via difference-based regression model and stochastic search variable selection

  • Park, Jong Suk;Park, Chun Gun;Lee, Kyeong Eun
    • Communications for Statistical Applications and Methods / v.26 no.2 / pp.149-161 / 2019
  • In this article, we suggest the following approach to simultaneous variable selection and outlier detection. First, we determine possible outlier candidates using properties of an intercept estimator in a difference-based regression model, and the information about the outliers is reflected in the multiple regression model by adding mean shift parameters. Second, we select the best model from the models that include the outlier candidates as predictors, using stochastic search variable selection. Finally, we evaluate our method using simulations and real data analysis, which yield promising results. In addition, we need to develop our method further to make robust estimates. We will also extend it to the nonparametric regression model for simultaneous outlier detection and variable selection.