• Title/Summary/Keyword: Outlier model

Improvement of PM Forecasting Performance by Outlier Data Removing (Outlier 데이터 제거를 통한 미세먼지 예보성능의 향상)

  • Jeon, Young Tae; Yu, Suk Hyun; Kwon, Hee Yong
    • Journal of Korea Multimedia Society / v.23 no.6 / pp.747-755 / 2020
  • This paper addresses the outlier data problem that arises when building a PM2.5 fine dust forecasting system with a neural network. When a neural network is trained, some samples do not help learning and instead disturb it; these are called outlier data. Including them in the training data causes problems such as overfitting. While building a neural-network PM2.5 concentration forecasting system, we found several outliers in the training data, so we removed them and trained the network in three ways. The Over_outlier model removes samples whose target concentration is low but whose model forecast is high. The Under_outlier model removes samples whose target concentration is high but whose model forecast is low. The All_outlier model removes both Over_outlier and Under_outlier data. We compare the three models with a conventional outlier removal model and a non-removal model. Our outlier removal model performs better than the others.
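
The three removal rules in this abstract can be sketched roughly as below. This is only an illustration: the abstract does not say how "low" and "high" concentrations are defined, so the cut-offs `low` and `high` and the function names `outlier_masks` and `filter_training_set` are hypothetical.

```python
import numpy as np

def outlier_masks(target, forecast, low=35.0, high=75.0):
    """Return boolean masks for Over_outlier and Under_outlier samples.

    `low` and `high` are hypothetical cut-offs, not values from the paper.
    """
    target = np.asarray(target, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    over = (target <= low) & (forecast >= high)    # low target, high forecast
    under = (target >= high) & (forecast <= low)   # high target, low forecast
    return over, under

def filter_training_set(X, target, forecast, mode="all", **cutoffs):
    """Drop samples according to `mode`: 'over', 'under', or 'all'."""
    over, under = outlier_masks(target, forecast, **cutoffs)
    drop = {"over": over, "under": under, "all": over | under}[mode]
    X = np.asarray(X)
    target = np.asarray(target, dtype=float)
    return X[~drop], target[~drop]
```

Calling `filter_training_set(X, target, forecast, mode="over")` would correspond to the Over_outlier variant; `mode="all"` removes both kinds of sample before retraining.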

Outlier Detection of Real-Time Reservoir Water Level Data Using Threshold Model and Artificial Neural Network Model (임계치 모형과 인공신경망 모형을 이용한 실시간 저수지 수위자료의 이상치 탐지)

  • Kim, Maga; Choi, Jin-Yong; Bang, Jehong; Lee, Jaeju
    • Journal of The Korean Society of Agricultural Engineers / v.61 no.1 / pp.107-120 / 2019
  • Reservoir water level data indicate the current water storage of a reservoir and serve as primary data for the management and study of agricultural water. For reservoir storage management, the Korea Rural Community Corporation (KRC) installed water level stations at around 1,600 agricultural reservoirs and has been collecting water level data every 10 minutes. However, outliers of various kinds, caused by noise and errors with environmental and physical origins, appear frequently. It is therefore necessary to detect outliers and improve the quality of the reservoir water level data so that the data can be used as intended. This study detected and classified outlier and normal data using two models, a threshold model and an artificial neural network (ANN) model, and compared the results to evaluate their performance. The threshold model identifies outliers by setting upper and lower bounds on the water level data and on the variation data, and by setting a bandwidth of the water level data as the threshold for regarding a water level as erroneous. The ANN model was trained on a prepared dataset labeled as normal data (T) and outliers (F) and was then used to identify outliers. The models were evaluated against reference data, namely daily reservoir water level data collected by KRC. The threshold model detected outliers better than the ANN model, but the ANN model was better at not classifying normal data as outliers.
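
A threshold rule of the kind described in this abstract might look like the sketch below. It covers only the bounds on the level and on the change between consecutive readings; the bandwidth rule is left out because the abstract does not define it. The function name `threshold_outliers` and its parameters `lower`, `upper`, and `max_step` are assumptions made for illustration.

```python
import numpy as np

def threshold_outliers(level, lower, upper, max_step):
    """Flag readings outside [lower, upper] or changing by more than max_step.

    All three thresholds are hypothetical and would be set per reservoir.
    """
    level = np.asarray(level, dtype=float)
    out_of_range = (level < lower) | (level > upper)
    # Absolute change from the previous reading; the first reading gets 0.
    step = np.abs(np.diff(level, prepend=level[0]))
    jump = step > max_step
    return out_of_range | jump
```

Note that a single spike trips the step rule twice, once on the way up and once on the way back, so the reading that follows a spike is flagged as well. A production rule would probably compare against the last accepted reading instead.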

Dam Sensor Outlier Detection using Mixed Prediction Model and Supervised Learning

  • Park, Chang-Mok
    • International journal of advanced smart convergence / v.7 no.1 / pp.24-32 / 2018
  • This paper describes an outlier detection method that uses a mixed prediction model. The mixed prediction model consists of a time-series model and a regression model. The parameters of the prediction model were estimated by supervised learning, with a genetic algorithm adopted as the learning method. Experiments were performed on artificial and real data sets. Prediction performance is compared with existing prediction methods using the artificial data, and outlier detection is conducted using real sensor measurements from a dam. The experiments show the validity of the proposed method.
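
The general idea of prediction-based outlier detection with a mixed model can be illustrated as follows. This is not the paper's method: the model form (one autoregressive lag plus one covariate), the use of ordinary least squares in place of the genetic algorithm, and the residual cut-off `k` are all assumptions chosen to keep the sketch short.

```python
import numpy as np

def _design(y, x):
    """Design matrix for y[t] ~ a + b*y[t-1] + c*x[t], for t = 1..n-1."""
    return np.column_stack([np.ones(len(y) - 1), y[:-1], x[1:]])

def fit_mixed_model(y, x):
    """Fit the time-series term and the regression term jointly.

    Ordinary least squares stands in for the genetic algorithm here.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    coef, *_ = np.linalg.lstsq(_design(y, x), y[1:], rcond=None)
    return coef

def flag_outliers(y, x, coef, k=3.0):
    """Flag points whose prediction residual exceeds k robust std devs."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    resid = y[1:] - _design(y, x) @ coef
    # Median absolute deviation, scaled to match a normal std deviation.
    scale = 1.4826 * np.median(np.abs(resid - np.median(resid)))
    flags = np.zeros(len(y), dtype=bool)   # y[0] has no prediction
    flags[1:] = np.abs(resid) > k * scale
    return flags
```

Because the previous observation is a predictor, the point right after an outlier can also show a large residual and be flagged.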

Assessing the Accuracy of Outlier Tests in Nonlinear Regression

  • Kahng, Myung-Wook; Kim, Bu-Yang
    • Communications for Statistical Applications and Methods / v.16 no.1 / pp.163-168 / 2009
  • Given the specific mean shift outlier model, the standard approaches to obtaining test statistics for outliers are discussed. The accuracy of outlier tests is investigated using subset curvatures. These subset curvatures appear to be reliable indicators of the adequacy of the linearization-based test. We also consider obtaining graphical summaries of uncertainty in estimating parameters through confidence curves. The results are applied to the problem of assessing the accuracy of outlier tests.
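
For orientation, the mean shift outlier model for a single suspected case in nonlinear regression is usually written as below. The notation ($f$, $\theta$, $\phi$, $d_i$) is assumed here and may differ from the paper's.

```latex
y_i = f(x_i, \theta) + d_i \phi + \varepsilon_i, \qquad i = 1, \dots, n,
\qquad
d_i =
\begin{cases}
  1 & \text{if case } i \text{ is the suspected outlier}, \\
  0 & \text{otherwise},
\end{cases}

H_0 : \phi = 0 \quad \text{versus} \quad H_1 : \phi \neq 0 .
```

Under this formulation, testing whether the suspected case is an outlier amounts to testing whether the shift parameter $\phi$ is zero.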

Simultaneous outlier detection and variable selection via difference-based regression model and stochastic search variable selection

  • Park, Jong Suk; Park, Chun Gun; Lee, Kyeong Eun
    • Communications for Statistical Applications and Methods / v.26 no.2 / pp.149-161 / 2019
  • In this article, we suggest the following approach to simultaneous variable selection and outlier detection. First, we determine possible outlier candidates using properties of an intercept estimator in a difference-based regression model, and the outlier information is reflected in the multiple regression model by adding mean shift parameters. Second, we select the best model from among the models that include the outlier candidates as predictors, using stochastic search variable selection. Finally, we evaluate our method using simulations and real data analysis, which yield promising results. The method still needs to be developed further to make robust estimates. We will also extend it to the nonparametric regression model for simultaneous outlier detection and variable selection.

First Order Difference-Based Error Variance Estimator in Nonparametric Regression with a Single Outlier

  • Park, Chun-Gun
    • Communications for Statistical Applications and Methods / v.19 no.3 / pp.333-344 / 2012
  • We consider some statistical properties of the first order difference-based error variance estimator in nonparametric regression models with a single outlier. So far, such difference-based estimators have rarely been discussed in the presence of outliers. We propose a first order difference-based estimator that uses the leave-one-out method to detect a single outlier, and we simulate the outlier detection in a nonparametric regression model with a single outlier. The outlier detection works well. The results are promising even in nonparametric regression models with many outliers using some difference-based estimators.
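
A minimal sketch of the idea, assuming observations ordered by an equally spaced design: the standard first order difference-based variance estimate is half the mean squared successive difference, and a leave-one-out pass recomputes it with each point removed. Picking the point whose removal gives the smallest estimate is a heuristic chosen for this illustration; the abstract does not state the paper's actual detection rule, and the function names are hypothetical.

```python
import numpy as np

def diff_variance(y):
    """First order difference-based error variance estimate.

    sigma^2 is estimated by sum((y[i+1] - y[i])^2) / (2 * (n - 1)).
    """
    y = np.asarray(y, dtype=float)
    d = np.diff(y)
    return np.sum(d ** 2) / (2 * (len(y) - 1))

def loo_outlier_candidate(y):
    """Return the index whose deletion minimizes the variance estimate.

    Also returns the full vector of leave-one-out estimates.
    """
    y = np.asarray(y, dtype=float)
    loo = np.array([diff_variance(np.delete(y, i)) for i in range(len(y))])
    return int(np.argmin(loo)), loo
```

A single large outlier inflates two successive differences, so deleting it lowers the estimate far more than deleting any other point.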

Testing Outliers in Nonlinear Regression

  • Kahng, Myung-Wook
    • Journal of the Korean Statistical Society / v.24 no.2 / pp.419-437 / 1995
  • Given the specific mean shift outlier model, several standard approaches to obtaining test statistics for outliers are discussed. Each of these is developed in detail for the nonlinear regression model, and each leads to an equivalent distribution. The geometric interpretations of the statistics and the accuracy of the linear approximation are also presented.

Asymptotic Properties of Outlier Tests in Nonlinear Regression

  • Kahng, Myung-Wook
    • Journal of the Korean Data and Information Science Society / v.17 no.1 / pp.205-211 / 2006
  • For a linear regression model, the necessary and sufficient condition for the asymptotic consistency of the outlier test statistic is known. An analogous condition for the nonlinear regression model is considered in this paper.

A Score test for Detection of Outliers in Nonlinear Regression

  • Kahng, Myung-Wook
    • Journal of the Korean Statistical Society / v.22 no.2 / pp.201-208 / 1993
  • Given the specific mean shift outlier model, the score test for multiple outliers in nonlinear regression is discussed as an alternative to the likelihood ratio test. The geometric interpretation of the score statistic is also presented.

Outlier Detection in Random Effects Model Using Fractional Bayes Factor

  • Chung, Younshik
    • Communications for Statistical Applications and Methods / v.7 no.1 / pp.141-150 / 2000
  • In this paper, we propose a method of computing the Bayes factor to detect an outlier in a random effects model. When no information is available and improper noninformative priors must therefore be used, the Bayes factor includes unspecified constants and carries a heavy computational burden. To solve this problem, we use the fractional Bayes factor (FBF) of O'Hagan (1995) and the generalized Savage-Dickey density ratio of Verdinelli and Wasserman (1995). The proposed method is applied to the outlier detection problem. We perform a simulation of the proposed approach with a simulated data set including an outlier and also analyze a real data set.
