• Title/Summary/Keyword: Outliers test

Regression diagnostics for response transformations in a partial linear model (부분선형모형에서 반응변수변환을 위한 회귀진단)

  • Seo, Han Son;Yoon, Min
    • Journal of the Korean Data and Information Science Society / v.24 no.1 / pp.33-39 / 2013
  • When transforming the response variable in a partial linear model, outliers can distort the estimate of the transformation parameter, just as in linear models. Addressing this requires both estimating the transformation parameter and detecting outliers, which is difficult because of the arbitrariness of the nonparametric function in the partial linear model. This study proposes procedures for transforming the response variable that are robust to outliers in partial linear models, combining estimation of the nonparametric function with outlier detection methods such as a sequential test and maximum trimmed likelihood estimation. The effectiveness of the proposed methods is verified and compared through a simulation study and examples.

Adaptive boosting in ensembles for outlier detection: Base learner selection and fusion via local domain competence

  • Bii, Joash Kiprotich;Rimiru, Richard;Mwangi, Ronald Waweru
    • ETRI Journal / v.42 no.6 / pp.886-898 / 2020
  • Unusual data patterns or outliers can be generated because of human errors, incorrect measurements, or malicious activities. Detecting outliers is a difficult task that requires complex ensembles. An ideal outlier detection ensemble should consider the strengths of individual base detectors while carefully combining their outputs to create a strong overall ensemble and achieve unbiased accuracy with minimal variance. Selecting and combining the outputs of dissimilar base learners is a challenging task. This paper proposes a model that utilizes heterogeneous base learners. It adaptively boosts the outcomes of preceding learners in the first phase by assigning weights and identifying high-performing learners based on their local domains, and then carefully fuses their outcomes in the second phase to improve overall accuracy. Experimental results from 10 benchmark datasets are used to train and test the proposed model. To investigate its accuracy in terms of separating outliers from inliers, the proposed model is tested and evaluated using accuracy metrics. The analyzed data are presented as crosstabs and percentages, followed by a descriptive method for synthesis and interpretation.

Influence Measures for a Test Statistic on Independence of Two Random Vectors

  • Jung Kang-Mo
    • Communications for Statistical Applications and Methods / v.12 no.3 / pp.635-642 / 2005
  • In statistical diagnostics, a large number of influence measures have been proposed for identifying outliers and influential observations. However, there seem to be few accounts of influence diagnostics for test statistics. We study influence analysis of the likelihood ratio test statistic for whether two sets of variables are uncorrelated with one another. The influence of observations is measured using the case-deletion approach and the influence function. We compare the proposed influence measures through two illustrative examples.

Minimum Hellinger Distance Based Goodness-of-fit Tests in Normal Models: Empirical Approach

  • Dong Bin Jeong
    • Communications for Statistical Applications and Methods / v.6 no.3 / pp.967-976 / 1999
  • In this paper we study Hellinger distance based goodness-of-fit tests that are analogs of likelihood ratio tests. The minimum Hellinger distance estimator (MHDE) in normal models provides an excellent robust alternative to the usual maximum likelihood estimator. Our simulation results show that the goodness-of-fit test based on the Hellinger deviance test (Simpson 1989) is robust when the data contain outliers. The proposed Hellinger deviance test (Simpson 1989) is a more direct method for obtaining robust inferences than applying an automated outlier screen before the likelihood ratio test.

Outlier Detection of the Coastal Water Temperature Monitoring Data Using the Approximate and Detail Components (어림과 나머지 성분을 이용한 연안 수온자료의 이상자료 감지)

  • Cho, Hong-Yeon;Oh, Ji-Hee
    • Journal of the Korean Society for Marine Environment & Energy / v.15 no.2 / pp.156-162 / 2012
  • Outlier detection and treatment is an essential first step in the statistical analysis of monitoring data, because outliers occur frequently in coastal environmental monitoring projects. This study suggests an outlier detection method that uses the approximate and detail (or residual) components of the raw data. The two components can be separated by a variety of filtering and smoothing methods; here the decomposition is carried out by harmonic analysis and by a local regression curve. Grubbs' test and the modified z-score method, both widely used to detect outliers, are then applied to the detail components of the water temperature data. A new data set is reconstructed after removing the outliers detected by these methods. The suggested process is shown to apply successfully to outlier detection in the coastal water temperature monitoring data provided by the Real-time Information System for Aquaculture Environment, National Fisheries Research and Development Institute (NFRDI).
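
The decompose-then-test idea in the abstract above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' procedure: a centred moving median stands in for the harmonic analysis and local regression smoothers, only the modified z-score is shown (not Grubbs' test), and the function names, the window size, and the conventional 0.6745 constant and 3.5 cutoff are choices made here, not values given in the abstract.

```python
from statistics import median


def moving_median(values, window=3):
    """Approximate component: centred moving median (a stand-in smoother).

    The window is truncated at both ends of the series.
    """
    half = window // 2
    n = len(values)
    return [median(values[max(0, i - half):min(n, i + half + 1)])
            for i in range(n)]


def modified_z_scores(residuals):
    """Modified z-score: 0.6745 * (r - median) / MAD."""
    med = median(residuals)
    mad = median([abs(r - med) for r in residuals])
    if mad == 0:
        # Scores are undefined when MAD is zero; this sketch flags nothing.
        return [0.0 for _ in residuals]
    return [0.6745 * (r - med) / mad for r in residuals]


def detect_outliers(values, window=3, threshold=3.5):
    """Return indices whose detail component has |modified z| > threshold."""
    approx = moving_median(values, window)
    detail = [v - a for v, a in zip(values, approx)]
    scores = modified_z_scores(detail)
    return [i for i, s in enumerate(scores) if abs(s) > threshold]


# Hypothetical water temperature readings with one spike at index 5.
temps = [10.0, 10.2, 10.1, 10.4, 10.3, 18.0, 10.5, 10.7, 10.6, 10.9, 10.8]
flagged = detect_outliers(temps)
cleaned = [t for i, t in enumerate(temps) if i not in flagged]
```

Scoring the detail component rather than the raw series keeps a slow trend from inflating the spread estimate, which is the reason the abstract gives for separating the two components first.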

Assessing the Accuracy of Outlier Tests in Nonlinear Regression

  • Kahng, Myung-Wook;Kim, Bu-Yang
    • Communications for Statistical Applications and Methods / v.16 no.1 / pp.163-168 / 2009
  • Given the specific mean shift outlier model, the standard approaches to obtaining test statistics for outliers are discussed. Accuracy of outlier tests is investigated using subset curvatures. These subset curvatures appear to be reliable indicators of the adequacy of the linearization based test. Also, we consider obtaining graphical summaries of uncertainty in estimating parameters through confidence curves. The results are applied to the problem of assessing the accuracy of outlier tests.
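
For readers unfamiliar with the mean shift outlier model named in the abstract above, the usual textbook formulation for a nonlinear regression can be written as below. The notation is an assumption introduced for this note and is not taken from the paper: f is the mean function, d_j indicates the suspected case i, e_i is the i-th residual, h_ii is the i-th diagonal element of the tangent-plane projection matrix evaluated at the estimate of θ, and s_(i) is the scale estimate with case i deleted.

```latex
% Mean shift outlier model for a suspected case i (notation assumed here)
y_j = f(x_j, \theta) + \delta\, d_j + \varepsilon_j, \qquad
d_j = \begin{cases} 1, & j = i \\ 0, & j \neq i \end{cases}, \qquad
H_0 : \delta = 0 .

% Linearization-based test statistic for H_0
t_i = \frac{e_i}{s_{(i)} \sqrt{1 - h_{ii}}}
```

Under the linear approximation, t_i is referred to a t distribution with n - p - 1 degrees of freedom, where p is the number of parameters; how far that approximation can be trusted is the question the subset curvatures in the abstract are meant to answer.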

Outliers Identification and Reliabilities in Geodetic Networks (돌출오차(突出誤差)의 검출(檢出)과 측지망(測地網)의 신뢰도(信賴度))

  • Lee, Suck Chan;Kho, Young Ho;Lee, Young Jin
    • KSCE Journal of Civil and Environmental Engineering Research / v.7 no.1 / pp.1-9 / 1987
  • This paper is mainly concerned with the analysis of post-adjustment techniques for outlier identification and reliability in geodetic networks. The proposed strategy, which is easily attached to a least squares adjustment program, is used successfully in a test network. The results show that applying these techniques can considerably improve the quality of the results, but the main advantage appears when reliability is considered.

Influence in Testing the Equality of Two Covariance Matrices (두개의 공분산 행렬의 동질성 검정에서의 영향치 분석)

  • Myung Geun Kim
    • The Korean Journal of Applied Statistics / v.7 no.2 / pp.213-224 / 1994
  • A diagnostic method useful for detecting outliers in testing the equality of two covariance matrices is developed using the influence curve approach. This method is easily generalized to more than two covariance matrices. A sample version of the influence measure for detecting outliers is considered, based on the empirical distribution functions. The sample version includes as its component terms the well-known test statistic for detecting one outlier at a time introduced by Wilks and its generalization to the two-group case.

The Outlier-Filtering Algorithm for National Highway Continuous Traffic Counts Data (일반국도 상시조사 교통량 자료의 이상치 판정 알고리즘 개발)

  • Shin, Jae Myong;Lee, Sang Hyup;Kim, Hyun Suk
    • KSCE Journal of Civil and Environmental Engineering Research / v.33 no.2 / pp.691-702 / 2013
  • In this study, a quantitative outlier-filtering algorithm was developed using a smoothing method based on the day-of-the-week traffic volume variation pattern. To test its effectiveness, the algorithm was used to identify outliers in traffic volume data collected in 2010 at 14 continuous traffic count sites on the national highways. The test results are satisfactory: the filtering rate is 98.2% for normal days and the mis-filtering rate is 8.0% for abnormal days. The algorithm can therefore be used for rough but quick filtering of outliers from collected traffic volume data.

Preparation and evaluation of limestone reference material for a proficiency test (국내산 석회석의 비교숙련도 시험용 시료 제조 및 평가)

  • Jung, Choong-Ho;Park, Deok-Won;Kim, Sung-Min;Yu, Eung-Chul
    • Analytical Science and Technology / v.22 no.1 / pp.82-91 / 2009
  • Limestone samples for a proficiency test were prepared from domestic limestone and evaluated. Statistical methods were used to evaluate the XRF and instrumental analysis results. Some outliers were found in the XRF and ICP-OES instrumental analysis results for each sample. After removing 5 outliers from the 50 samples, we obtained samples that were homogeneous within a reliability of 95% according to the statistical analysis.