• Title/Summary/Keyword: Robust Statistics

Robust Cross Validation Score

  • Park, Dong-Ryeon
    • Communications for Statistical Applications and Methods / v.12 no.2 / pp.413-423 / 2005
  • Consider the problem of estimating the underlying regression function from a set of noisy data contaminated by a long-tailed error distribution. Several robust smoothing techniques exist, and they have turned out to be very useful for reducing the influence of outlying observations. However, whatever robust smoother we use, we must still choose the smoothing parameter, and relatively little attention has been paid to robust bandwidth selection. In this paper, we adopt ideas from robust location parameter estimation and propose robust cross-validation score functions.

Robust inference with order constraint in microarray study

  • Kang, Joonsung
    • Communications for Statistical Applications and Methods / v.25 no.5 / pp.559-568 / 2018
  • Gene classification can involve complex order-restricted inference. Examining gene expression patterns across groups under order restrictions makes standard statistical inference ineffective and thus requires different methods. For this problem, Roy's union-intersection principle has some merit. The M-estimator, adjusted for outlier arrays in a microarray study, produces a robust test statistic with distribution-insensitive clustering of genes. The M-estimator in conjunction with the union-intersection principle provides a nonstandard robust procedure. By exact permutation distribution theory, a conditionally distribution-free test based on the proposed test statistic generates the corresponding p-values in a small-sample setup. We apply a false discovery rate (FDR) procedure as a multiple testing procedure to p-values in simulated data and real microarray data. The FDR procedure for the proposed test statistics controls the FDR at all levels of ${\alpha}$ and ${\pi}_0$ (the proportion of true nulls); however, the FDR procedure for test statistics based on normal theory (ANOVA) fails to control the FDR.
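The abstract does not say which FDR procedure is applied to the p-values. As an illustration only, the sketch below assumes the Benjamini-Hochberg step-up rule, a common choice; the function name and the p-values are hypothetical and do not come from the paper.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected.

    Assumption: Benjamini-Hochberg step-up rule (not confirmed by the abstract).
    """
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, e.g. from per-gene permutation tests.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p, alpha=0.05))
```

With these made-up inputs only the two smallest p-values fall under their step-up thresholds, so only those two hypotheses are rejected.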

ROBUST MEASURES OF LOCATION IN WATER-QUALITY DATA

  • Kim, Kyung-Sub;Kim, Bom-Chul;Kim, Jin-Hong
    • Water Engineering Research / v.3 no.3 / pp.195-202 / 2002
  • The mean is generally used as a point estimator for water-quality data. Unfortunately, the nonnormal and skewed distributions of such data hinder the direct application of the mean, which is an inappropriate statistic in this case. Robust statistics such as L-, M-, and R-estimators are recommended instead because they are more efficient for such data. The median (L-estimator), the biweight (M-estimator), and the Hodges-Lehmann method (R-estimator) are briefly introduced and applied in this paper. The analyses of actual data show that the median does not guarantee robustness for small data sets, and that robust measures of location, or the arithmetic mean computed without outliers, are highly recommended if the distribution has long tails or outliers. Care must be taken in measuring location because the water-quality level within a water body can change depending on the selected point estimator.
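To make the comparison of point estimators concrete, here is a minimal sketch of the mean, the median, and the Hodges-Lehmann estimate, taking the latter as the median of all pairwise (Walsh) averages. The readings are invented for illustration and are not from the paper; the biweight M-estimator is left out to keep the sketch short.

```python
import statistics
from itertools import combinations_with_replacement


def hodges_lehmann(data):
    """Median of all pairwise (Walsh) averages, each point also paired with itself."""
    walsh = [(a + b) / 2 for a, b in combinations_with_replacement(data, 2)]
    return statistics.median(walsh)


# Hypothetical readings with a single gross outlier.
readings = [1, 2, 3, 4, 100]
print("mean:", statistics.mean(readings))
print("median:", statistics.median(readings))
print("Hodges-Lehmann:", hodges_lehmann(readings))
```

On this toy sample the mean is pulled to 22 by the outlier, while the median and the Hodges-Lehmann estimate both stay at 3.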

Least clipped absolute deviation for robust regression using skipped median

  • Hao Li;Seokho Lee
    • Communications for Statistical Applications and Methods / v.30 no.2 / pp.135-147 / 2023
  • The skipped median is more robust than the median when outliers are not symmetrically distributed. In this work, we propose a novel algorithm to estimate the skipped median. The idea of the skipped median and the new algorithm are extended to the regression problem, yielding a method called least clipped absolute deviation (LCAD). Since our proposed algorithm for nonconvex LCAD optimization uses the convex least absolute deviation (LAD) procedure as a subroutine, regularizations developed for LAD can be applied directly, without modification, to LCAD as well. Numerical studies demonstrate that the skipped median and LCAD are useful and outperform their counterparts, the median and LAD, when outliers occur asymmetrically. Some extensions of the idea behind the skipped median and LCAD are discussed.
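The abstract does not define the skipped median. The sketch below assumes, going by the name "least clipped absolute deviation", that it minimizes the sum of absolute deviations with each term capped at a threshold `c`; both this definition and the parameter `c` are assumptions. The search is a plain brute force over the breakpoints of the objective, not the authors' LAD-based algorithm.

```python
import statistics


def clipped_abs_loss(data, m, c):
    """Sum of absolute deviations from m, each capped at the threshold c."""
    return sum(min(abs(x - m), c) for x in data)


def skipped_median(data, c):
    """Global minimizer of the clipped loss, found by checking its breakpoints.

    The loss is piecewise linear in m with kinks only at x, x - c and x + c,
    so a global minimum is attained at one of these candidate points.
    """
    candidates = sorted({v for x in data for v in (x - c, x, x + c)})
    return min(candidates, key=lambda m: clipped_abs_loss(data, m, c))


# Hypothetical sample with two outliers on the same side.
sample = [1, 2, 3, 4, 5, 50, 51]
print("median:", statistics.median(sample))
print("skipped median (c=5):", skipped_median(sample, 5.0))
```

On this sample the ordinary median is 4, shifted toward the one-sided outliers, while the clipped-loss minimizer is 3, the center of the bulk of the data.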

A RSS-Based Localization Method Utilizing Robust Statistics for Wireless Sensor Networks under Non-Gaussian Noise (비 가우시안 잡음이 존재하는 무선 센서 네트워크에서 Robust Statistics를 활용하는 수신신호세기기반의 위치 추정 기법)

  • Ahn, Tae-Joon;Koo, In-Soo
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.11 no.3 / pp.23-30 / 2011
  • In a wireless sensor network (WSN), precisely locating sensor nodes is essential for efficiently using the sensing data acquired from them. Among various localization methods, the received signal strength (RSS) based scheme is preferred in many applications because it can be implemented easily without additional hardware cost. Because RSS localization is mainly affected by the radio channel between two nodes, outliers can appear in the RSS measurements, especially when obstacles move around the link between nodes. These outliers degrade the estimate of the distance between two nodes and can therefore cause location errors. In this paper, we propose an RSS-based localization method that uses robust statistics and a Gaussian filter algorithm to enhance the accuracy of RSS-based localization. In the proposed algorithm, outliers are eliminated from the samples by the robust statistics as well as the Gaussian filter, which improves localization accuracy. Simulations show that the proposed algorithm increases localization accuracy and is more robust to non-Gaussian noise channels.

On Confidence Intervals of Robust Regression Estimators (로버스트 회귀추정에 의한 신뢰구간 구축)

  • Lee Dong-Hee;Park You-Sung;Kim Kee-Whan
    • The Korean Journal of Applied Statistics / v.19 no.1 / pp.97-110 / 2006
  • Since it is well established that even high-quality data tend to contain outliers, one would expect far greater reliance on robust regression techniques than is actually observed. However, most robust regression estimators suffer from computational difficulties and from lower efficiency than least squares under the normal error model. The weighted self-tuning estimator (WSTE) recently suggested by Lee (2004) avoids these computational difficulties and has asymptotic normality and a high breakdown point simultaneously. Although it has better properties than the other robust estimators, WSTE does not attain full efficiency under the normal error model through the widely used weighted least squares. This paper introduces a new approach called the reweighted WSTE (RWSTE), whose scale estimator is adaptively estimated by the self-tuning constant. A Monte Carlo study shows that the new approach behaves better than the general weighted least squares method under the normal model and with large data sets.

Statistical Matching Techniques Using the Robust Regression Model (로버스트 회귀모형을 이용한 자료결합방법)

  • Jhun, Myoung-Shic;Jung, Ji-Song;Park, Hye-Jin
    • The Korean Journal of Applied Statistics / v.21 no.6 / pp.981-996 / 2008
  • Statistical matching techniques aim to obtain a complete data file from different sources. Since the statistical matching method proposed by Rubin (1986) assumes multivariate normality of the data, applying it to data that violate this assumption can be problematic. This research proposes a statistical matching method that uses robust regression as an alternative to linear regression. Furthermore, we carry out a simulation study to compare the performance of the robust regression model and the linear regression model for statistical matching.

Algorithm for the Robust Estimation in Logistic Regression (로지스틱회귀모형의 로버스트 추정을 위한 알고리즘)

  • Kim, Bu-Yong;Kahng, Myung-Wook;Choi, Mi-Ae
    • The Korean Journal of Applied Statistics / v.20 no.3 / pp.551-559 / 2007
  • Maximum likelihood estimation is not robust against outliers in logistic regression. We therefore propose an algorithm for robust estimation, which identifies bad leverage points and vertical outliers by a V-mask type criterion and then dampens the effect of the outliers. Our main finding is that, by an appropriate selection of weights and factors, we can obtain logistic estimates with a high breakdown point. The proposed algorithm is evaluated by the correct classification rate on real-life and artificial data sets. The results indicate that the proposed algorithm is superior to maximum likelihood estimation in terms of classification.

A Comparision on CERES & Robust-CERES

  • Oh, Kwang-Sik;Do, Soo-Hee;Kim, Dae-Hak
    • 한국데이터정보과학회:학술대회논문집 / 2003.10a / pp.93-100 / 2003
  • In regression diagnostics, it is necessary to check the curvature of selected covariates. There are various graphical methods that use residual plots based on least squares (LS) fitting. The sensitivity of LS fitting to outliers can distort the residuals, making the identification of the unknown function difficult or even impossible. In this paper, we compare combining conditional expectation and residual plots (CERES plots) based on the least squares fit with those based on robust fits using the Huber M-estimator. Robust CERES plots will be far less distorted than their LS counterparts in the presence of outliers and hence will be more useful in identifying the unknown function.
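The contrast between a least squares fit and a Huber M-estimator fit can be sketched for a simple straight line. This is only an illustration of the robust fitting step, not of CERES plots themselves, and it makes several assumptions that are not in the abstract: iteratively reweighted least squares, the tuning constant k = 1.345, and a residual scale re-estimated in each pass as the median absolute residual divided by 0.6745.

```python
import statistics


def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y = a + b * x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return ybar - b * xbar, b


def huber_line_fit(x, y, k=1.345, iterations=50):
    """Huber M-estimate of a line by iteratively reweighted least squares."""
    w = [1.0] * len(x)  # the first pass is ordinary least squares
    a, b = weighted_line_fit(x, y, w)
    for _ in range(iterations):
        r = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust residual scale: median absolute residual, rescaled for normal errors.
        s = statistics.median(abs(ri) for ri in r) / 0.6745
        if s < 1e-12:
            break  # at least half of the points are already fitted exactly
        # Huber weights: 1 inside the band |r| <= k*s, k*s/|r| outside it.
        w = [1.0 if abs(ri) <= k * s else k * s / abs(ri) for ri in r]
        a, b = weighted_line_fit(x, y, w)
    return a, b


# Hypothetical data: the exact line y = 1 + 2x with one vertical outlier at x = 5.
x = list(range(10))
y = [1 + 2 * xi for xi in x]
y[5] = 61
print("least squares (a, b):", weighted_line_fit(x, y, [1.0] * len(x)))
print("Huber         (a, b):", huber_line_fit(x, y))
```

In this toy example the least squares line is shifted upward by the outlier, whereas the Huber fit returns essentially to the underlying line. The outlier here is vertical; a Huber fit of this kind is not expected to protect against high-leverage points.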

Multiple Response Optimization for Robust Design using Desirability Function

  • Kwon, Yong-Man;Hong, Yeon-Woong;Chang, Duk-Joon
    • Journal of the Korean Data and Information Science Society / v.14 no.2 / pp.325-335 / 2003
  • Robust design identifies appropriate settings of control factors that make the system's performance robust to changes in the noise factors that represent the sources of variation. In Taguchi parameter design, the product array approach using orthogonal arrays is mainly used. However, it often requires an excessive number of experiments. An alternative, called the combined array approach, was suggested by Welch et al. (1990) and studied by others. In these studies, only a single response variable was considered. We propose a method for simultaneously optimizing multiple responses under the combined array approach.
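The title refers to a desirability function, but the abstract does not give its form. As a hedged sketch, the code below assumes the widely used one-sided "larger is better" desirability and combines responses by their geometric mean; the function names, bounds, and the exponent `r` are illustrative assumptions rather than the paper's definitions.

```python
import math


def desirability_larger_is_better(y, low, target, r=1.0):
    """Map a response y to [0, 1]: 0 at or below low, 1 at or above target."""
    if y <= low:
        return 0.0
    if y >= target:
        return 1.0
    return ((y - low) / (target - low)) ** r


def overall_desirability(ds):
    """Geometric mean of the individual desirabilities."""
    return math.prod(ds) ** (1.0 / len(ds))


# Two hypothetical responses measured at one setting of the control factors.
d1 = desirability_larger_is_better(5.0, low=0.0, target=10.0)
d2 = desirability_larger_is_better(7.5, low=5.0, target=10.0)
print(d1, d2, overall_desirability([d1, d2]))
```

Because the geometric mean is zero whenever any single desirability is zero, a setting that fails completely on one response cannot be rescued by good values on the others, which is the usual reason for preferring it to an arithmetic mean.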
