• Title/Summary/Keyword: Robust Statistics

Robust Variable Selection in Classification Tree

  • Jang Jeong Yee;Jeong Kwang Mo
    • Proceedings of the Korean Statistical Society Conference / 2001.11a / pp.89-94 / 2001
  • In this study we focus on variable selection when growing decision trees. Several splitting rules and variable selection algorithms are discussed. We propose a competitive variable selection method based on the Kruskal-Wallis test, a nonparametric counterpart of the ANOVA F-test. A Monte Carlo study shows that CART is seriously biased in variable selection toward categorical variables with many values, and that QUEST, which uses the F-test, has low power for selecting informative variables under heavy-tailed distributions.

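
The rank-based selection idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it computes the Kruskal-Wallis H statistic (without tie correction) for each candidate predictor against the class labels and picks the predictor with the largest H. The function names and the toy data are assumptions made for the example.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def kruskal_wallis_h(x, labels):
    """Kruskal-Wallis H of predictor x across class labels (no tie correction)."""
    n = len(x)
    ranks = average_ranks(x)
    rank_sums, counts = {}, {}
    for r, c in zip(ranks, labels):
        rank_sums[c] = rank_sums.get(c, 0.0) + r
        counts[c] = counts.get(c, 0) + 1
    s = sum(rank_sums[c] ** 2 / counts[c] for c in counts)
    return 12.0 / (n * (n + 1)) * s - 3.0 * (n + 1)


# Hypothetical toy data: two classes, one heavy-tailed informative predictor
# and one predictor unrelated to the class.
labels = ["a"] * 4 + ["b"] * 4
predictors = {
    "informative": [1.0, 2.0, 3.0, 50.0, 60.0, 70.0, 80.0, 900.0],
    "noise": [5.0, 1.0, 8.0, 4.0, 2.0, 7.0, 3.0, 6.0],
}
scores = {name: kruskal_wallis_h(x, labels) for name, x in predictors.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Because H depends only on ranks, the extreme value 900.0 has no more influence than any other largest observation, which is the property that makes a rank test attractive under heavy tails.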

A SIGN TEST FOR UNIT ROOTS IN A SEASONAL MTAR MODEL

  • Shin, Dong-Wan;Park, Sei-Jung
    • Journal of the Korean Statistical Society / v.36 no.1 / pp.149-156 / 2007
  • This study suggests a new method for testing seasonal unit roots in a momentum threshold autoregressive (MTAR) process. The sign test is robust against heteroscedastic or heavy-tailed errors and is invariant to monotone data transformations. The proposed test is a seasonal extension of the sign test of Park and Shin (2006). In the case of a partial seasonal unit root in an MTAR model, a Monte Carlo study shows that the proposed test has better power than the seasonal sign test developed for the AR model.

Unit Root Tests for Autoregressive Moving Average Processes Based on M-estimators

  • Shin, Dong-Wan;Lee, Oesook
    • Journal of the Korean Statistical Society / v.31 no.3 / pp.301-314 / 2002
  • For autoregressive moving average (ARMA) models, robust unit root tests are developed using M-estimators. The tests are parametric in the sense that the ARMA parameters are estimated jointly with the unit roots. A Monte Carlo experiment reveals the superiority of the parametric tests over the semiparametric tests of Lucas (1995a) in terms of both empirical size and power.

Analysis of Field Test Data using Robust Linear Mixed-Effects Model (로버스트 선형혼합모형을 이용한 필드시험 데이터 분석)

  • Hong, Eun Hee;Lee, Youngjo;Ok, You Jin;Na, Myung Hwan;Noh, Maengseok;Ha, Il Do
    • The Korean Journal of Applied Statistics / v.28 no.2 / pp.361-369 / 2015
  • A general linear mixed-effects model is often used to analyze repeated-measurement experiment data with a continuous response variable. However, such a model can give improper results when heteroscedasticity and non-normality of the population distribution are both present. To obtain a more robust estimation, we used a heavy-tailed linear mixed-effects model, which yields more exact and reliable conclusions than a general linear mixed-effects model. We also provide reliability analysis results for further research.

Robust Design for Multiple Quality Characteristics using Principal Component Analysis

  • Kwon, Yong-Man;Hong, Yeon-Woong
    • Journal of the Korean Data and Information Science Society / v.14 no.3 / pp.545-551 / 2003
  • Robust design identifies settings of the control factors that make a system's performance robust to changes in the noise factors, which represent the sources of variation. In this paper we propose a way to optimize multiple quality characteristics simultaneously using principal component analysis from multivariate statistical analysis. An example is given to compare the proposal with a previously proposed method.


Empirical Choice of the Shape Parameter for Robust Support Vector Machines

  • Pak, Ro-Jin
    • Communications for Statistical Applications and Methods / v.15 no.4 / pp.543-549 / 2008
  • Inspired by the use of a robust loss function in support vector machine regression to control training error, and by the idea of robust template matching with M-estimators, Chen (2004) applies M-estimator techniques to Gaussian radial basis functions and forms a new class of robust kernels for support vector machines. We are especially interested in the shape of Huber's M-estimator in this context and propose a way to find the shape parameter of Huber's M-estimating function. For simplicity, only the two-class classification problem is considered.
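
To make the role of the shape parameter concrete, here is a minimal sketch of Huber's loss and its derivative, the M-estimating (psi) function. It does not reproduce the selection rule proposed in the paper or Chen's robust kernels; the function names are assumptions, and 1.345 is used only because it is a commonly quoted tuning constant for Huber's estimator.

```python
def huber_rho(r, c):
    """Huber loss: quadratic for |r| <= c, linear beyond that point."""
    a = abs(r)
    return 0.5 * r * r if a <= c else c * a - 0.5 * c * c


def huber_psi(r, c):
    """Derivative of huber_rho: the residual clipped to the interval [-c, c]."""
    return max(-c, min(c, r))


# The shape parameter c sets where the function stops growing with the residual.
for c in (0.5, 1.345, 3.0):
    print(c, [huber_psi(r, c) for r in (-5.0, -1.0, 0.3, 1.0, 5.0)])
```

A small c caps the influence of even moderate residuals (behaviour close to the median), while a large c leaves most residuals unclipped (behaviour close to least squares), which is why the choice of c matters.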

Robust Bayes and Empirical Bayes Analysis in Finite Population Sampling with Auxiliary Information

  • Kim, Dal-Ho
    • Journal of the Korean Statistical Society / v.27 no.3 / pp.331-348 / 1998
  • In this paper, we propose some robust Bayes estimators using ML-II priors, as well as certain empirical Bayes estimators, for estimating the finite population mean in the presence of auxiliary information. These estimators are compared with the classical ratio estimator and a subjective Bayes estimator utilizing the auxiliary information in terms of "posterior robustness" and "procedure robustness". We also address the choice of sampling design from a robust Bayesian viewpoint.

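
For orientation, the classical ratio estimator that serves as the baseline in the abstract above can be sketched in a few lines. The robust Bayes and empirical Bayes estimators themselves are not reproduced here; the function name and the numbers are hypothetical.

```python
def ratio_estimate(sample_y, sample_x, population_x_mean):
    """Classical ratio estimator of the population mean of y.

    Scales the known population mean of the auxiliary variable x by the
    sample ratio of means, y_bar / x_bar.
    """
    y_bar = sum(sample_y) / len(sample_y)
    x_bar = sum(sample_x) / len(sample_x)
    return y_bar / x_bar * population_x_mean


# Hypothetical sample: y is roughly twice the auxiliary variable x,
# and the population mean of x is assumed known to be 25.
sample_x = [10.0, 20.0, 30.0]
sample_y = [21.0, 39.0, 60.0]
estimate = ratio_estimate(sample_y, sample_x, population_x_mean=25.0)
print(estimate)
```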

A comparison study of various robust regression estimators using simulation (시뮬레이션을 통한 다양한 로버스트 회귀추정량의 비교 연구)

  • Jang, Soohee;Yoon, Jungyeon;Chun, Heuiju
    • The Korean Journal of Applied Statistics / v.29 no.3 / pp.471-485 / 2016
  • Least squares (LS) regression is a classic regression method that is optimal when the usual regression assumptions hold and the observations contain no unusual values. However, unusual data can seriously distort LS estimates. Various robust estimation methods have therefore been proposed to circumvent the limitations of traditional LS regression. Among these are M-estimators based on maximum likelihood estimation (MLE), L-estimators based on linear combinations of order statistics, and R-estimators based on linear combinations of the ordered residuals. In this paper, robust regression estimators with a high breakdown point and/or high efficiency are compared under several simulated situations. The paper analyzes and compares the distributions of the estimates, as well as relative efficiencies calculated from mean squared errors (MSE) in the simulation study. We conclude that MM-estimators or GR-estimators are a good choice for real data applications.
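
The distortion of LS estimates by unusual data, and the way an M-estimator limits it, can be illustrated with a small sketch. This is not the simulation design of the paper: it fits a single-predictor line by iteratively reweighted least squares with Huber weights and a MAD-based scale, on hypothetical data with one gross outlier. All names and constants are assumptions for the example.

```python
from statistics import median


def weighted_line(x, y, w):
    """Weighted least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1


def huber_line(x, y, c=1.345, iterations=50):
    """Huber M-estimate of a line via iteratively reweighted least squares."""
    w = [1.0] * len(x)
    for _ in range(iterations):
        b0, b1 = weighted_line(x, y, w)
        resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
        scale = median(abs(r) for r in resid) / 0.6745  # MAD-type scale estimate
        if scale == 0.0:
            break
        # Huber weights: 1 inside the threshold, shrinking as c*scale/|r| outside.
        w = [1.0 if abs(r) <= c * scale else c * scale / abs(r) for r in resid]
    return weighted_line(x, y, w)


# Hypothetical data: true line y = 1 + 2x with small noise and one gross outlier.
x = [float(i) for i in range(10)]
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05, 0.1, -0.1]
y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, noise)]
y[9] += 30.0  # contaminate the last observation

ls_fit = weighted_line(x, y, [1.0] * len(x))  # ordinary least squares
huber_fit = huber_line(x, y)
print("LS:", ls_fit, "Huber:", huber_fit)
```

Note that a Huber M-estimator bounds the influence of large residuals but not of extreme predictor values, which is one reason the high-breakdown estimators named in the abstract exist.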

Clustering Analysis of Science and Engineering College Students' understanding on Probability and Statistics (Robust PCA를 활용한 이공계 대학생의 확률 및 통계 개념 이해도 분석)

  • Yoo, Yongseok
    • Journal of Convergence for Information Technology / v.12 no.3 / pp.252-258 / 2022
  • In this study, we propose a method for analyzing students' understanding of probability and statistics in small university lectures. A computer-based test on probability and statistics was given to 95 science and engineering college students. After dividing the students' responses into 7 clusters using Robust PCA and a Gaussian mixture model, the achievement on each topic was analyzed for each cluster. High-ranking clusters generally showed high achievement on most topics except statistical estimation, and low-achieving clusters showed strengths and weaknesses on different topics. Compared with the widely used approach of PCA-based dimension reduction followed by clustering, the proposed method showed each group's characteristics more clearly. The characteristics of each cluster can be used to develop individualized learning strategies.

Principal Components Logistic Regression based on Robust Estimation (로버스트추정에 바탕을 둔 주성분로지스틱회귀)

  • Kim, Bu-Yong;Kahng, Myung-Wook;Jang, Hea-Won
    • The Korean Journal of Applied Statistics / v.22 no.3 / pp.531-539 / 2009
  • Logistic regression is widely used as a data mining technique for customer relationship management. The maximum likelihood estimator has highly inflated variance when multicollinearity exists among the regressors, and it is not robust against outliers. We therefore propose robust principal components logistic regression to deal with both the multicollinearity and outlier problems. A procedure based on the condition index is suggested for selecting principal components. When a condition index is larger than the cutoff value obtained from a model constructed on the basis of conjoint analysis, the corresponding principal component is removed from the logistic model. In addition, we employ a robust estimation algorithm that dampens the effect of outliers by applying appropriate weights and factors to the leverage points and vertical outliers identified by a V-mask type criterion. Monte Carlo simulation results indicate that the proposed procedure yields a higher rate of correct classification than the existing method.
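
The condition-index screening step can be sketched for the simplest case of two regressors. This is an illustration under stated assumptions, not the paper's procedure: the condition index of component j is taken as sqrt(lambda_max / lambda_j) from the eigenvalues of the correlation matrix, which for two variables are 1 + r and 1 - r, and the cutoff of 10 is a hypothetical fixed value rather than one derived from conjoint analysis.

```python
from math import sqrt


def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / sqrt(saa * sbb)


def condition_indices_2d(x1, x2):
    """Condition indices of the two principal components of (x1, x2).

    The 2x2 correlation matrix [[1, r], [r, 1]] has eigenvalues 1 + r and 1 - r.
    Assumes |r| < 1; perfectly collinear inputs would divide by zero.
    """
    r = correlation(x1, x2)
    eigenvalues = sorted([1.0 + r, 1.0 - r], reverse=True)
    return [sqrt(eigenvalues[0] / ev) for ev in eigenvalues]


# Hypothetical, nearly collinear regressors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]
CUTOFF = 10.0  # hypothetical cutoff, chosen only for this example

indices = condition_indices_2d(x1, x2)
kept = [j for j, ci in enumerate(indices) if ci <= CUTOFF]
print(indices, kept)
```

With more than two regressors the same screening would need a general eigenvalue routine; the two-variable closed form is used here only to keep the sketch dependency-free.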