Firework Plot as a Graphical Exploratory Data Analysis Tool to Evaluate the Impact of Outliers in a Mixture Experiment

  • Jang, Dae-Heung (Department of Statistics, Pukyong National University)
  • Ahn, SoJin (Department of Statistics, Pukyong National University)
  • Kim, Youngil (School of Business and Economics, ChungAng University)
  • Received : 2014.06.03
  • Accepted : 2014.07.15
  • Published : 2014.08.31

Abstract

When analyzing data with regression techniques, it is common to check the validity of the assumed model by relying heavily on diagnostic tools. Outliers and influential data points, however, often distort the regression output in undesirable ways. Jang and Anderson-Cook (2013) proposed a graphical method for exploratory analysis, called a firework plot, that visualizes the trace of the impact of possible outlying and/or influential data points on individual regression coefficients and on the overall residual sum of squares (SSE). They developed a 3-D plot as well as pairwise plots for the measures of interest. In this paper, we extend their approach to demonstrate its strengths; in addition, including a measure not mentioned in their paper makes a more meaningful interpretation possible. We apply the approach to a mixture experiment because a detailed sensitivity analysis of the statistical measures is especially needed in a small experiment.

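When data are analyzed with a regression model, regression diagnostics that test for the presence of unusual values, such as outliers and influential points, have long been an essential tool for checking model adequacy. When such points are present, the results of the regression analysis are distorted and misinterpreted. Jang and Anderson-Cook (2013) introduced a graphical tool named the firework plot: as the weight assigned to an observation is varied from 1 to 0, trace curves drawn in a three-dimensional plot show how outliers or influential points affect the regression coefficients and the residual sum of squares (SSE), and pairwise displays further enhance the visual effect of the analysis. This study goes further and, using a mixture experiment design, carries out a sensitivity analysis of various statistics to examine how this approach differs from existing methods and whether more interpretation becomes possible when a statistic not considered in the 2013 paper is included. Such an analysis is undertaken because data from a small mixture experiment call for an even more detailed sensitivity analysis of the statistics.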
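
The weight-tracing idea behind the firework plot can be illustrated with a minimal sketch. It assumes a weighted least squares refit in which the weight of one chosen observation is lowered from 1 to 0 while all other weights stay at 1, and it records the coefficients and the weighted SSE at each step. The function name `firework_trace`, the weight grid, the use of a weighted SSE, and the toy straight-line data are all assumptions made for illustration; they are not taken from the paper, which works with a mixture experiment.

```python
import numpy as np


def firework_trace(X, y, i, n_steps=11):
    """Refit weighted least squares while the weight of observation i
    goes from 1 to 0; return the weights, coefficient traces, and SSE trace.
    (Hypothetical helper, not the authors' implementation.)"""
    weights = np.linspace(1.0, 0.0, n_steps)
    coefs, sses = [], []
    for w in weights:
        obs_w = np.ones(len(y))
        obs_w[i] = w                      # only observation i is down-weighted
        root_w = np.sqrt(obs_w)
        # WLS is OLS on rows scaled by the square root of the weights
        beta, *_ = np.linalg.lstsq(X * root_w[:, None], y * root_w, rcond=None)
        resid = y - X @ beta
        coefs.append(beta)
        sses.append(float(np.sum(obs_w * resid ** 2)))  # weighted SSE (assumption)
    return weights, np.array(coefs), np.array(sses)


# Made-up data: an exact line y = 1 + 2x with one contaminated response
x = np.arange(6, dtype=float)
y = 2.0 * x + 1.0
y[5] += 10.0                              # artificial outlier at the last point
X = np.column_stack([np.ones_like(x), x])

weights, coefs, sses = firework_trace(X, y, i=5)
```

Plotting each column of `coefs` and `sses` against `weights`, one curve per candidate observation, gives traces in the spirit of the plots described above: an observation whose curve moves a long way as its weight shrinks has a large influence on that measure.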

References

  1. Beckman, R. J. and Cook, R. D. (1983). Outlier ... s, Technometrics, 25, 119-147.
  2. Belsley, D. A., Kuh, E. and Welsch, R. E. (1980). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity, Wiley, New York.
  3. Cook, R. D. (1977). Detection of influential observation in linear regression, Technometrics, 19, 15-18. https://doi.org/10.2307/1268249
  4. Cook, R. D. (1979). Influential observations in linear regression, Journal of the American Statistical Association, 74, 169-174. https://doi.org/10.1080/01621459.1979.10481634
  5. Cook, R. D. and Weisberg, S. (1989). Regression diagnostics with dynamic graphics, Technometrics, 31, 277-291.
  6. Emerson, J. D. and Strenio, J. (1983). The spread-versus-level plot, in Hoaglin, D. C., Mosteller, F. and Tukey, J. W. (Eds.), Understanding Robust and Exploratory Data Analysis, Wiley, New York.
  7. Fox, J. (2008). Applied Regression Analysis and Generalized Linear Models, 2nd ed., Sage, New York.
  8. Jang, D. H. and Anderson-Cook, C. M. (2013). Firework plot as a graphical exploratory data analysis tool for evaluating the impact of outliers in data exploration and regression, Quality and Reliability Engineering International, in press.
  9. McLean, R. A. and Anderson, V. L. (1966). Extreme vertices design of mixture experiments, Technometrics, 8, 447-454. https://doi.org/10.1080/00401706.1966.10490377
  10. Myers, R. H., Montgomery, D. C. and Anderson-Cook, C. M. (2009). Response Surface Methodology: Process and Product Optimization Using Designed Experiments, 3rd ed., Wiley, New York.
  11. Park, S. H., Kim, Y. H. and Toutenberg, H. (1992). Regression diagnostics for removing an observation with animating graphics, Statistical Papers, 33, 227-240. https://doi.org/10.1007/BF02925327