Suppression for Logistic Regression Model

  • 홍종선 (Statistics, School of Economics, Sungkyunkwan University) ;
  • 김호일 (Kyobo Auto Insurance, CS Planning Department) ;
  • 함주형 (Fuji Xerox Korea, MA 1 Team)
  • Published: 2005.11.01


Suppression in logistic regression models has been discussed less than in linear regression models. One reason is that the regression sum of squares (SSR) and the coefficient of determination ($R^2$) have no unique definition and can be defined in various ways. Starting from four kinds of $R^2$, the two most preferred among the many available and the two adjusted versions proposed by Liao and McGee (2003), we derive the corresponding four kinds of SSR and use them to explain suppression in logistic regression models. Data are generated by Monte Carlo simulation and fitted to logistic models to examine when suppression occurs, and the results are compared with those for linear regression models.
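
For orientation, the linear-regression baseline that the abstract compares against can be illustrated with a minimal sketch. It is not the authors' procedure. It assumes one common criterion from the cited literature, namely that suppression is present when SSR($x_2|x_1$) exceeds SSR($x_2$). The toy data and the helper names (`center`, `dot`, `ssr_one`, `ssr_two`) are hypothetical and chosen only for illustration.

```python
def center(v):
    """Subtract the mean so that an intercept is implicitly fitted."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def ssr_one(y, x):
    """Regression sum of squares for y on a single predictor x."""
    yc, xc = center(y), center(x)
    return dot(xc, yc) ** 2 / dot(xc, xc)


def ssr_two(y, x1, x2):
    """Regression sum of squares for y on x1 and x2 (normal equations)."""
    yc, a, b = center(y), center(x1), center(x2)
    s11, s22, s12 = dot(a, a), dot(b, b), dot(a, b)
    s1y, s2y = dot(a, yc), dot(b, yc)
    det = s11 * s22 - s12 ** 2  # assumes x1 and x2 are not collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1 * s1y + b2 * s2y


# Hypothetical toy data: y is a signal, x2 is pure noise uncorrelated with y,
# and x1 is the signal contaminated by that same noise.
y = [1.0, 1.0, -1.0, -1.0]
x2 = [1.0, -1.0, 1.0, -1.0]
x1 = [s + n for s, n in zip(y, x2)]

ssr_x2 = ssr_one(y, x2)                                  # x2 alone
ssr_x2_given_x1 = ssr_two(y, x1, x2) - ssr_one(y, x1)    # x2 added after x1
suppression = ssr_x2_given_x1 > ssr_x2

print(ssr_x2, ssr_x2_given_x1, suppression)
```

Here $x_2$ explains nothing on its own but removes the noise from $x_1$ once both are in the model, so its sequential sum of squares is larger than its marginal one. For a logistic model the same comparison would need an SSR derived from a chosen $R^2$ definition, which this sketch does not attempt.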



  1. 홍종선. (2004). Suppression and collapsibility for log-linear models, The Korean Communications in Statistics, 11, 519-527
  2. Bishop, Y. M. M., Fienberg, S. E. and Holland, P. W. (1975). Discrete Multivariate Analysis, Cambridge, Massachusetts: MIT Press
  3. Cohen, J. and Cohen, P. (1975). Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences, New Jersey: Lawrence Erlbaum Associates
  4. Conger, A. J. (1974). A revised definition for suppressor variables: A guide to their identification and interpretation, Educational and Psychological Measurement, 34, 35-46
  5. Freund, R. J. (1988). When is $R^{2} > r^{2}_{yx_1} + r^{2}_{yx_2}$? (Revisited), The American Statistician, 42, 89-90
  6. Hamilton, D. (1987). Sometimes $R^{2} > r^{2}_{yx_1} + r^{2}_{yx_2}$: Correlated variables are not always redundant, The American Statistician, 41, 129-132
  7. Hamilton, D. (1988). Reply to Freund and Mitra, The American Statistician, 42, 90-91
  8. Horst, P. (1941). The role of prediction variables which are independent of the criterion, in The Prediction of Personal Adjustment, ed. P. Horst, New York: Social Science Research Council, 431-436
  9. Kvalseth, T. O. (1985). Cautionary note about $R^{2}$, The American Statistician, 39, 279-285
  10. Liao, J. G. and McGee, D. (2003). Adjusted coefficients of determination for logistic regression, The American Statistician, 57, 161-165
  11. Lynn, H. S. (2003). Suppression and confounding in action, The American Statistician, 57, 58-61
  12. Menard, S. (2000). Coefficients of determination for multiple logistic regression analysis, The American Statistician, 54, 17-24
  13. Mitra, S. (1988). The relationship between the multiple and the zero-order correlation coefficients, The American Statistician, 42, 89
  14. Mittlböck, M. and Schemper, M. (1996). Explained variation for logistic regression, Statistics in Medicine, 15, 1987-1997
  15. Schey, H. M. (1993). The relationship between the magnitudes of SSR($x_{2}$) and SSR($x_{2}|x_{1}$): A geometric description, The American Statistician, 47, 26-30
  16. Sharpe, N. R. and Roberts, R. A. (1997). The relationship among sums of squares, correlation coefficients, and suppression, The American Statistician, 51, 46-48
  17. Velicer, W. F. (1978). Suppressor variables and the semipartial correlation coefficient, Educational and Psychological Measurement, 38, 953-958