• Title/Abstract/Keywords: Multicollinearity

Search results: 174 items (processing time: 0.029 s)

ILL-CONDITIONING IN LINEAR REGRESSION MODELS AND ITS DIAGNOSTICS

  • Ghorbani, Hamid
    • Journal of the Korean Society of Mathematical Education Series B: The Pure and Applied Mathematics
    • /
    • Vol. 27, No. 2
    • /
    • pp.71-81
    • /
    • 2020
  • Multicollinearity is a common problem in linear regression models: it arises when two or more regressors are highly correlated, and it causes serious problems for the ordinary least squares estimates of the parameters as well as for model validation and interpretation. This paper first reviews the problem of multicollinearity, its effects on linear regression, and some important measures for detecting it, and then highlights the role of eigenvalues and eigenvectors in detecting multicollinearity. Finally, a real data set is analyzed, and the fitted linear regression model is examined using multicollinearity diagnostics.
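
The eigenvalue-based diagnostics mentioned in this abstract can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the paper's own procedure: the function name `condition_indices`, the simulated regressors, and the unit-length column scaling are all assumptions made for the example. A condition index above about 30 is a commonly cited rule of thumb for a serious near-dependency.

```python
import numpy as np

def condition_indices(X):
    """Eigen-decomposition of the column-scaled X'X and its condition indices."""
    # Scale each column to unit length so the diagnostics do not depend on units.
    Xs = X / np.linalg.norm(X, axis=0)
    eigvals, eigvecs = np.linalg.eigh(Xs.T @ Xs)  # eigenvalues in ascending order
    indices = np.sqrt(eigvals.max() / eigvals)    # one condition index per eigenvalue
    return eigvals, eigvecs, indices

# Synthetic regressors (assumed for illustration): x2 is almost a copy of x1.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

eigvals, eigvecs, indices = condition_indices(X)
worst = eigvecs[:, 0]  # eigenvector belonging to the smallest eigenvalue

print("eigenvalues:", eigvals)
print("condition indices:", indices)
print("weights of the near-dependency:", worst)
```

The eigenvector of the smallest eigenvalue has large weights on the columns involved in the near-dependency (here `x1` and `x2`) and a weight close to zero on the unrelated column.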

Effects of Multicollinearity in Logit Model

  • 류시균
    • Journal of Korean Society of Transportation
    • /
    • Vol. 26, No. 1
    • /
    • pp.113-126
    • /
    • 2008
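  • Multicollinearity, defined as a linear relationship among non-stochastic variables, reduces the reliability of regression models, which are expressed as linear equations in the explanatory variables; it is therefore carefully examined and addressed when regression models are built. This study identifies the effects of multicollinearity on the logit model through structured numerical experiments. By tracking how the goodness-of-fit measures of the estimated model and the reliability measures of its coefficients vary with the degree of correlation among the explanatory variables in the utility function, the following findings were obtained. First, unlike regression models, whose fit can be improved by adding explanatory variables, adding an explanatory variable to the utility function of a logit model may either improve or worsen the fit. Second, when the model is specified with common coefficients, the fit deteriorates as the correlation among generic variables increases. Third, when explanatory variables are highly correlated, their contribution to choice behavior may be overestimated. Fourth, high correlation among explanatory variables lowers the reliability of the estimated coefficients. In conclusion, the study shows that multicollinearity, which has so far received little attention in logit model building, in fact needs to be controlled through careful consideration and appropriate countermeasures.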
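
A small numerical experiment in the spirit of this abstract might look like the sketch below. It is not the study's experimental design: the binary logit fitted by Newton-Raphson, the true coefficients, the sample size, and the correlation levels are all assumptions chosen for illustration. It only shows the fourth finding, that coefficient standard errors grow as the explanatory variables become more correlated.

```python
import numpy as np

def fit_logit(X, y, iters=25):
    """Binary logit by Newton-Raphson; returns coefficients and standard errors."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1.0 - p)
        info = X.T @ (X * W[:, None])              # Fisher information matrix
        beta = beta + np.linalg.solve(info, X.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    info = X.T @ (X * (p * (1.0 - p))[:, None])
    return beta, np.sqrt(np.diag(np.linalg.inv(info)))

def simulate(rho, n=2000, seed=0):
    """Simulate two regressors with correlation rho and fit the logit model."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    Z = rng.multivariate_normal(np.zeros(2), cov, size=n)
    X = np.column_stack([np.ones(n), Z])
    true_beta = np.array([0.0, 1.0, 1.0])          # assumed true coefficients
    prob = 1.0 / (1.0 + np.exp(-X @ true_beta))
    y = (rng.random(n) < prob).astype(float)
    return fit_logit(X, y)

# Standard errors of the fitted coefficients at three correlation levels.
se_by_rho = {rho: simulate(rho)[1] for rho in (0.0, 0.5, 0.95)}
for rho, se in se_by_rho.items():
    print(f"rho={rho:4.2f}  se(beta1)={se[1]:.3f}  se(beta2)={se[2]:.3f}")
```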

Multicollinearity in Logistic Regression

  • Jong-Han Lee; Myung-Hoe Huh
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 2, No. 2
    • /
    • pp.303-309
    • /
    • 1995
  • Many measures for detecting multicollinearity in linear regression have been proposed in the statistics and numerical analysis literature. Among them, the condition number and the variance inflation factor (VIF) are the most popular. In this study, we give new interpretations of the condition number and the VIF in linear regression, using geometry on the explanatory space. Along the same lines, we derive natural measures of the condition number and the VIF for logistic regression. These computer-intensive measures can easily be extended to evaluate multicollinearity in generalized linear models.
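
For reference, the standard linear-regression VIF that this abstract takes as its starting point can be computed as in the sketch below. It covers only the textbook definition, VIF_j = 1 / (1 - R_j^2), and not the geometric interpretation or the logistic-regression extension proposed in the paper; the function name `vif` and the synthetic data are assumptions for the example.

```python
import numpy as np

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the others."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

# Synthetic regressors (assumed): x2 is strongly correlated with x1, x3 is not.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + 0.3 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

vifs = vif(X)
# Equivalent shortcut: the diagonal of the inverse correlation matrix.
vifs_from_corr = np.diag(np.linalg.inv(np.corrcoef(X, rowvar=False)))
print("VIF:", vifs)
```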

A Research on Improving the Evaluation Model for Management Innovative Enterprises

  • 노재확
    • International Commerce and Information Review
    • /
    • Vol. 12, No. 4
    • /
    • pp.279-302
    • /
    • 2010
  • A better model for selecting management-innovative enterprises is needed, since the Korean government provides multiple benefits to the selected enterprises. However, the appropriateness of the current selection model is questionable because the assessment items have not been considered carefully enough. In particular, the two most important assessment items, strategy and performance, are suspected of multicollinearity because their scores are highly correlated. Ignoring multicollinearity among these items leads to erroneous selection, in which the same components are counted twice under different item names. Principal component analysis is applied to extract uncorrelated components, and new estimates are obtained using the resulting principal components. Comparing the estimates obtained with and without principal components shows that the present selection model gives the performance items more weight than their real effect warrants, which is a result of multicollinearity between performance and strategy.
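
The way principal component analysis turns correlated assessment items into uncorrelated components can be sketched as below. The two item scores are simulated, and the variable names `strategy` and `performance` and the 0.9 correlation are hypothetical stand-ins, not the study's data.

```python
import numpy as np

def principal_components(X):
    """Return component scores, eigenvalues and loadings, largest variance first."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return Xc @ eigvecs, eigvals, eigvecs

# Hypothetical item scores with a population correlation of 0.9.
rng = np.random.default_rng(1)
n = 300
strategy = rng.normal(size=n)
performance = 0.9 * strategy + np.sqrt(1 - 0.81) * rng.normal(size=n)
items = np.column_stack([strategy, performance])

scores, eigvals, loadings = principal_components(items)
item_corr = np.corrcoef(items, rowvar=False)[0, 1]
score_corr = np.corrcoef(scores, rowvar=False)[0, 1]
print("correlation between items:", item_corr)
print("correlation between component scores:", score_corr)
print("share of variance per component:", eigvals / eigvals.sum())
```

Regressing the outcome on `scores` instead of `items` avoids counting the shared part of the two items twice.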

CASE INFLUENCE ON MULTIPLE CORRELATION COEFFICIENT

  • KIM, Myung-Geun
    • Journal of applied mathematics & informatics
    • /
    • Vol. 19, No. 1-2
    • /
    • pp.521-525
    • /
    • 2005
  • A case-deletion diagnostic for the multiple correlation coefficient is considered. A method for detecting observations that can hide or create multicollinearity is suggested. A numerical example is given for illustration.
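
One simple way to see how a single observation can create multicollinearity is a brute-force leave-one-out comparison of the squared multiple correlation, sketched below. This is a generic illustration and not the diagnostic derived in the paper; the helper names and the planted outlier at index 0 are assumptions.

```python
import numpy as np

def r_squared(y, X):
    """Squared multiple correlation of y with the columns of X (intercept added)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

def deletion_influence(y, X):
    """Change in R^2 when each case is deleted in turn."""
    full = r_squared(y, X)
    keep = np.ones(len(y), dtype=bool)
    change = np.empty(len(y))
    for i in range(len(y)):
        keep[i] = False
        change[i] = r_squared(y[keep], X[keep]) - full
        keep[i] = True
    return full, change

# Two independent regressors plus one planted extreme case that links them.
rng = np.random.default_rng(2)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
x1[0], x2[0] = 10.0, 10.0

full, change = deletion_influence(x1, x2[:, None])
suspect = int(np.argmax(np.abs(change)))
print("R^2 with all cases:", full)
print("most influential case:", suspect, "change in R^2:", change[suspect])
```

A large negative change means the case was creating the apparent collinearity; a large positive change would mean the case was hiding it.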

The Use of Ridge Regression for Yield Prediction Models with Multicollinearity Problems

  • 신만용
    • Journal of Korean Society of Forest Science
    • /
    • Vol. 79, No. 3
    • /
    • pp.260-268
    • /
    • 1990
  • To obtain more accurate estimating equations when a yield prediction model suffers from multicollinearity, two ridge estimators were compared with the ordinary least squares (OLS) estimator. The ridge estimators used in this study were based on Mallows's (1973) Cp-like statistic and Allen's (1974) PRESS-like statistic. The predictive ability of the three estimators was compared using the yield model developed by Matney et al. (1988). The data came from a total of 522 plots in loblolly pine experimental stands in the southern United States. Both ridge estimators predicted yield better than the OLS estimator, and the ridge estimator based on Mallows's statistic performed best. Ridge estimators can therefore be recommended as an alternative to OLS when the independent variables of a yield prediction model are multicollinear.
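
The general idea of choosing a ridge parameter by a prediction criterion can be sketched as follows. This uses the ordinary leave-one-out PRESS on synthetic data and may differ from the Cp-like and PRESS-like statistics used in the study; the grid of `k` values, the simulated regressors, and the function names are assumptions. The leave-one-out shortcut treats the centering and scaling of the data as fixed.

```python
import numpy as np

def ridge(X, y, k):
    """Ridge estimator (X'X + kI)^{-1} X'y; k = 0 gives OLS."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

def press(X, y, k):
    """Leave-one-out prediction error sum of squares for a fixed k."""
    p = X.shape[1]
    H = X @ np.linalg.solve(X.T @ X + k * np.eye(p), X.T)
    loo_resid = (y - H @ y) / (1.0 - np.diag(H))
    return float(loo_resid @ loo_resid)

# Synthetic, nearly collinear regressors (assumed for illustration).
rng = np.random.default_rng(3)
n = 60
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
y = x1 + x2 + rng.normal(size=n)

X = np.column_stack([x1, x2])
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize the regressors
y = y - y.mean()                           # center the response

grid = np.concatenate([[0.0], np.logspace(-3, 2, 30)])
press_values = np.array([press(X, y, k) for k in grid])
best_k = grid[np.argmin(press_values)]

beta_ols = ridge(X, y, 0.0)
beta_ridge = ridge(X, y, 1.0)
print("k minimizing PRESS:", best_k)
print("OLS coefficients:", beta_ols)
print("ridge coefficients (k = 1):", beta_ridge)
```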

Exploring a Way to Overcome Multicollinearity Problems by Using Hierarchical Construct Model in Structural Equation Model: Focusing on the Factors Influencing Repurchase Intention in Social Commerce

  • 권순동
    • Journal of Information Technology Applications and Management
    • /
    • Vol. 22, No. 2
    • /
    • pp.149-169
    • /
    • 2015
  • This study explores how to overcome multicollinearity problems in structural equation models by building a hierarchical construct model of repurchase intention in social commerce. Based on a review of the social commerce literature, price, quality, service, and social influence were selected as independent variables; as their detailed variables, system quality, information quality, transaction safety, order fulfillment and after-sales service, communication, subjective norms, and reputation were selected. In the empirical analysis of the hierarchical construct model, all independent variables had a significant impact on repurchase intention in social commerce. Next, the study analyzed a competing model in which eight independent variables (price, system quality, information quality, transaction safety, order fulfillment and after-sales service, communication, subjective norms, and reputation) directly influence repurchase intention. In this empirical analysis, system quality, information quality, transaction safety, and communication were insignificant. The results show that the hierarchical construct model is useful for overcoming the multicollinearity problem in structural equation models and for increasing explanatory power.

Optimal fractions in terms of a prediction-oriented measure

  • Lee, Won-Woo
    • Journal of the Korean Statistical Society
    • /
    • Vol. 22, No. 2
    • /
    • pp.209-217
    • /
    • 1993
  • The multicollinearity problem in a multiple linear regression model may have deleterious effects on predictions. Thus, it is desirable to consider the optimal fractions with respect to the unbiased estimate of the mean squared errors of the predicted values. Interestingly, the optimal fractions can also be illuminated by the Bayesian interpretation of the general James-Stein estimators.

Optimizing SVM Ensembles Using Genetic Algorithms in Bankruptcy Prediction

  • Kim, Myoung-Jong;Kim, Hong-Bae;Kang, Dae-Ki
    • Journal of information and communication convergence engineering
    • /
    • Vol. 8, No. 4
    • /
    • pp.370-376
    • /
    • 2010
  • Ensemble learning is a method for improving the performance of classification and prediction algorithms. However, its performance can be degraded by the multicollinearity problem, in which multiple classifiers of an ensemble are highly correlated with one another. This paper proposes genetic algorithm-based optimization techniques for SVM ensembles to solve the multicollinearity problem. Empirical results on bankruptcy prediction for Korean firms indicate that the proposed optimization techniques can improve the performance of the SVM ensemble.