• Title/Abstract/Keywords: explanatory variable


Properties of Regression Coefficient Estimators in Multiple Regression (Comments on the regression coefficients)

  • 강명욱
    • 응용통계연구
    • /
    • Vol. 34, No. 4
    • /
    • pp.589-597
    • /
    • 2021
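  • The regression coefficients in simple and multiple regression differ in meaning; their estimates are not only unequal but can even have opposite signs. Identifying the relative contribution of each explanatory variable is an important part of regression analysis. In a standardized regression model, a standardized regression coefficient can be interpreted as the change in the response variable, measured in standard deviations, when the explanatory variable increases by one standard deviation while the remaining explanatory variables are held fixed. It is well known, however, that the magnitude of a standardized regression coefficient is not a measure of the relative importance of each explanatory variable. In this paper, the estimator of a regression coefficient in multiple regression is expressed as a function of correlation coefficients and coefficients of determination, and is considered from the viewpoint of additional explanatory power and the additional coefficient of determination. We also examine the relationship between correlation coefficients and regression coefficient estimates in various scatter plots and apply it specifically to the case of two explanatory variables.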
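
For the two-variable case mentioned in the abstract above, a standard textbook identity writes the standardized coefficients in terms of pairwise correlations: b1 = (r_y1 - r_y2 r_12) / (1 - r_12^2), and symmetrically for b2. The sketch below checks that identity numerically. It is not taken from the paper; the function name and the synthetic data are assumptions made only for illustration.

```python
import numpy as np

def standardized_coefs_two_predictors(r_y1, r_y2, r_12):
    """Standardized OLS coefficients for two predictors, from pairwise correlations."""
    denom = 1.0 - r_12 ** 2
    b1 = (r_y1 - r_y2 * r_12) / denom
    b2 = (r_y2 - r_y1 * r_12) / denom
    return b1, b2

def zscore(v):
    """Center and scale to unit (population) standard deviation."""
    return (v - v.mean()) / v.std()

# Synthetic, correlated predictors (illustrative data, not from the paper)
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.8 * x1 + rng.normal(scale=0.6, size=200)
y = 1.0 * x1 - 0.5 * x2 + rng.normal(size=200)

# Correlation matrix with rows ordered (y, x1, x2)
r = np.corrcoef(np.vstack([y, x1, x2]))
b1, b2 = standardized_coefs_two_predictors(r[0, 1], r[0, 2], r[1, 2])

# Cross-check: least squares on z-scored variables gives the same coefficients
Z = np.column_stack([zscore(x1), zscore(x2)])
beta, *_ = np.linalg.lstsq(Z, zscore(y), rcond=None)

print("from correlations:", b1, b2)
print("from least squares:", beta)
print("simple correlation of y with x2:", r[0, 2])
```

Comparing the printed simple correlation of y with x2 against b2 shows how the simple and multiple regression views of the same variable can differ, including in sign when the predictors are strongly correlated.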

Two Diagnostic Plots in Constrained Regression

  • Kim, Myung-Geun
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 16, No. 3
    • /
    • pp.495-500
    • /
    • 2009
  • Two diagnostic plots, added variable plot and partial residual plot, are proposed when a new explanatory variable is linearly added to constrained regressions. They are useful for investigating the effect of adding an explanatory variable to the constrained regression. They visually give an overall impression of the strength of linear relationship between response variable and added variable. A numerical example is provided for illustration.
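
As a rough illustration of the two plots named above, the sketch below computes the quantities behind an added variable plot and a partial residual plot for ordinary, unconstrained least squares. The constrained-regression versions proposed in the paper are not reproduced here; the data and variable names are assumptions.

```python
import numpy as np

def residuals(A, v):
    """Residuals of v after least-squares regression on the columns of A."""
    coef, *_ = np.linalg.lstsq(A, v, rcond=None)
    return v - A @ coef

# Synthetic data: X holds an intercept and one existing predictor, z is the candidate
rng = np.random.default_rng(1)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n)])
z = 0.5 * X[:, 1] + rng.normal(size=n)
y = 2.0 + 1.5 * X[:, 1] + 0.7 * z + rng.normal(size=n)

# Added variable plot: residuals of y on X (vertical) against residuals of z on X (horizontal)
e_y = residuals(X, y)
e_z = residuals(X, z)
av_slope = (e_z @ e_y) / (e_z @ e_z)  # slope of the line through the origin in that plot

# Full model including z, for comparison
full = np.column_stack([X, z])
coef_full, *_ = np.linalg.lstsq(full, y, rcond=None)

# Partial residual plot: (full-model residual + b_z * z) against z
partial_resid = (y - full @ coef_full) + coef_full[-1] * z

print("added-variable slope:", av_slope)
print("coefficient of z in the full model:", coef_full[-1])
```

In the unconstrained case the slope in the added variable plot equals the coefficient of the added variable in the full model, which is why the plot gives a visual impression of the strength of that linear relationship.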

Improving Deep Learning Models Considering the Time Lags between Explanatory and Response Variables

  • Chaehyeon Kim;Ki Yong Lee
    • Journal of Information Processing Systems
    • /
    • Vol. 20, No. 3
    • /
    • pp.345-359
    • /
    • 2024
  • A regression model represents the relationship between explanatory and response variables. In real life, explanatory variables often affect a response variable with a certain time lag, rather than immediately. For example, the marriage rate affects the birth rate with a time lag of 1 to 2 years. Although deep learning models have been successfully used to model various relationships, most of them do not consider the time lags between explanatory and response variables. Therefore, in this paper, we propose an extension of deep learning models, which automatically finds the time lags between explanatory and response variables. The proposed method finds out which of the past values of the explanatory variables minimize the error of the model, and uses the found values to determine the time lag between each explanatory variable and response variables. After determining the time lags between explanatory and response variables, the proposed method trains the deep learning model again by reflecting these time lags. Through various experiments applying the proposed method to a few deep learning models, we confirm that the proposed method can find a more accurate model whose error is reduced by more than 60% compared to the original model.
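
The idea of choosing, among past values of an explanatory variable, the one that minimizes model error can be sketched as follows. This is a minimal stand-in that uses a simple linear fit rather than a deep learning model, and it is not the authors' algorithm; `best_lag`, `max_lag`, and the synthetic series are hypothetical names and data.

```python
import numpy as np

def best_lag(x, y, max_lag):
    """Return the lag in 0..max_lag whose lagged x gives the lowest linear-fit MSE for y."""
    errors = {}
    y_aligned = y[max_lag:]  # same target window for every candidate lag
    for lag in range(max_lag + 1):
        # x_{t - lag} aligned with y_t for t = max_lag, ..., n - 1
        x_lagged = x[max_lag - lag : len(x) - lag]
        A = np.column_stack([np.ones(len(x_lagged)), x_lagged])
        coef, *_ = np.linalg.lstsq(A, y_aligned, rcond=None)
        errors[lag] = float(np.mean((y_aligned - A @ coef) ** 2))
    return min(errors, key=errors.get), errors

# Synthetic series in which y responds to x with a delay of 3 steps
rng = np.random.default_rng(0)
n = 300
x = rng.normal(size=n)
y = np.zeros(n)
y[3:] = 2.0 * x[:-3]
y += 0.1 * rng.normal(size=n)

lag, errors = best_lag(x, y, max_lag=6)
print("selected lag:", lag)
```

Evaluating every candidate lag on the same target window keeps the errors comparable; after the lag is chosen, the model would be refitted on the lag-shifted inputs, as the abstract describes.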

Biplots of Multivariate Data Guided by Linear and/or Logistic Regression

  • Huh, Myung-Hoe;Lee, Yonggoo
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 20, No. 2
    • /
    • pp.129-136
    • /
    • 2013
  • Linear regression is the most basic statistical model for exploring the relationship between a numerical response variable and several explanatory variables. Logistic regression secures the role of linear regression for the dichotomous response variable. In this paper, we propose a biplot-type display of the multivariate data guided by the linear regression and/or the logistic regression. The figures show the directional flow of the response variable as well as the interrelationship of explanatory variables.

A Study on the Sex Role Identity and Makeup Behavior of Female College Students (A Study on Sex Role Identity and Makeup Behavior)

  • 구자명;이구영
    • 패션비즈니스
    • /
    • Vol. 6, No. 2
    • /
    • pp.124-136
    • /
    • 2002
  • The objectives of this study were to classify the contents of makeup behavior, to investigate the relationship between makeup behavior and sex role identity, and to examine how makeup behavior and makeup satisfaction were influenced by sex role identity and demographics. To this end, the researchers surveyed 162 women aged 18 through 25. The results of this study are as follows. 1) The four factors of makeup behavior were sexual attractiveness, aesthetic, psychological dependence, and makeup interest. 2) There was a significant positive relationship between makeup behavior and sex role identity. 3) Sexual attractiveness was influenced by femininity and income; the explanatory power of the 2 variables was 8.5%. Aesthetic was influenced by masculinity; the explanatory power of the 1 variable was 9.2%. Psychological dependence was influenced by femininity; the explanatory power of the 1 variable was 8.2%. Makeup interest was influenced by masculinity and age; the explanatory power of the 2 variables was 9.0%. 4) Makeup satisfaction was influenced by sexual attractiveness and aesthetic; the explanatory power of the 2 variables was 22.1%.

A Study on the Design of Tolerance for Process Parameter using Decision Tree and Loss Function

  • 김용준;정영배
    • 산업경영시스템학회지
    • /
    • Vol. 39, No. 1
    • /
    • pp.123-129
    • /
    • 2016
  • In manufacturing, thousands of quality characteristics are measured every day because processes have been automated through advances in computing and technology, and the process is monitored in a database in real time. In particular, data from the design step of the process have contributed to products that meet customer requirements, by extracting useful information from the data and reflecting it in product design. In this study, first, the characteristics in the design-step data and the variables affecting them were analyzed with a decision tree to find the relation between explanatory and target variables. Second, the tolerance of the continuous variables that primarily influence the target variable was derived by applying the decision tree algorithm C4.5. Finally, the target variable, loss, was calculated with a Taguchi loss function and analyzed. The paper compares the general method, in which the values of continuous explanatory variables are used as they are without being transformed to discrete values, with a new method in which the values of continuous explanatory variables are divided into 3 categories. As a result, the tolerance obtained from the new method was more effective in decreasing the target variable, loss, than that from the general method. In addition, the tolerance levels were calculated for the continuous explanatory variables selected as major variables. In further research, a systematic method using data-mining decision trees needs to be developed to categorize continuous variables under various loss-function scenarios.

Graphical Method for Multiple Regression Model

  • 이우리;이의기;홍종선
    • 응용통계연구
    • /
    • Vol. 20, No. 1
    • /
    • pp.195-204
    • /
    • 2007
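  • We propose a regression sum of squares plot, which uses a geometric approach to display multiple regression data graphically. The plot is drawn as a semicircle using the relation that the regression sum of squares of two explanatory variables equals the simple regression sum of squares of one variable plus the increase in the regression sum of squares when the other variable is added to that one-variable model. In the plot, each explanatory variable is represented by a vector, which gives a visual indication of its degree of influence on the response variable. An explanatory variable whose vector lies closer to the horizontal axis can be judged to have more influence on the response variable. In addition, the size of the angle between the vectors of two explanatory variables can be used to diagnose whether suppression occurs.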
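
The decomposition this abstract relies on, SSR(x1, x2) = SSR(x1) + SSR(x2 | x1), can be checked numerically. The sketch below computes the extra sum of squares directly from the part of x2 that is orthogonal to the intercept and x1. It does not draw the semicircle plot proposed in the paper, and the data and helper name are assumptions.

```python
import numpy as np

def fit_ssr(A, y):
    """Regression sum of squares for a least-squares fit whose design A includes an intercept."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted = A @ coef
    return float(np.sum((fitted - y.mean()) ** 2))

# Synthetic data with correlated explanatory variables
rng = np.random.default_rng(2)
n = 80
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 1.0 * x2 + rng.normal(size=n)

ones = np.ones(n)
A1 = np.column_stack([ones, x1])
ssr_x1 = fit_ssr(A1, y)
ssr_x1_x2 = fit_ssr(np.column_stack([ones, x1, x2]), y)

# Extra sum of squares from adding x2, using the component of x2 orthogonal to (1, x1)
g, *_ = np.linalg.lstsq(A1, x2, rcond=None)
x2_perp = x2 - A1 @ g
ssr_x2_given_x1 = float((x2_perp @ y) ** 2 / (x2_perp @ x2_perp))

print("SSR(x1, x2):", ssr_x1_x2)
print("SSR(x1) + SSR(x2 | x1):", ssr_x1 + ssr_x2_given_x1)
```

Because the two pieces come from orthogonal directions, they add up exactly, which is what allows a geometric (vector) representation of the regression sums of squares.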

Tolerance Computation for Process Parameter Considering Loss Cost : In Case of the Larger is better Characteristics

  • 김용준;김근식;박형근
    • 산업경영시스템학회지
    • /
    • Vol. 40, No. 2
    • /
    • pp.129-136
    • /
    • 2017
  • With the rapid recent development of information technology and automation in the manufacturing industries, tens of thousands of quality variables are estimated and categorized in databases every day. Existing statistical methods, or variable selection and interpretation by experts, have limits in supporting proper judgment. Accordingly, various data mining methods, including decision tree analysis, have been developed in recent years. CART and C5.0 are representative algorithms for decision tree analysis, but they have limits in defining the tolerance of continuous explanatory variables. Also, target variables are restricted to information that indicates only product quality, such as the rate of defective products. It is therefore essential to develop an algorithm that improves upon CART and C5.0 and allows access to new quality information such as loss cost. In this study, a new algorithm was developed not only to find the major variables that minimize the target variable, loss cost, but also to overcome the limits of CART and C5.0. The new algorithm defines the tolerance of variables systematically by dividing the continuous explanatory variables into 3 categories. Assuming the larger-the-better characteristic, the performance of the new algorithm was compared with that of the existing ones in the R programming environment, and 10 simulations were performed with 1,000 data sets for each variable. The performance of the new algorithm was verified through a mean test of loss cost. The verification showed that, for the larger-the-better characteristic, the tolerance of continuous explanatory variables found by the new algorithm lowered loss cost more than that found by the existing ones. In conclusion, the new algorithm could be used to find the tolerance of continuous explanatory variables that minimizes process loss, taking into account the loss cost of the products.
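
For reference, the usual textbook form of the Taguchi larger-the-better loss is L(y) = k / y^2, where k is a cost constant. The sketch below evaluates it. The abstract does not state the exact loss formula or constants used in the paper (which worked in R), so the function name, the value of k, and the sample values are assumptions.

```python
import numpy as np

def larger_the_better_loss(y, k=1.0):
    """Taguchi larger-the-better loss k / y**2 for strictly positive characteristic values."""
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("larger-the-better loss requires y > 0")
    return k / y ** 2

# Hypothetical characteristic values and cost constant
losses = larger_the_better_loss([2.0, 4.0], k=8.0)
mean_loss = float(losses.mean())

print("per-unit loss:", losses)   # 8/4 = 2.0 and 8/16 = 0.5
print("mean loss:", mean_loss)    # 1.25
```

Averaging this loss over the units that fall inside a candidate tolerance interval gives one way to compare tolerances by mean loss cost.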

Methodology for Determining Functional Forms in Developing Statistical Collision Models

  • 백종대;험머 조셉
    • 한국도로학회논문집
    • /
    • Vol. 14, No. 5
    • /
    • pp.189-199
    • /
    • 2012
  • PURPOSES: The purpose of this study is to propose a new methodology for developing statistical collision models and to show the validation results of the methodology. METHODS: A new modeling method that introduces variables into the model one by one in a multiplicative form is suggested. A method for choosing the explanatory variables to be introduced into the model is explained. A method for determining the functional form for each explanatory variable is introduced, as well as a parameter estimation procedure. A model selection method is also dealt with. Finally, validation results are provided to demonstrate the efficacy of the final models developed using the method suggested in this study. RESULTS: According to the validation results for total and injury collisions, the predictive power of the models developed using the suggested method was better than that of generalized linear models for the same data. CONCLUSIONS: Using the methodology suggested in this study, we could develop statistical collision models with better predictive power. This was because the methodology enabled us to find the relationship between the dependent variable and each explanatory variable individually, and to find functional forms for those relationships, which are likely to be non-linear.

An educational tool for regression models with dummy variables using Excel VBA

  • 최현석;박철용
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 24, No. 3
    • /
    • pp.593-601
    • /
    • 2013
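  • Regression models sometimes need to include categorical variables as independent variables. Categorical variables in a regression model are quantified through dummy variables. In this study, we implement in Excel VBA (Visual Basic for Applications) an educational tool that displays regression lines together with hypothesis test results for regression models with one quantitative independent variable and one or two categorical independent variables. The hypothesis test results and regression lines are provided step by step for the model with interaction, the model without interaction, and the model without dummy variables. This tool makes the meaning of dummy variables and interaction easier to understand and, further, lets the user judge from the plots which model fits the given data best.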
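
The three models compared step by step in the abstract above can be sketched with one quantitative variable x and one two-level dummy d. This is a Python illustration rather than the paper's Excel VBA tool, and it reports only coefficients and residual sums of squares, not the hypothesis tests; the data and names are assumptions.

```python
import numpy as np

def fit_rss(A, y):
    """Least-squares coefficients and residual sum of squares for design matrix A."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, float(np.sum((y - A @ coef) ** 2))

# Synthetic data: two groups that differ in both intercept and slope
rng = np.random.default_rng(3)
n = 60
x = rng.uniform(0, 10, size=n)
d = np.arange(n) % 2  # dummy variable coding the two categories as 0 and 1
y = 1.0 + 2.0 * x + 3.0 * d + 0.5 * x * d + rng.normal(scale=0.3, size=n)

ones = np.ones(n)
models = {
    "no dummy": np.column_stack([ones, x]),                         # one common line
    "dummy, no interaction": np.column_stack([ones, x, d]),         # parallel lines
    "dummy with interaction": np.column_stack([ones, x, d, x * d]), # separate slopes
}
results = {name: fit_rss(A, y) for name, A in models.items()}

for name, (coef, value) in results.items():
    print(f"{name}: coefficients={np.round(coef, 3)}, RSS={value:.3f}")
```

The dummy coefficient shifts the intercept of the second group and the interaction coefficient shifts its slope, so the three designs correspond to one common line, two parallel lines, and two unrelated lines.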