• Title/Abstract/Keywords: multivariate regression (다변량회귀)

Multivariate quantile regression tree (다변량 분위수 회귀나무 모형에 대한 연구)

  • Kim, Jaeoh;Cho, HyungJun;Bang, Sungwan
    • Journal of the Korean Data and Information Science Society / v.28 no.3 / pp.533-545 / 2017
  • Quantile regression models provide a variety of useful statistical information by estimating the conditional quantile function of the response variable. However, the traditional linear quantile regression model can lead to distorted and incorrect results when analysing real data with a nonlinear relationship between the explanatory variables and the response variables. Furthermore, as the complexity of the data increases, multiple response variables need to be analysed simultaneously and interpreted in a more sophisticated way. For these reasons, we propose a multivariate quantile regression tree model. In this paper, a new split variable selection algorithm is suggested for a multivariate regression tree model. This algorithm can select the split variable more accurately than the previous method without significant selection bias. We investigate the performance of our proposed method with both simulation and real data studies.
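
    The conditional quantile estimation mentioned in this abstract rests on the check (pinball) loss. The following is a minimal sketch of that loss for a single constant prediction, not the tree algorithm proposed in the paper; the function names `pinball_loss` and `empirical_quantile` and the sample data are illustrative assumptions.

```python
def pinball_loss(y, q, tau):
    """Average check (pinball) loss of predicting the constant q at level tau."""
    total = 0.0
    for yi in y:
        r = yi - q
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * r if r >= 0 else (tau - 1.0) * r
    return total / len(y)


def empirical_quantile(y, tau):
    """Return the sample value that minimizes the pinball loss (hypothetical helper)."""
    return min(sorted(y), key=lambda q: pinball_loss(y, q, tau))


# Illustrative data with one outlier; the mean is 22, far from most of the data.
y = [1.0, 2.0, 3.0, 4.0, 100.0]
median = empirical_quantile(y, 0.5)  # minimizer at tau = 0.5
upper = empirical_quantile(y, 0.9)   # minimizer at tau = 0.9
print(median, upper)
```

    Minimizing this loss at tau = 0.5 recovers the sample median, which is why quantile methods are less sensitive to outliers than mean regression.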

Prediction of Retention Time for PAH Molecule in HPLC (고속액체 크로마토그래피에서 PAH분자의 구조에 따른 용리시간 예측)

  • Kim, Young-Gu
    • Journal of the Korean Chemical Society / v.44 no.2 / pp.102-108 / 2000
  • Relative retention times (RRTs) of PAH molecules in HPLC are modeled and then predicted for testing sets using multiple linear regression (MLR) and an artificial neural network (ANN). The main descriptors in QSRR are molecular connectivity ($^1X_v,\;^2X_v$), the length-to-breadth ratio (L/B), and the molecular dipole moment (D). L/B, which is related to the slot model, is a good descriptor in ANN but not in MLR. The variances, which indicate the accuracy of the predicted times in the testing sets, are 0.0099 and 0.0114 for ANN and MLR, respectively. These results show that ANN can exceed MLR in prediction accuracy.

Measuring the Financial Effects of Regulatory Change Using a Multivariate Regression Model (다변량회귀모형을 이용한 규제변동의 재무효과 측정)

  • Yu, Beom-Jun
    • The Korean Journal of Financial Management / v.9 no.1 / pp.83-109 / 1992
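  • This study aims to demonstrate empirically that, for measuring the financial effects of regulatory changes made within the same industry at the same time, the multivariate regression model is a more useful event-study model than the market model. It is better suited to the regulatory characteristics and methodological problems involved, such as multiple, changing announcements spread over a long period, differential stock-return responses among regulated firms, and high correlation among stock-return residuals. As an empirical case of regulatory change, the study treats as events the series of legislative measures and announcements leading up to the "Plan for the Phased Expansion of Capital Market Internationalization" announced by the government on December 2, 1988. It tests various joint hypotheses about the average, individual, and portfolio abnormal returns of banks, securities firms, insurance companies, and investment finance companies in the financial and securities industry, using an unrestricted or a restricted multivariate regression model depending on the constraints placed on parameter estimation. In the hypothesis tests on average, individual, and portfolio abnormal returns across all 13 announcement events, banks and securities firms showed statistically insignificant responses throughout, whereas insurance companies and investment finance companies showed significant average and individual responses to some announcement events as the final announcement date approached. In particular, all financial and securities institutions showed insignificant portfolio responses to every event, so the "wealth transfer hypothesis" proposed by Stigler could not be rejected.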

Principal selected response reduction in multivariate regression (다변량회귀에서 주선택 반응변수 차원축소)

  • Yoo, Jae Keun
    • The Korean Journal of Applied Statistics / v.34 no.4 / pp.659-669 / 2021
  • Multivariate regression often appears in longitudinal or functional data analysis. Since multivariate regression involves multi-dimensional response variables, it is more strongly affected by the so-called curse of dimensionality than univariate regression. To overcome this issue, Yoo (2018) and Yoo (2019a) proposed three model-based response dimension reduction methodologies. According to various numerical studies in Yoo (2019a), the default method suggested in Yoo (2019a) is the least sensitive to the simulated models, but it is not the best one. To address this issue, the paper proposes a selection algorithm that compares the other two methods with the default one. This approach is called principal selected response reduction. Various simulation studies show that the proposed method provides more accurate estimation results than the default one by Yoo (2019a), which confirms the practical and empirical usefulness of the proposed method over the default one.

Comparison of Principal Component Regression and Nonparametric Multivariate Trend Test for Multivariate Linkage (다변량 형질의 유전연관성에 대한 주성분을 이용한 회귀방법와 다변량 비모수 추세검정법의 비교)

  • Kim, Su-Young;Song, Hae-Hiang
    • The Korean Journal of Applied Statistics / v.21 no.1 / pp.19-33 / 2008
  • The linear regression method proposed by Haseman and Elston (1972) for detecting linkage to a quantitative trait in sib pairs is a linkage test for a single locus and a single trait. However, multivariate methods for detecting linkage are needed when information on several traits affected by the same major gene is available for each individual. Amos et al. (1990) extended the regression method of Haseman and Elston (1972) to incorporate observations of two or more traits by estimating the principal component linear function that yields the strongest correlation between the squared pair differences in the trait measurements and identity by descent at a marker locus. However, the probability of type I errors cannot currently be controlled with this method, since the exact distribution of its test statistic is still unknown. In this paper, we propose a multivariate nonparametric trend test for detecting linkage to multiple traits. Using a simulation study, we compare the efficiency of the multivariate nonparametric trend test with that of the method developed by Amos et al. (1990) for quantitative trait data. The simulation results show that the type I error rates of the multivariate nonparametric trend test are close to the predetermined significance levels and that its power is generally high.

Effect of Dimension in Optimal Dimension Reduction Estimation for Conditional Mean Multivariate Regression (다변량회귀 조건부 평균모형에 대한 최적 차원축소 방법에서 차원수가 결과에 미치는 영향)

  • Seo, Eun-Kyoung;Park, Chong-Sun
    • Communications for Statistical Applications and Methods / v.19 no.1 / pp.107-115 / 2012
  • Yoo and Cook (2007) developed an optimal sufficient dimension reduction methodology for the conditional mean in multivariate regression and it is known that their method is asymptotically optimal and its test statistic has a chi-squared distribution asymptotically under the null hypothesis. To check the effect of dimension used in estimation on regression coefficients and the explanatory power of the conditional mean model in multivariate regression, we applied their method to several simulated data sets with various dimensions. A small simulation study showed that it is quite helpful to search for an appropriate dimension for a given data set if we use the asymptotic test for the dimension as well as results from the estimation with several dimensions simultaneously.

Comparison of Forecasting Performance in Multivariate Nonstationary Seasonal Time Series Models (다변량 비정상 계절형 시계열모형의 예측력 비교)

  • Seong, Byeong-Chan
    • Communications for Statistical Applications and Methods / v.18 no.1 / pp.13-21 / 2011
  • This paper studies the analysis of multivariate nonstationary time series with seasonality. Three types of multivariate time series models are considered: a seasonal cointegration model, a nonseasonal cointegration model with seasonal dummies, and a vector autoregressive model in seasonal differences. Their forecasting performance is compared using Korean macroeconomic time series data. The cointegration models produce smaller forecast errors at short horizons; however, when longer forecasting periods are considered, the vector autoregressive model appears preferable.

Evaluation of applicability of pan coefficient estimation method by multiple linear regression analysis (다변량 선형회귀분석을 이용한 증발접시계수 산정방법 적용성 검토)

  • Rim, Chang-Soo
    • Journal of Korea Water Resources Association / v.55 no.3 / pp.229-243 / 2022
  • The effects of monthly meteorological data measured at 11 stations in South Korea on pan coefficient were analyzed to develop the four types of multiple linear regression models for estimating pan coefficients. To evaluate the applicability of developed models, the models were compared with six previous models. Pan coefficients were most affected by air temperature for January, February, March, July, November and December, and by solar radiation for other months. On the whole, for 12 months of the year, the effects of wind speed and relative humidity on pan coefficient were less significant, compared with those of air temperature and solar radiation. For all meteorological stations and months, the model developed by applying 5 independent variables (wind speed, relative humidity, air temperature, ratio of sunshine duration and daylight duration, and solar radiation) for each station was the most effective for evaporation estimation. The model validation results indicate that the multiple linear regression models can be applied to some particular stations and months.

Note on the estimation of informative predictor subspace and projective-resampling informative predictor subspace (다변량회귀에서 정보적 설명 변수 공간의 추정과 투영-재표본 정보적 설명 변수 공간 추정의 고찰)

  • Yoo, Jae Keun
    • The Korean Journal of Applied Statistics / v.35 no.5 / pp.657-666 / 2022
  • An informative predictor subspace is useful for estimating the central subspace when the conditions required by the usual sufficient dimension reduction methods fail. Recently, for multivariate regression, Ko and Yoo (2022) defined a projective-resampling informative predictor subspace, in place of the informative predictor subspace, by adopting the projective-resampling method (Li et al. 2008). The new space is contained in the informative predictor subspace but contains the central subspace. In this paper, a method to directly estimate the informative predictor subspace is proposed, and it is compared with the method of Ko and Yoo (2022) through theoretical aspects and numerical studies. The numerical studies confirm that the Ko-Yoo method is better at estimating the central subspace than the proposed method and is more efficient in the sense that it has less variation in the estimation.

Multivariate Analysis for Clinicians (임상의를 위한 다변량 분석의 실제)

  • Oh, Joo Han;Chung, Seok Won
    • Clinics in Shoulder and Elbow / v.16 no.1 / pp.63-72 / 2013
  • In medical research, multivariate analysis, especially multiple regression analysis, is used to analyze the influence of multiple variables on the result. Multiple regression analysis requires attention to which variables to include in the model and to the problem of multicollinearity that arises with many variables, as well as to the basic assumptions of regression analysis. The fit of the multiple regression model is expressed by the coefficient of determination, $R^2$, and the influence of each independent variable on the result by the regression coefficient, ${\beta}$. Multiple regression analysis can be divided into multiple linear regression analysis, multiple logistic regression analysis, and Cox regression analysis according to the type of dependent variable (continuous variable, categorical variable (binary logit), and state variable, respectively), and the influence of variables on the result is evaluated by the regression coefficient ${\beta}$, the odds ratio, and the hazard ratio, respectively. Knowledge of multivariate analysis enables clinicians to analyze results accurately and to design further research efficiently.
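
    As an illustration of the regression coefficients and $R^2$ described in this abstract, the following is a minimal pure-Python sketch of multiple linear regression fitted through the normal equations. It is not taken from the article; the helper names `solve` and `fit_ols` and the synthetic data are assumptions made for the example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(xs, y):
    """Fit y = b0 + b1*x1 + ... + bk*xk; return (coefficients, R^2)."""
    design = [[1.0] + list(row) for row in xs]  # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Synthetic, noise-free data generated from y = 1 + 2*x1 - 0.5*x2.
xs = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0)]
y = [1.0 + 2.0 * a - 0.5 * b for a, b in xs]
beta, r2 = fit_ols(xs, y)
print(beta, r2)
```

    Because the synthetic data contain no noise, the fit should recover the generating coefficients with $R^2$ close to 1. With strongly collinear predictors the normal equations become ill-conditioned, which is one practical face of the multicollinearity problem the abstract mentions.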