• Title/Summary/Keyword: multivariate linear regression (다변수 선형 회귀)


How to Measure Nonlinear Dependence in Hydrologic Time Series (시계열 수문자료의 비선형 상관관계)

  • Mun, Yeong-Il
    • Journal of Korea Water Resources Association / v.30 no.6 / pp.641-648 / 1997
  • Mutual information is useful for analyzing nonlinear dependence in time series in much the same way as correlation is used to characterize linear dependence. We use multivariate kernel density estimators to estimate mutual information at different time lags for single and multiple time series. This approach is tested on a variety of hydrologic data sets and suggests an appropriate delay time $\tau$ at which the mutual information is almost zero; multi-dimensional phase portraits can then be constructed from measurements of a single scalar time series.
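
The lagged mutual information estimate described in this abstract can be sketched as follows. This is a minimal illustration and not the author's implementation: it assumes a Gaussian product kernel with one shared bandwidth for both variables, and the function names, the bandwidth value, and the synthetic sine series are all hypothetical.

```python
import math


def kde_mutual_information(x, y, bandwidth):
    """Estimate I(X;Y) in nats using Gaussian product-kernel density estimates."""
    n = len(x)
    total = 0.0
    for i in range(n):
        # Unnormalised Gaussian kernel weights centred on sample i.
        kx = [math.exp(-0.5 * ((x[i] - xj) / bandwidth) ** 2) for xj in x]
        ky = [math.exp(-0.5 * ((y[i] - yj) / bandwidth) ** 2) for yj in y]
        joint = sum(a * b for a, b in zip(kx, ky))
        # The kernel normalising constants cancel in p(x,y) / (p(x) p(y)).
        total += math.log(n * joint / (sum(kx) * sum(ky)))
    return total / n


def lagged_mutual_information(series, lag, bandwidth):
    """Mutual information between the series and a copy delayed by `lag` steps."""
    n = len(series) - lag
    return kde_mutual_information(series[:n], series[lag:], bandwidth)


# Hypothetical example: a synthetic sine series, scanned over a few lags.
series = [math.sin(0.3 * t) for t in range(200)]
mi_by_lag = {lag: lagged_mutual_information(series, lag, 0.3) for lag in (0, 1, 5, 10)}
```

Scanning `mi_by_lag` for the lag at which the estimate first approaches zero mirrors the delay-time selection the abstract describes. The double loop costs O(n²) kernel evaluations, so this form is only practical for short series.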


Multivariate quantile regression tree (다변량 분위수 회귀나무 모형에 대한 연구)

  • Kim, Jaeoh;Cho, HyungJun;Bang, Sungwan
    • Journal of the Korean Data and Information Science Society / v.28 no.3 / pp.533-545 / 2017
  • Quantile regression models provide a variety of useful statistical information by estimating the conditional quantile function of the response variable. However, the traditional linear quantile regression model can lead to distorted and incorrect results when analysing real data with a nonlinear relationship between the explanatory variables and the response variables. Furthermore, as the complexity of the data increases, it becomes necessary to analyse multiple response variables simultaneously and to interpret them in a more sophisticated way. For these reasons, we propose a multivariate quantile regression tree model. In this paper, a new split variable selection algorithm is suggested for a multivariate regression tree model. This algorithm can select the split variable more accurately than the previous method, without significant selection bias. We investigate the performance of our proposed method with both simulation and real data studies.

Evaluation of applicability of pan coefficient estimation method by multiple linear regression analysis (다변량 선형회귀분석을 이용한 증발접시계수 산정방법 적용성 검토)

  • Rim, Chang-Soo
    • Journal of Korea Water Resources Association / v.55 no.3 / pp.229-243 / 2022
  • The effects of monthly meteorological data measured at 11 stations in South Korea on the pan coefficient were analyzed to develop four types of multiple linear regression models for estimating pan coefficients. To evaluate the applicability of the developed models, they were compared with six previous models. Pan coefficients were most affected by air temperature in January, February, March, July, November and December, and by solar radiation in the other months. Across all 12 months of the year, the effects of wind speed and relative humidity on the pan coefficient were less significant than those of air temperature and solar radiation. For all meteorological stations and months, the model developed by applying 5 independent variables (wind speed, relative humidity, air temperature, ratio of sunshine duration to daylight duration, and solar radiation) for each station was the most effective for evaporation estimation. The model validation results indicate that the multiple linear regression models can be applied to some particular stations and months.

Multivariate Analysis for Clinicians (임상의를 위한 다변량 분석의 실제)

  • Oh, Joo Han;Chung, Seok Won
    • Clinics in Shoulder and Elbow / v.16 no.1 / pp.63-72 / 2013
  • In medical research, multivariate analysis, especially multiple regression analysis, is used to analyze the influence of multiple variables on the result. Multiple regression analysis must consider which variables to include in the model and the problem of multicollinearity that arises when there are many variables, as well as the basic assumptions of regression analysis. The fit of the multiple regression model is expressed by the coefficient of determination, $R^2$, and the influence of each independent variable on the result by a regression coefficient, ${\beta}$. Multiple regression analysis can be divided into multiple linear regression analysis, multiple logistic regression analysis, and Cox regression analysis according to the type of dependent variable (continuous variable, categorical variable (binary logit), and state variable, respectively), and the influence of variables on the result is evaluated by the regression coefficient ${\beta}$, odds ratio, and hazard ratio, respectively. Knowledge of multivariate analysis enables clinicians to analyze results accurately and to design further research efficiently.
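
To make the two quantities named in this abstract concrete, the sketch below fits a multiple linear regression by ordinary least squares and reports the regression coefficients and the coefficient of determination. It is an illustrative example only: the function names and the small data set are hypothetical, and the normal-equations solver stands in for whatever statistical package a clinician would actually use.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            # A singular matrix here signals perfect multicollinearity.
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_linear_regression(rows, y):
    """Return (beta, r_squared); beta[0] is the intercept."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Hypothetical noise-free data generated from y = 1 + 2*x1 + 3*x2.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [9.0, 8.0, 19.0, 18.0, 26.0]
beta, r_squared = fit_multiple_linear_regression(rows, y)
```

Because the example data are noise-free, the fit recovers the generating coefficients and an $R^2$ of essentially 1. The `ValueError` branch shows the most extreme form of multicollinearity: when one predictor is an exact linear combination of the others, the coefficients are not uniquely determined.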

Prediction of Gas Chromatographic Retention Times of PAH Using QSRR (기체크로마토그래피에서 QSRR을 통한 PAH 용리시간 예측)

  • Kim, Young Gu
    • Journal of the Korean Chemical Society / v.45 no.5 / pp.422-428 / 2001
  • Relative retention times (RRTs) of PAH molecules and their derivatives in gas chromatography are trained and then predicted for testing sets using multiple linear regression (MLR) and an artificial neural network (ANN). The main descriptors of PAHs and their derivatives in QSRR are the square root of molecular weight (sqmw), molecular connectivity ($^1{\chi}_v$), molecular dipole moment (D), and length-to-breadth ratio (L/B). The MLR results show that a heavy molecule tends to have a long retention time. L/B, which is closely related to the slot model, is a good descriptor in MLR. On the other hand, the ANN, which is not affected by linear dependencies among the descriptors, was based exclusively on molecular weight and molecular dipole moment. The variances, which indicate the accuracy of retention-time prediction in the testing sets, are 1.860 and 0.206 for MLR and ANN, respectively. This shows that ANN can exceed MLR in prediction accuracy.


Effect of Dimension in Optimal Dimension Reduction Estimation for Conditional Mean Multivariate Regression (다변량회귀 조건부 평균모형에 대한 최적 차원축소 방법에서 차원수가 결과에 미치는 영향)

  • Seo, Eun-Kyoung;Park, Chong-Sun
    • Communications for Statistical Applications and Methods / v.19 no.1 / pp.107-115 / 2012
  • Yoo and Cook (2007) developed an optimal sufficient dimension reduction methodology for the conditional mean in multivariate regression. Their method is known to be asymptotically optimal, and its test statistic asymptotically follows a chi-squared distribution under the null hypothesis. To check the effect of the dimension used in estimation on the regression coefficients and the explanatory power of the conditional mean model in multivariate regression, we applied their method to several simulated data sets with various dimensions. A small simulation study showed that, when searching for an appropriate dimension for a given data set, it is quite helpful to use the asymptotic test for the dimension together with the estimation results for several dimensions.

Mesh Stiffness Prediction Models for Aircraft Power Train Systems Using Machine Learning Ensemble (머신러닝 앙상블을 사용한 항공기 동력 전달 체계의 물림 강성 예측 모델)

  • Yeonjoon Kang;Yeonhi Kim;Jungsun Park
    • Journal of Aerospace System Engineering / v.18 no.5 / pp.1-14 / 2024
  • This paper aimed to develop mesh stiffness prediction models that take spur gear design parameters as input variables, using a machine learning ensemble method. A dataset was generated by calculating individual stiffness with a calculation method presented in previous studies and deriving the minimum and maximum values of total mesh stiffness. Using multivariate linear regression, support vector regression, and decision tree regression, models were created to predict the minimum and maximum values of mesh stiffness. The stacking ensemble method was then used to create meta models, with the prediction models of the three algorithms as base models. These ensemble meta models were verified against the specifications of gears used in actual aircraft engine starters and showed very high prediction performance. This confirmed the feasibility and effectiveness of applying ensemble meta models to an actual gear system.

Penalized least distance estimator in the multivariate regression model (다변량 선형회귀모형의 벌점화 최소거리추정에 관한 연구)

  • Jungmin Shin;Jongkyeong Kang;Sungwan Bang
    • The Korean Journal of Applied Statistics / v.37 no.1 / pp.1-12 / 2024
  • In many real-world data sets, multiple response variables depend on the same set of explanatory variables. In particular, if several response variables are correlated with each other, simultaneous estimation that accounts for the correlation between response variables may be more effective than analysing each response variable individually. In this multivariate regression setting, the least distance estimator (LDE) estimates the regression coefficients simultaneously by minimizing the distance between each training observation and its estimate in a multidimensional Euclidean space. It is also robust to outliers. In this paper, we examine the least distance estimation method in multivariate linear regression analysis and present the penalized least distance estimator (PLDE) for efficient variable selection. We propose the LDE with an adaptive group LASSO penalty term (AGLDE), which can reflect the correlation between response variables in the model and can efficiently select variables according to the importance of the explanatory variables. The validity of the proposed method was confirmed through simulations and real data analysis.
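
For reference, the least distance criterion described in this abstract can be written out as below. This is a sketch in assumed notation, since none of these symbols appear in the abstract: $y_i \in \mathbb{R}^q$ is the response vector and $x_i \in \mathbb{R}^p$ the predictor vector of observation $i$, $B$ is the $p \times q$ coefficient matrix with rows $b_j$, $\lambda \ge 0$ is a tuning parameter, and $w_j$ are adaptive weights. The exact penalty and weights used by the authors may differ.

```latex
\hat{B}_{\mathrm{LDE}} = \arg\min_{B} \sum_{i=1}^{n} \lVert y_i - B^{\top} x_i \rVert_2,
\qquad
\hat{B}_{\mathrm{AGLDE}} = \arg\min_{B} \sum_{i=1}^{n} \lVert y_i - B^{\top} x_i \rVert_2
  + \lambda \sum_{j=1}^{p} w_j \lVert b_j \rVert_2 .
```

Because the residual norm is not squared, a single large residual contributes linearly rather than quadratically, which is where the robustness to outliers comes from. The group penalty acts on a whole row $b_j$, so a predictor is dropped from all responses at once when its row shrinks to zero.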

Locally Weighted Polynomial Forecasting Model (지역가중다항식을 이용한 예측모형)

  • Mun, Yeong-Il
    • Journal of Korea Water Resources Association / v.33 no.1 / pp.31-38 / 2000
  • Relationships between hydrologic variables are often nonlinear, and the functional form of such a relationship is usually not known a priori. A multivariate, nonparametric regression methodology is provided here for approximating the underlying regression function using locally weighted polynomials. Locally weighted polynomials approximate the target function through a Taylor series expansion of the function in the neighborhood of the point of estimate. The utility of this nonparametric regression approach is demonstrated through an application to short-term forecasts of the biweekly Great Salt Lake volume.
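
A first-order (local linear) instance of the locally weighted polynomial idea in this abstract can be sketched as follows. It is a minimal univariate illustration rather than the author's multivariate method: the Gaussian weight function, the bandwidth, the function name, and the synthetic data are all assumptions made for the example.

```python
import math


def local_linear_estimate(xs, ys, x0, bandwidth):
    """Estimate the regression function at x0 by a weighted straight-line fit."""
    # Gaussian weights: observations near x0 dominate the fit.
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    w_sum = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, ys)) / w_sum
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(weights, xs, ys))
    # Weighted least-squares slope; fall back to a local mean if x is degenerate.
    slope = sxy / sxx if sxx > 0 else 0.0
    return y_bar + slope * (x0 - x_bar)


# Hypothetical data: a sine curve sampled on a coarse grid.
xs = [0.5 * i for i in range(21)]
ys = [math.sin(x) for x in xs]
estimate = local_linear_estimate(xs, ys, 3.2, 0.5)
```

The straight line plays the role of the first two terms of the Taylor expansion around `x0`. Higher polynomial degrees add further terms at the cost of needing more nearby observations, and the bandwidth controls how local the fit is.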


Comparison of Principal Component Regression and Nonparametric Multivariate Trend Test for Multivariate Linkage (다변량 형질의 유전연관성에 대한 주성분을 이용한 회귀방법와 다변량 비모수 추세검정법의 비교)

  • Kim, Su-Young;Song, Hae-Hiang
    • The Korean Journal of Applied Statistics / v.21 no.1 / pp.19-33 / 2008
  • The linear regression method proposed by Haseman and Elston (1972) for detecting linkage to a quantitative trait in sib pairs is a linkage testing method for a single locus and a single trait. However, multivariate methods for detecting linkage are needed when information on several traits affected by the same major gene is available for each individual. Amos et al. (1990) extended the regression method of Haseman and Elston (1972) to incorporate observations of two or more traits by estimating the principal component linear function that yields the strongest correlation between the squared pair differences in the trait measurements and identity by descent at a marker locus. At present, however, the probability of type I errors cannot be controlled with this method, since the exact distribution of the statistic it uses is still unknown. In this paper, we propose a multivariate nonparametric trend test for detecting linkage to multiple traits. Using a simulation study, we compared the efficiency of the multivariate nonparametric trend test with that of the method developed by Amos et al. (1990) for quantitative trait data. For the multivariate nonparametric trend test, the simulation results show that the type I error rates are close to the predetermined significance levels and that the test generally has high power.