• Title/Summary/Keyword: multivariate regression analysis (다변량 회귀분석)

Comparison of Principal Component Regression and Nonparametric Multivariate Trend Test for Multivariate Linkage (다변량 형질의 유전연관성에 대한 주성분을 이용한 회귀방법와 다변량 비모수 추세검정법의 비교)

  • Kim, Su-Young;Song, Hae-Hiang
    • The Korean Journal of Applied Statistics / v.21 no.1 / pp.19-33 / 2008
  • The linear regression method proposed by Haseman and Elston (1972) for detecting linkage to a quantitative trait in sib pairs tests linkage for a single locus and a single trait. However, multivariate methods for detecting linkage are needed when each individual provides information on several traits that are affected by the same major gene. Amos et al. (1990) extended the regression method of Haseman and Elston (1972) to incorporate observations on two or more traits by estimating the principal component linear function that yields the strongest correlation between the squared pair differences in the trait measurements and identity by descent at a marker locus. At present, however, the probability of type I error cannot be controlled with this method, since the exact distribution of its test statistic is still unknown. In this paper, we propose a multivariate nonparametric trend test for detecting linkage to multiple traits. Using a simulation study on quantitative trait data, we compared the efficiency of the multivariate nonparametric trend test with that of the method developed by Amos et al. (1990). The simulation results show that, for the multivariate nonparametric trend test, the type I error rates are close to the predetermined significance levels and the power is generally high.
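As a rough illustration of the single-trait regression idea that this abstract builds on, the sketch below regresses squared sib-pair trait differences on the proportion of alleles shared identical by descent (IBD); a clearly negative slope is the pattern usually read as evidence of linkage. The function name and the toy numbers are assumptions made for illustration only. They are not taken from the paper, and the sketch implements neither the principal component extension of Amos et al. (1990) nor the proposed nonparametric trend test.

```python
def haseman_elston_slope(sq_diffs, ibd):
    """Ordinary least squares slope of squared sib-pair differences on IBD sharing."""
    n = len(ibd)
    mean_x = sum(ibd) / n
    mean_y = sum(sq_diffs) / n
    # Slope = covariance(x, y) / variance(x), written with raw sums.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ibd, sq_diffs))
    sxx = sum((x - mean_x) ** 2 for x in ibd)
    return sxy / sxx


# Made-up toy data: IBD proportion at the marker and squared trait difference per sib pair.
ibd = [0.0, 0.0, 0.5, 0.5, 1.0, 1.0]
sq_diffs = [4.0, 3.0, 2.5, 1.5, 1.0, 0.0]

slope = haseman_elston_slope(sq_diffs, ibd)
print(slope)  # negative here: pairs sharing more alleles IBD differ less in the trait
```

In practice the slope would be tested against zero with a one-sided test; that step is omitted here.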

Principal selected response reduction in multivariate regression (다변량회귀에서 주선택 반응변수 차원축소)

  • Yoo, Jae Keun
    • The Korean Journal of Applied Statistics / v.34 no.4 / pp.659-669 / 2021
  • Multivariate regression often appears in longitudinal or functional data analysis. Since multivariate regression involves multi-dimensional response variables, it is more strongly affected by the so-called curse of dimensionality than univariate regression. To overcome this issue, Yoo (2018) and Yoo (2019a) proposed three model-based response dimension reduction methodologies. According to various numerical studies in Yoo (2019a), the default method suggested there is the least sensitive to the simulated models, but it is not the best one. To resolve this issue, the paper proposes a selection algorithm that compares the other two methods with the default one. This approach is called principal selected response reduction. Various simulation studies show that the proposed method provides more accurate estimation results than the default method of Yoo (2019a), which confirms the practical and empirical usefulness of the proposed method over the default one.

Multivariate Analysis for Clinicians (임상의를 위한 다변량 분석의 실제)

  • Oh, Joo Han;Chung, Seok Won
    • Clinics in Shoulder and Elbow / v.16 no.1 / pp.63-72 / 2013
  • In medical research, multivariate analysis, especially multiple regression analysis, is used to analyze the influence of multiple variables on a result. Multiple regression analysis requires attention to which variables are included in the model and to multicollinearity, which arises because many variables are involved, as well as to the basic assumptions of regression analysis. The fit of a multiple regression model is expressed by the coefficient of determination, $R^2$, and the influence of each independent variable on the result by a regression coefficient, ${\beta}$. Multiple regression analysis can be divided into multiple linear regression analysis, multiple logistic regression analysis, and Cox regression analysis according to the type of dependent variable (continuous variable, categorical variable (binary logit), and state variable, respectively), and the influence of variables on the result is evaluated by the regression coefficient ${\beta}$, the odds ratio, and the hazard ratio, respectively. Knowledge of multivariate analysis enables clinicians to analyze results accurately and to design further research efficiently.
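The quantities named in this abstract can be sketched in a few lines. The example below assumes the standard definitions: $R^2 = 1 - SS_{res}/SS_{tot}$, and an odds ratio (logistic regression) or hazard ratio (Cox regression) equal to $e^{\beta}$ for a one-unit increase in a predictor. The function names are hypothetical and the sketch is not drawn from the article itself.

```python
import math


def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


def ratio_from_coefficient(beta):
    """Odds ratio (logistic) or hazard ratio (Cox) for a one-unit increase."""
    return math.exp(beta)


# A coefficient of 0 means no effect, so the ratio is exactly 1.
print(ratio_from_coefficient(0.0))
# Fitted values that only reproduce the mean explain none of the variance.
print(r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
```

A ratio above 1 indicates higher odds (or hazard) as the predictor increases, and a ratio below 1 indicates lower odds (or hazard).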

Multivariate quantile regression tree (다변량 분위수 회귀나무 모형에 대한 연구)

  • Kim, Jaeoh;Cho, HyungJun;Bang, Sungwan
    • Journal of the Korean Data and Information Science Society / v.28 no.3 / pp.533-545 / 2017
  • Quantile regression models provide a variety of useful statistical information by estimating the conditional quantile function of the response variable. However, the traditional linear quantile regression model can lead to distorted and incorrect results when analysing real data with a nonlinear relationship between the explanatory variables and the response variables. Furthermore, as the complexity of the data increases, multiple response variables need to be analysed simultaneously and interpreted in a more sophisticated way. For these reasons, we propose a multivariate quantile regression tree model. In this paper, a new split variable selection algorithm is suggested for a multivariate regression tree model. This algorithm can select the split variable more accurately than the previous method, without significant selection bias. We investigate the performance of the proposed method with both simulation and real data studies.

Evaluation of applicability of pan coefficient estimation method by multiple linear regression analysis (다변량 선형회귀분석을 이용한 증발접시계수 산정방법 적용성 검토)

  • Rim, Chang-Soo
    • Journal of Korea Water Resources Association / v.55 no.3 / pp.229-243 / 2022
  • The effects of monthly meteorological data measured at 11 stations in South Korea on the pan coefficient were analyzed to develop four types of multiple linear regression models for estimating pan coefficients. To evaluate their applicability, the developed models were compared with six previous models. Pan coefficients were most affected by air temperature in January, February, March, July, November, and December, and by solar radiation in the other months. Over all 12 months of the year, the effects of wind speed and relative humidity on the pan coefficient were less significant than those of air temperature and solar radiation. Across all meteorological stations and months, the model developed for each station with 5 independent variables (wind speed, relative humidity, air temperature, ratio of sunshine duration to daylight duration, and solar radiation) was the most effective for evaporation estimation. The model validation results indicate that the multiple linear regression models can be applied to some particular stations and months.

Comparison of Forecasting Performance in Multivariate Nonstationary Seasonal Time Series Models (다변량 비정상 계절형 시계열모형의 예측력 비교)

  • Seong, Byeong-Chan
    • Communications for Statistical Applications and Methods / v.18 no.1 / pp.13-21 / 2011
  • This paper studies the analysis of multivariate nonstationary time series with seasonality. Three types of multivariate time series models are considered: a seasonal cointegration model, a nonseasonal cointegration model with seasonal dummies, and a vector autoregressive model in seasonal differences. Their forecasting performance is compared using Korean macroeconomic time series data. The cointegration models produce smaller forecast errors at short horizons; however, when longer forecasting periods are considered, the vector autoregressive model appears preferable.
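The third model in this abstract works on seasonally differenced data, that is, on $y_t - y_{t-s}$ with $s$ the seasonal period (4 for quarterly data). A minimal sketch of that transformation for a multivariate series follows; the function name and the numbers are made up for illustration, and fitting the vector autoregression itself is left out.

```python
def seasonal_difference(series, period):
    """Return y_t - y_{t-period} for every variable of a multivariate series.

    `series` is a list of observations, each a list with one value per variable.
    The first `period` observations are dropped because they have no lagged partner.
    """
    return [
        [cur - prev for cur, prev in zip(series[t], series[t - period])]
        for t in range(period, len(series))
    ]


# Made-up quarterly observations of two variables over two years.
quarterly = [
    [10, 100], [12, 110], [14, 120], [11, 105],
    [11, 104], [13, 115], [16, 126], [12, 108],
]

diffs = seasonal_difference(quarterly, period=4)
print(diffs)  # one row per quarter of the second year
```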

Statistical Outliers in Florida Counties at the Presidential Election 2000 (2000년 미국대선 플로리다주의 투표결과 분석)

  • 김현철
    • The Korean Journal of Applied Statistics / v.15 no.1 / pp.21-32 / 2002
  • We searched for outliers in the vote data of the State of Florida from the 2000 presidential election, using multivariate regression analysis. We found several outliers, including Palm Beach County. This implies that, to argue that the "Butterfly Ballot" made Palm Beach County an outlier, the number of disqualified double-punched ballots should be analyzed along with the votes.

Prediction of Retention Time for PAH Molecule in HPLC (고속액체 크로마토그래피에서 PAH분자의 구조에 따른 용리시간 예측)

  • Kim, Young-Gu
    • Journal of the Korean Chemical Society / v.44 no.2 / pp.102-108 / 2000
  • Relative retention times (RRTs) of PAH molecules in HPLC are trained and then predicted in testing sets using multiple linear regression (MLR) and an artificial neural network (ANN). The main descriptors in QSRR are molecular connectivity ($^1X_v,\;^2X_v$), the length-to-breadth ratio (L/B), and the molecular dipole moment (D). L/B, which is related to the slot model, is a good descriptor in ANN but not in MLR. The variances, which show the accuracy of the predicted times in the testing sets, are 0.0099 and 0.0114 for ANN and MLR, respectively. It was shown that ANN can exceed MLR in prediction accuracy.

An Integrated Algorithm for Corporate Bankruptcy Prediction (기업부도예측을 위한 통합알고리즘)

  • Bae Jae-Gwon;Kim Jin-Hwa
    • Proceedings of the Korea Intelligent Information System Society Conference / 2006.06a / pp.195-202 / 2006
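  • For more effective corporate bankruptcy prediction, this study proposes an integrated model that combines statistical and artificial intelligence methods. To this end, we present the Voting with Performance & Weights from ANN (WP-ANN) integrated model, which combines five methodologies: multivariate discriminant analysis and logistic regression, the most widely used statistical models, and artificial neural networks, rule induction, and Bayesian networks, which have recently come into wide use as artificial intelligence methods. In the experiments, the proposed WP-ANN integrated model showed the best prediction accuracy when compared with the single models (multivariate discriminant analysis, logistic regression, artificial neural network, rule induction, and Bayesian network). This study therefore shows that the WP-ANN integrated model achieves better prediction accuracy than existing models in corporate bankruptcy prediction.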
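The abstract does not spell out how WP-ANN derives its weights, so the sketch below only illustrates the general idea of combining several classifiers by weighted voting. It is not the authors' WP-ANN procedure: the function name, the weights (standing in for per-model performance scores), and the predictions are all hypothetical.

```python
def weighted_vote(predictions, weights, threshold=0.5):
    """Combine binary predictions (1 = bankrupt, 0 = healthy) by weighted voting."""
    total = sum(weights[name] for name in predictions)
    # Weighted share of the models that voted "bankrupt".
    score = sum(weights[name] * predictions[name] for name in predictions) / total
    return 1 if score >= threshold else 0


# Hypothetical weights, e.g. each model's validation accuracy (made-up values).
weights = {"MDA": 0.70, "logit": 0.75, "ANN": 0.80, "rules": 0.65, "bayes_net": 0.60}
# Hypothetical predictions of the five single models for one firm.
predictions = {"MDA": 1, "logit": 0, "ANN": 1, "rules": 0, "bayes_net": 0}

label = weighted_vote(predictions, weights)
print(label)  # the two "bankrupt" votes carry less than half of the total weight
```

With equal weights this reduces to simple majority voting; unequal weights let better-performing models count for more.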

Rock TBM design model derived from the multi-variate regression analysis of TBM driving data (TBM 굴진자료의 다변량 회귀분석에 의한 암반대응형 TBM의 설계모델 도출)

  • Chang, Soo-Ho;Choi, Soon-Wook;Lee, Gyu-Phil;Bae, Gyu-Jin
    • Journal of Korean Tunnelling and Underground Space Association / v.13 no.6 / pp.531-555 / 2011
  • This study aims to derive statistical models for estimating the required specifications of a rock TBM and for designing a cutterhead suitable for a given rock mass condition. From a series of multivariate regression analyses of 871 TBM driving data and 51 linear rock cutting test results, optimum models were newly proposed that consider a variety of rock properties and mechanical cutting conditions. When the derived models were applied to two domestic shield tunnels, their predictions of cutter penetration depth, cutter acting forces, and cutter spacing were very close to the real TBM driving data, showing their high applicability.