• Title/Summary/Keyword: Multivariate Linear Regression


On a Bayesian Estimation of Multivariate Regression Models with Constrained Coefficient Matrix

  • Kim, Hea-Jung
    • Journal of Korean Society for Quality Management / v.26 no.4 / pp.151-165 / 1998
  • Consider the linear multivariate regression model $Y=X_1B_1+X_2B_2+U$, where $\mathrm{Vec}(U) \sim N(0, \Sigma \otimes I_N)$. This paper is concerned with Bayesian inference for the model when the elements of $B_2$ are suspected to be constrained to intervals. The use of the Gibbs sampler as a method for calculating Bayesian marginal posterior densities of the parameters under a generalized conjugate prior is developed. It is shown that the approach is straightforward to specify distributionally and to implement computationally, with output readily adapted for required inference summaries. The method developed is applied to a real problem.
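
The abstract does not spell out the sampler's conditional distributions. As a rough illustration only, the sketch below assumes that the full conditional of one interval-constrained coefficient is a normal distribution truncated to that interval, and draws from it by inverse-CDF sampling. The function name `sample_truncated_normal`, the interval [0, 1], and the mean and standard deviation are all hypothetical choices, not values from the paper.

```python
import random
from statistics import NormalDist


def sample_truncated_normal(mean, sd, lower, upper, rng):
    """Draw one value from N(mean, sd^2) restricted to [lower, upper].

    Uses inverse-CDF sampling: pick a probability uniformly between
    cdf(lower) and cdf(upper), then map it back through the quantile function.
    """
    dist = NormalDist(mean, sd)
    p_lo, p_hi = dist.cdf(lower), dist.cdf(upper)
    p = p_lo + rng.random() * (p_hi - p_lo)
    # inv_cdf requires 0 < p < 1, so keep p strictly inside that range.
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    x = dist.inv_cdf(p)
    # Clamp to absorb floating-point error at the interval endpoints.
    return min(max(x, lower), upper)


# Hypothetical conditional: mean 0.5, sd 1.0, constrained to the interval [0, 1].
rng = random.Random(0)
draws = [sample_truncated_normal(0.5, 1.0, 0.0, 1.0, rng) for _ in range(1000)]
```

In a full Gibbs sampler a step like this would be repeated for each constrained element of $B_2$, alternating with draws for the remaining parameters; those steps are omitted here because the abstract gives no detail on them.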


Penalized least distance estimator in the multivariate regression model (다변량 선형회귀모형의 벌점화 최소거리추정에 관한 연구)

  • Jungmin Shin;Jongkyeong Kang;Sungwan Bang
    • The Korean Journal of Applied Statistics / v.37 no.1 / pp.1-12 / 2024
  • In many real-world data sets, multiple response variables depend on the same set of explanatory variables. In particular, if several response variables are correlated with each other, simultaneous estimation that accounts for the correlation between response variables may be more effective than a separate analysis of each response variable. In this multivariate regression setting, the least distance estimator (LDE) estimates the regression coefficients simultaneously by minimizing the distance between each training observation and its estimate in a multidimensional Euclidean space. It is also robust to outliers. In this paper, we examine the least distance estimation method in multivariate linear regression analysis and, furthermore, present the penalized least distance estimator (PLDE) for efficient variable selection. We propose the LDE with an adaptive group LASSO penalty term (AGLDE), which can reflect the correlation between response variables in the model and can efficiently select variables according to the importance of the explanatory variables. The validity of the proposed method was confirmed through simulations and real data analysis.
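
A minimal sketch of the objective described above, under stated assumptions: the LDE loss is taken to be the sum over observations of the Euclidean norm of the residual vector, and the group penalty is taken to act on each predictor's row of coefficients across all responses. The function names, the weights, and the tuning parameter `lam` are hypothetical; the paper's exact formulation may differ.

```python
import math


def predict(x, B):
    """Fitted response vector for one observation x (length p) and a p x q matrix B."""
    q = len(B[0])
    return [sum(x[j] * B[j][k] for j in range(len(x))) for k in range(q)]


def lde_loss(X, Y, B):
    """Sum of Euclidean distances between each response vector and its fit.

    Unlike least squares, distances are not squared, so a single large
    residual contributes linearly rather than quadratically.
    """
    total = 0.0
    for x, y in zip(X, Y):
        fitted = predict(x, B)
        total += math.sqrt(sum((yk - fk) ** 2 for yk, fk in zip(y, fitted)))
    return total


def group_lasso_penalty(B, weights, lam):
    """lam * sum_j weights[j] * ||row j of B||_2 (one group per predictor)."""
    return lam * sum(
        w * math.sqrt(sum(b * b for b in row)) for w, row in zip(weights, B)
    )


# Toy data: two observations, two predictors, two responses.
X = [[1.0, 0.0], [0.0, 1.0]]
Y = [[3.0, 4.0], [0.0, 0.0]]
B_zero = [[0.0, 0.0], [0.0, 0.0]]

loss = lde_loss(X, Y, B_zero)  # residual vectors (3, 4) and (0, 0) -> 5.0
penalty = group_lasso_penalty([[3.0, 4.0], [0.0, 0.0]], [1.0, 2.0], 0.5)  # -> 2.5
```

Because a whole row of the coefficient matrix is penalized through one norm, a predictor is either kept for all responses or dropped for all of them, which is what makes the penalty suitable for variable selection.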

Parallelism Test of Slope in Simple Linear Regression Models (회귀모형의 기울기에 대한 품행성 검정)

  • Park, Hyun-Wook;Kim, Dong-Jae
    • Communications for Statistical Applications and Methods / v.16 no.1 / pp.75-83 / 2009
  • Parallelism tests are proposed for the slopes in simple linear regression models. In this paper, we suggest a parametric test using the HSD testing method (Tukey, 1953) and a distribution-free test using the Kruskal-Wallis test (1952) for more than three slopes. A Monte Carlo simulation study is used to compare the power of the proposed methods with that of the Wilks' Lambda multivariate procedure.

Changes in Disc Height as a Prognostic Factor in Patients Undergoing Microscopic Discectomy

  • Myeonggeon Kweon;Koang-Hum Bak;Hyeong-Joong Yi;Kyu-Sun Choi;Myung-Hoon Han;Min-Kyun Na;Hyoung-Joon Chun
    • Journal of Korean Neurosurgical Society / v.67 no.2 / pp.209-216 / 2024
  • Objective: Some patients with disc herniation who underwent discectomy complain of back pain after surgery and are unsatisfied with the surgical results. This study aimed to evaluate the relationship between preoperative disc height (DH), postoperative DH, and pain score 12 months after surgery in patients who underwent microdiscectomy for herniated lumbar disc. Methods: This study enrolled patients who underwent microdiscectomy at a medical center between January 2012 and December 2020. Patients with X-ray or computed tomography and pain score assessment (visual analog scale score) prior to surgery, immediately post-op, and at 1, 6, and 12 months after surgery were included. The DH index was defined as DH/overlying vertebral width. The DH ratio was defined as the postoperative DH/preoperative DH. Simple linear regression and multivariate linear regression analyses were applied to assess the correlation between DHs and leg pain scores 12 months after surgery. Results: A total of 118 patients who underwent microdiscectomy were included. DH decreased up to 12 months after surgery. The DH ratio at 1, 6, and 12 months after discectomy showed a significant positive correlation with the pain scores at 12 months after discectomy (1 month: p=0.045, B=0.52; 6 months: p=0.008, B=0.78; 12 months: p=0.005, B=0.69). Multivariate linear regression analysis revealed that the level of surgery, sex, age, and body mass index had no significant relationship with back pain scores after 12 months. Conclusion: In patients who underwent microdiscectomy, the DH ratios at 1, 6, and 12 months after surgery were prognostic factors for back pain scores at 12 months after surgery. Aggressive discectomy is recommended for lower postoperative DH ratios and visual analog scale scores, leading to improved patient satisfaction.
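
The two definitions in the Methods can be written out as a short worked example. The measurements below are made-up numbers in millimetres, used only to show the arithmetic; the function names are likewise hypothetical.

```python
def dh_index(disc_height, vertebral_width):
    """DH index: disc height divided by the overlying vertebral width."""
    return disc_height / vertebral_width


def dh_ratio(postoperative_dh, preoperative_dh):
    """DH ratio: postoperative disc height divided by preoperative disc height."""
    return postoperative_dh / preoperative_dh


# Hypothetical patient: 10 mm disc height before surgery, 8 mm afterwards,
# overlying vertebral width of 40 mm.
pre_index = dh_index(10.0, 40.0)   # 0.25
post_index = dh_index(8.0, 40.0)   # 0.2
ratio = dh_ratio(8.0, 10.0)        # 0.8
```

A ratio below 1 means the disc lost height after surgery; in this example the disc retains 80% of its preoperative height.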

Multivariate Statistical Analysis Approach to Predict the Reactor Properties and the Product Quality of a Direct Esterification Reactor for PET Synthesis (다변량 통계분석법을 이용한 PET 중합공정 중 직접 에스테르화 반응기의 거동 및 생산제품 예측)

  • Kim Sung Young;Chung Chang Bock;Choi Soo Hyoung;Lee Bomsock
    • Journal of Institute of Control, Robotics and Systems / v.11 no.6 / pp.550-557 / 2005
  • Multivariate statistical analysis methods, namely multiple linear regression (MLR) and partial least squares (PLS), have been applied to predict the reactor properties and the product quality of a direct esterification reactor for polyethylene terephthalate (PET) synthesis. The two methods were applied to a data set that includes the flow rate of water vapor, the flow rate of EG vapor, the concentration of acid end groups of the product, and other operating conditions such as temperature, pressure, reaction time, and feed monomer mole ratio. Their regression and prediction abilities have also been compared. The reliability of the prediction results is critically compared with actual plant data and with the results of another, mathematical-model-based approach. This paper shows that the PLS approach can be used for reasonably accurate prediction of the product quality of a direct esterification reactor in the PET synthesis process.
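
To make the PLS idea concrete, here is a minimal single-response, single-component PLS sketch in plain Python. It is an illustration under simplifying assumptions (one latent component, no scaling, toy data), not the model fitted in the paper. The function name `pls1_one_component` and the data are hypothetical.

```python
import math


def pls1_one_component(X, y):
    """Fit a one-component PLS regression of y on the columns of X.

    Returns (coefficients, intercept) on the original, uncentered scale.
    Assumes at least one predictor has nonzero covariance with y.
    """
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]

    # Weight vector: direction in predictor space with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]

    # Scores: projection of each centered observation onto w.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]

    # Regress the centered response on the scores.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)

    coef = [wj * q for wj in w]
    intercept = y_mean - sum(c * m for c, m in zip(coef, col_means))
    return coef, intercept


# Two perfectly collinear predictors: the MLR normal equations are singular
# here, while the PLS component is still well defined.
X = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
y = [2.0, 4.0, 6.0, 8.0]
coef, intercept = pls1_one_component(X, y)  # coef is approximately [1.0, 1.0]
```

The collinear toy data are chosen to show one reason PLS is often preferred over MLR for process data, where operating variables tend to move together; whether that was the deciding factor in the study above is not stated in the abstract.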

Statistical Matching Techniques Using the Robust Regression Model (로버스트 회귀모형을 이용한 자료결합방법)

  • Jhun, Myoung-Shic;Jung, Ji-Song;Park, Hye-Jin
    • The Korean Journal of Applied Statistics / v.21 no.6 / pp.981-996 / 2008
  • Statistical matching techniques aim to construct a complete data file from different sources. Since the statistical matching method proposed by Rubin (1986) assumes multivariate normality of the data, applying it to data that violate this assumption can be problematic. This research proposes a statistical matching method that uses robust regression as an alternative to linear regression. Furthermore, we carried out a simulation study to compare the performance of the robust regression model and the linear regression model for statistical matching.

A GEE approach for the semiparametric accelerated lifetime model with multivariate interval-censored data

  • Maru Kim;Sangbum Choi
    • Communications for Statistical Applications and Methods / v.30 no.4 / pp.389-402 / 2023
  • Multivariate or clustered failure time data often occur in medical, epidemiological, and socio-economic studies when survival data are collected from several research centers. If the data are periodically observed as in a longitudinal study, survival times are often subject to various types of interval-censoring, creating multivariate interval-censored data. Then, the event times of interest may be correlated among individuals who come from the same cluster. In this article, we propose a unified linear regression method for analyzing multivariate interval-censored data. We consider a semiparametric multivariate accelerated failure time model as a statistical analysis tool and develop a generalized Buckley-James method to make inferences by imputing interval-censored observations with their conditional mean values. Since the study population consists of several heterogeneous clusters, where the subjects in the same cluster may be related, we propose a generalized estimating equations approach to accommodate potential dependence in clusters. Our simulation results confirm that the proposed estimator is robust to misspecification of the working covariance matrix and that statistical efficiency can increase when the working covariance structure is close to the truth. The proposed method is applied to the dataset from a diabetic retinopathy study.

Serum 25-hydroxyvitamin D3 is associated with homocysteine more than with apolipoprotein B

  • Nam-Kyu, Kim;Min-Ah, Jung;Beom-hee, Choi;Nam-Seok, Joo
    • Nutrition Research and Practice / v.16 no.6 / pp.745-754 / 2022
  • BACKGROUND/OBJECTIVES: The incidence of cardiovascular diseases (CVDs) has increased worldwide. Although a low serum vitamin D level is known to be associated with the risk of CVD, the mechanism is not yet well understood. The aim of this study was to determine the relationship of serum 25-hydroxyvitamin D3 (25[OH]D) with homocysteine and apolipoprotein B (ApoB). SUBJECTS/METHODS: Of 777 subjects recruited from one health promotion center for a routine health exam from January 2010 to December 2016, 518 subjects were included in this study. Serum 25(OH)D, serum homocysteine, and other metabolic parameters including ApoB were analyzed. Simple and partial correlations were carried out after adjustments. Simple linear regression analysis was used for precise correlation of parameters. Multivariate regression analysis was performed to determine which factor (serum homocysteine or ApoB) was more related to serum 25(OH)D after adjustments. Finally, logarithms of homocysteine concentrations according to tertiles of serum 25(OH)D were compared. RESULTS: After sex and age adjustments, serum 25(OH)D showed negative correlations with serum homocysteine (r' = -0.114) and ApoB (r' = -0.098). In simple linear regression analysis, serum 25(OH)D showed a significant negative correlation with ApoB (P = 0.035). However, in multivariate regression analysis, serum 25(OH)D was significantly associated with serum homocysteine after adjustments (P = 0.022). In addition, serum homocysteine concentration was significantly higher in the lowest 25(OH)D group (P = 0.046). CONCLUSION: Serum 25(OH)D concentration showed a stronger negative association with serum homocysteine than with ApoB.

An estimator of the mean of the squared functions for a nonparametric regression

  • Park, Chun-Gun
    • Journal of the Korean Data and Information Science Society / v.20 no.3 / pp.577-585 / 2009
  • In nonparametric regression, one problem of long-standing interest is estimating the error variance. In this paper we propose an estimator of the mean of the squared functions, which is the numerator of the signal-to-noise ratio (SNR). To estimate the SNR, the mean of the squared functions must first be estimated. Our focus is on estimating the amplitude, that is, the mean of the squared functions, in a nonparametric regression using a simple linear regression model with the quadratic form of observations as the dependent variable and the function of a lag as the regressor. Our method can be extended to nonparametric regression models with multivariate functions on unequally spaced design points or clustered design points.


A Multivariate Analysis of Korean Professional Players Salary (한국 프로스포츠 선수들의 연봉에 대한 다변량적 분석)

  • Song, Jong-Woo
    • The Korean Journal of Applied Statistics / v.21 no.3 / pp.441-453 / 2008
  • We analyzed Korean professional basketball and baseball players' salaries under the assumption that they depend on personal records and contribution to the team in the previous year. We made extensive use of data visualization tools to check the relationships among the variables, find outliers, and perform model diagnostics. We used multiple linear regression and regression trees to fit the model and cross-validation to find an optimal model. We checked the relationships between variables carefully and chose a set of variables for stepwise regression instead of using all variables. We found that points per game, number of assists, number of free throw successes, and career are important variables for basketball players. For baseball pitchers, career, number of strikeouts per 9 innings, ERA, and number of home runs are important variables. For baseball hitters, career, number of hits, and FA are important variables.