• Title/Summary/Keyword: Normal linear regression model

Search Result 84, Processing Time 0.029 seconds

Bayesian Estimation for the Multiple Regression with Censored Data : Mutivariate Normal Error Terms

  • Yoon, Yong-Hwa
    • Journal of the Korean Data and Information Science Society
    • /
    • v.9 no.2
    • /
    • pp.165-172
    • /
    • 1998
  • This paper considers a linear regression model with censored data where each error term follows a multivariate normal distribution. In this paper we consider the diffuse prior distribution for parameters of the linear regression model. With censored data we derive the full conditional densities for parameters of a multiple regression model in order to obtain the marginal posterior densities of the relevant parameters through the Gibbs Sampler, which was proposed by Geman and Geman(1984) and utilized by Gelfand and Smith(1990) with statistical viewpoint.

  • PDF

Hidden Truncation Normal Regression

  • Kim, Sungsu
    • Communications for Statistical Applications and Methods
    • /
    • v.19 no.6
    • /
    • pp.793-798
    • /
    • 2012
  • In this paper, we propose regression methods based on the likelihood function. We assume Arnold-Beaver Skew Normal(ABSN) errors in a simple linear regression model. It was shown that the novel method performs better with an asymmetric data set compared to the usual regression model with the Gaussian errors. The utility of a novel method is demonstrated through simulation and real data sets.

Bayesian inference for an ordered multiple linear regression with skew normal errors

  • Jeong, Jeongmun;Chung, Younshik
    • Communications for Statistical Applications and Methods
    • /
    • v.27 no.2
    • /
    • pp.189-199
    • /
    • 2020
  • This paper studies a Bayesian ordered multiple linear regression model with skew normal error. It is reasonable that the kind of inherent information available in an applied regression requires some constraints on the coefficients to be estimated. In addition, the assumption of normality of the errors is sometimes not appropriate in the real data. Therefore, to explain such situations more flexibly, we use the skew-normal distribution given by Sahu et al. (The Canadian Journal of Statistics, 31, 129-150, 2003) for error-terms including normal distribution. For Bayesian methodology, the Markov chain Monte Carlo method is employed to resolve complicated integration problems. Also, under the improper priors, the propriety of the associated posterior density is shown. Our Bayesian proposed model is applied to NZAPB's apple data. For model comparison between the skew normal error model and the normal error model, we use the Bayes factor and deviance information criterion given by Spiegelhalter et al. (Journal of the Royal Statistical Society Series B (Statistical Methodology), 64, 583-639, 2002). We also consider the problem of detecting an influential point concerning skewness using Bayes factors. Finally, concluding remarks are discussed.

Normal Mixture Model with General Linear Regressive Restriction: Applied to Microarray Gene Clustering

  • Kim, Seung-Gu
    • Communications for Statistical Applications and Methods
    • /
    • v.14 no.1
    • /
    • pp.205-213
    • /
    • 2007
  • In this paper, the normal mixture model subjected to general linear restriction for component-means based on linear regression is proposed, and its fitting method by EM algorithm and Lagrange multiplier is provided. This model is applied to gene clustering of microarray expression data, which demonstrates it has very good performances for real data set. This model also allows to obtain the clusters that an analyst wants to find out in the fashion that the hypothesis for component-means is represented by the design matrices and the linear restriction matrices.

Robustness of model averaging methods for the violation of standard linear regression assumptions

  • Lee, Yongsu;Song, Juwon
    • Communications for Statistical Applications and Methods
    • /
    • v.28 no.2
    • /
    • pp.189-204
    • /
    • 2021
  • In a regression analysis, a single best model is usually selected among several candidate models. However, it is often useful to combine several candidate models to achieve better performance, especially, in the prediction viewpoint. Model combining methods such as stacking and Bayesian model averaging (BMA) have been suggested from the perspective of averaging candidate models. When the candidate models include a true model, it is expected that BMA generally gives better performance than stacking. On the other hand, when candidate models do not include the true model, it is known that stacking outperforms BMA. Since stacking and BMA approaches have different properties, it is difficult to determine which method is more appropriate under other situations. In particular, it is not easy to find research papers that compare stacking and BMA when regression model assumptions are violated. Therefore, in the paper, we compare the performance among model averaging methods as well as a single best model in the linear regression analysis when standard linear regression assumptions are violated. Simulations were conducted to compare model averaging methods with the linear regression when data include outliers and data do not include them. We also compared them when data include errors from a non-normal distribution. The model averaging methods were applied to the water pollution data, which have a strong multicollinearity among variables. Simulation studies showed that the stacking method tends to give better performance than BMA or standard linear regression analysis (including the stepwise selection method) in the sense of risks (see (3.1)) or prediction error (see (3.2)) when typical linear regression assumptions are violated.

Large Robust Designs for Generalized Linear Model

  • Kim, Young-Il;Kahng, Myung-Wook
    • Journal of the Korean Data and Information Science Society
    • /
    • v.10 no.2
    • /
    • pp.289-298
    • /
    • 1999
  • We consider a minimax approach to make a design robust to many types or uncertainty arising in reality when dealing with non-normal linear models. We try to build a design to protect against the worst case, i.e. to improve the "efficiency" of the worst situation that can happen. In this paper, we especially deal with the generalized linear model. It is a known fact that the generalized linear model is a universal approach, an extension of the normal linear regression model to cover other distributions. Therefore, the optimal design for the generalized linear model has very similar properties as the normal linear model except that it has some special characteristics. Uncertainties regarding the unknown parameters, link function, and the model structure are discussed. We show that the suggested approach is proven to be highly efficient and useful in practice. In the meantime, a computer algorithm is discussed and a conclusion follows.

  • PDF

Variable Selection in Linear Random Effects Models for Normal Data

  • Kim, Hea-Jung
    • Journal of the Korean Statistical Society
    • /
    • v.27 no.4
    • /
    • pp.407-420
    • /
    • 1998
  • This paper is concerned with selecting covariates to be included in building linear random effects models designed to analyze clustered response normal data. It is based on a Bayesian approach, intended to propose and develop a procedure that uses probabilistic considerations for selecting premising subsets of covariates. The approach reformulates the linear random effects model in a hierarchical normal and point mass mixture model by introducing a set of latent variables that will be used to identify subset choices. The hierarchical model is flexible to easily accommodate sign constraints in the number of regression coefficients. Utilizing Gibbs sampler, the appropriate posterior probability of each subset of covariates is obtained. Thus, In this procedure, the most promising subset of covariates can be identified as that with highest posterior probability. The procedure is illustrated through a simulation study.

  • PDF

Least absolute deviation estimator based consistent model selection in regression

  • Shende, K.S.;Kashid, D.N.
    • Communications for Statistical Applications and Methods
    • /
    • v.26 no.3
    • /
    • pp.273-293
    • /
    • 2019
  • We consider the problem of model selection in multiple linear regression with outliers and non-normal error distributions. In this article, the robust model selection criterion is proposed based on the robust estimation method with the least absolute deviation (LAD). The proposed criterion is shown to be consistent. We suggest proposed criterion based algorithms that are suitable for a large number of predictors in the model. These algorithms select only relevant predictor variables with probability one for large sample sizes. An exhaustive simulation study shows that the criterion performs well. However, the proposed criterion is applied to a real data set to examine its applicability. The simulation results show the proficiency of algorithms in the presence of outliers, non-normal distribution, and multicollinearity.

Bayesian Outlier Detection in Regression Model

  • Younshik Chung;Kim, Hyungsoon
    • Journal of the Korean Statistical Society
    • /
    • v.28 no.3
    • /
    • pp.311-324
    • /
    • 1999
  • The problem of 'outliers', observations which look suspicious in some way, has long been one of the most concern in the statistical structure to experimenters and data analysts. We propose a model for an outlier problem and also analyze it in linear regression model using a Bayesian approach. Then we use the mean-shift model and SSVS(George and McCulloch, 1993)'s idea which is based on the data augmentation method. The advantage of proposed method is to find a subset of data which is most suspicious in the given model by the posterior probability. The MCMC method(Gibbs sampler) can be used to overcome the complicated Bayesian computation. Finally, a proposed method is applied to a simulated data and a real data.

  • PDF

Influence diagnostics for skew-t censored linear regression models

  • Marcos S Oliveira;Daniela CR Oliveira;Victor H Lachos
    • Communications for Statistical Applications and Methods
    • /
    • v.30 no.6
    • /
    • pp.605-629
    • /
    • 2023
  • This paper proposes some diagnostics procedures for the skew-t linear regression model with censored response. The skew-t distribution is an attractive family of asymmetrical heavy-tailed densities that includes the normal, skew-normal and student's-t distributions as special cases. Inspired by the power and wide applicability of the EM-type algorithm, local and global influence analysis, based on the conditional expectation of the complete-data log-likelihood function are developed, following Zhu and Lee's approach. For the local influence analysis, four specific perturbation schemes are discussed. Two real data sets, from education and economics, which are right and left censoring, respectively, are analyzed in order to illustrate the usefulness of the proposed methodology.