• Title/Summary/Keyword: Normal Error Regression Model

Search Result 39, Processing Time 0.037 seconds

Bayesian Estimation for the Multiple Regression with Censored Data : Mutivariate Normal Error Terms

  • Yoon, Yong-Hwa
    • Journal of the Korean Data and Information Science Society
    • /
    • v.9 no.2
    • /
    • pp.165-172
    • /
    • 1998
  • This paper considers a linear regression model with censored data where each error term follows a multivariate normal distribution. In this paper we consider the diffuse prior distribution for parameters of the linear regression model. With censored data we derive the full conditional densities for parameters of a multiple regression model in order to obtain the marginal posterior densities of the relevant parameters through the Gibbs Sampler, which was proposed by Geman and Geman(1984) and utilized by Gelfand and Smith(1990) with statistical viewpoint.

  • PDF

Bayesian inference for an ordered multiple linear regression with skew normal errors

  • Jeong, Jeongmun;Chung, Younshik
    • Communications for Statistical Applications and Methods
    • /
    • v.27 no.2
    • /
    • pp.189-199
    • /
    • 2020
  • This paper studies a Bayesian ordered multiple linear regression model with skew normal error. It is reasonable that the kind of inherent information available in an applied regression requires some constraints on the coefficients to be estimated. In addition, the assumption of normality of the errors is sometimes not appropriate in the real data. Therefore, to explain such situations more flexibly, we use the skew-normal distribution given by Sahu et al. (The Canadian Journal of Statistics, 31, 129-150, 2003) for error-terms including normal distribution. For Bayesian methodology, the Markov chain Monte Carlo method is employed to resolve complicated integration problems. Also, under the improper priors, the propriety of the associated posterior density is shown. Our Bayesian proposed model is applied to NZAPB's apple data. For model comparison between the skew normal error model and the normal error model, we use the Bayes factor and deviance information criterion given by Spiegelhalter et al. (Journal of the Royal Statistical Society Series B (Statistical Methodology), 64, 583-639, 2002). We also consider the problem of detecting an influential point concerning skewness using Bayes factors. Finally, concluding remarks are discussed.

A Bayesian Approach for Accelerated Failure Time Model with Skewed Normal Error

  • Kim, Chansoo
    • Communications for Statistical Applications and Methods
    • /
    • v.10 no.2
    • /
    • pp.268-275
    • /
    • 2003
  • We consider the Bayesian accelerated failure time model. The error distribution is assigned a skewed normal distribution which is including normal distribution. For noninformative priors of regression coefficients, we show the propriety of posterior distribution. A Markov Chain Monte Carlo algorithm(i.e., Gibbs Sampler) is used to obtain a predictive distribution for a future observation and Bayes estimates of regression coefficients.

Measurement Error Model with Skewed Normal Distribution (왜도정규분포 기반의 측정오차모형)

  • Heo, Tae-Young;Choi, Jungsoon;Park, Man Sik
    • The Korean Journal of Applied Statistics
    • /
    • v.26 no.6
    • /
    • pp.953-958
    • /
    • 2013
  • This study suggests a measurement error model based on skewed normal distribution instead of normal distribution to identify slope parameter properties in a simple liner regression model. We prove that the slope parameter in a simple linear regression model is underestimated.

Further Applications of Johnson's SU-normal Distribution to Various Regression Models

  • Choi, Pilsun;Min, In-Sik
    • Communications for Statistical Applications and Methods
    • /
    • v.15 no.2
    • /
    • pp.161-171
    • /
    • 2008
  • This study discusses Johnson's $S_U$-normal distribution capturing a wide range of non-normality in various regression models. We provide the likelihood inference using Johnson's $S_U$-normal distribution, and propose a likelihood ratio (LR) test for normality. We also apply the $S_U$-normal distribution to the binary and censored regression models. Monte Carlo simulations are used to show that the LR test using the $S_U$-normal distribution can be served as a model specification test for normal error distribution, and that the $S_U$-normal maximum likelihood (ML) estimators tend to yield more reliable marginal effect estimates in the binary and censored model when the error distributions are non-normal.

A comparative study of the Gini coefficient estimators based on the regression approach

  • Mirzaei, Shahryar;Borzadaran, Gholam Reza Mohtashami;Amini, Mohammad;Jabbari, Hadi
    • Communications for Statistical Applications and Methods
    • /
    • v.24 no.4
    • /
    • pp.339-351
    • /
    • 2017
  • Resampling approaches were the first techniques employed to compute a variance for the Gini coefficient; however, many authors have shown that an analysis of the Gini coefficient and its corresponding variance can be obtained from a regression model. Despite the simplicity of the regression approach method to compute a standard error for the Gini coefficient, the use of the proposed regression model has been challenging in economics. Therefore in this paper, we focus on a comparative study among the regression approach and resampling techniques. The regression method is shown to overestimate the standard error of the Gini index. The simulations show that the Gini estimator based on the modified regression model is also consistent and asymptotically normal with less divergence from normal distribution than other resampling techniques.

A Study on Error Detection Algorithm of COD Measurement Machine

  • Choi, Hyun-Seok;Song, Gyu-Moon;Kim, Tae-Yoon
    • Journal of the Korean Data and Information Science Society
    • /
    • v.18 no.4
    • /
    • pp.847-857
    • /
    • 2007
  • This paper provides a statistical algorithm which detects COD (chemical oxygen demand) measurement machine error on real-time. For this we propose to use regression model fitting and check its validity against the current observations. The main idea is that the normal regression relation between COD measurement and other parameters inside the machine will be violated when the machine is out of order.

  • PDF

A Study on Improving the predict accuracy rate of Hybrid Model Technique Using Error Pattern Modeling : Using Logistic Regression and Discriminant Analysis

  • Cho, Yong-Jun;Hur, Joon
    • Journal of the Korean Data and Information Science Society
    • /
    • v.17 no.2
    • /
    • pp.269-278
    • /
    • 2006
  • This paper presents the new hybrid data mining technique using error pattern, modeling of improving classification accuracy. The proposed method improves classification accuracy by combining two different supervised learning methods. The main algorithm generates error pattern modeling between the two supervised learning methods(ex: Neural Networks, Decision Tree, Logistic Regression and so on.) The Proposed modeling method has been applied to the simulation of 10,000 data sets generated by Normal and exponential random distribution. The simulation results show that the performance of proposed method is superior to the existing methods like Logistic regression and Discriminant analysis.

  • PDF

A Study on Stochastic Estimation of Monthly Runoff by Multiple Regression Analysis (다중회귀분석에 의한 하천 월 유출량의 추계학적 추정에 관한 연구)

  • 김태철;정하우
    • Magazine of the Korean Society of Agricultural Engineers
    • /
    • v.22 no.3
    • /
    • pp.75-87
    • /
    • 1980
  • Most hydro]ogic phenomena are the complex and organic products of multiple causations like climatic and hydro-geological factors. A certain significant correlation on the run-off in river basin would be expected and foreseen in advance, and the effect of each these causual and associated factors (independant variables; present-month rainfall, previous-month run-off, evapotranspiration and relative humidity etc.) upon present-month run-off(dependent variable) may be determined by multiple regression analysis. Functions between independant and dependant variables should be treated repeatedly until satisfactory and optimal combination of independant variables can be obtained. Reliability of the estimated function should be tested according to the result of statistical criterion such as analysis of variance, coefficient of determination and significance-test of regression coefficients before first estimated multiple regression model in historical sequence is determined. But some error between observed and estimated run-off is still there. The error arises because the model used is an inadequate description of the system and because the data constituting the record represent only a sample from a population of monthly discharge observation, so that estimates of model parameter will be subject to sampling errors. Since this error which is a deviation from multiple regression plane cannot be explained by first estimated multiple regression equation, it can be considered as a random error governed by law of chance in nature. This unexplained variance by multiple regression equation can be solved by stochastic approach, that is, random error can be stochastically simulated by multiplying random normal variate to standard error of estimate. Finally hybrid model on estimation of monthly run-off in nonhistorical sequence can be determined by combining the determistic component of multiple regression equation and the stochastic component of random errors. Monthly run-off in Naju station in Yong-San river basin is estimated by multiple regression model and hybrid model. And some comparisons between observed and estimated run-off and between multiple regression model and already-existing estimation methods such as Gajiyama formula, tank model and Thomas-Fiering model are done. The results are as follows. (1) The optimal function to estimate monthly run-off in historical sequence is multiple linear regression equation in overall-month unit, that is; Qn=0.788Pn+0.130Qn-1-0.273En-0.1 About 85% of total variance of monthly runoff can be explained by multiple linear regression equation and its coefficient of determination (R2) is 0.843. This means we can estimate monthly runoff in historical sequence highly significantly with short data of observation by above mentioned equation. (2) The optimal function to estimate monthly runoff in nonhistorical sequence is hybrid model combined with multiple linear regression equation in overall-month unit and stochastic component, that is; Qn=0. 788Pn+0. l30Qn-1-0. 273En-0. 10+Sy.t The rest 15% of unexplained variance of monthly runoff can be explained by addition of stochastic process and a bit more reliable results of statistical characteristics of monthly runoff in non-historical sequence are derived. This estimated monthly runoff in non-historical sequence shows up the extraordinary value (maximum, minimum value) which is not appeared in the observed runoff as a random component. (3) "Frequency best fit coefficient" (R2f) of multiple linear regression equation is 0.847 which is the same value as Gaijyama's one. This implies that multiple linear regression equation and Gajiyama formula are theoretically rather reasonable functions.

  • PDF

Least absolute deviation estimator based consistent model selection in regression

  • Shende, K.S.;Kashid, D.N.
    • Communications for Statistical Applications and Methods
    • /
    • v.26 no.3
    • /
    • pp.273-293
    • /
    • 2019
  • We consider the problem of model selection in multiple linear regression with outliers and non-normal error distributions. In this article, the robust model selection criterion is proposed based on the robust estimation method with the least absolute deviation (LAD). The proposed criterion is shown to be consistent. We suggest proposed criterion based algorithms that are suitable for a large number of predictors in the model. These algorithms select only relevant predictor variables with probability one for large sample sizes. An exhaustive simulation study shows that the criterion performs well. However, the proposed criterion is applied to a real data set to examine its applicability. The simulation results show the proficiency of algorithms in the presence of outliers, non-normal distribution, and multicollinearity.