• Title/Summary/Keyword: linear regression model

Analysis of health-related quality of life using Beta regression (베타회귀분석 방법을 이용한 건강 관련 삶의 질 자료 분석)

  • Jang, Eun Jin
    • Journal of the Korean Data and Information Science Society / v.28 no.3 / pp.547-557 / 2017
  • Health-related quality of life data are commonly skewed and bounded, with a spike at perfect health status, and their variance tends to be heteroscedastic. In this study, we developed prediction models for EQ-5D using a linear regression model, a beta regression model, and an extended beta regression model with mean and precision submodels, and compared their predictive accuracy. The extended beta regression model allows skewness and covariate-related differences in dispersion to be modeled. Although the extended beta regression model had higher prediction accuracy than the linear regression model, the overlapping confidence intervals suggested that it was not significantly superior to the linear regression model. However, the extended beta regression model could explain the heteroscedasticity and predict within the bounded range. Therefore, the extended beta regression model is appropriate for fitting health-related quality of life data such as EQ-5D.
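
The mean and precision submodels mentioned in this abstract can be illustrated with the usual mean-precision parameterization of the beta distribution, in which y ~ Beta(μφ, (1 − μ)φ) and Var(y) = μ(1 − μ)/(1 + φ). The sketch below is not taken from the paper: the logit link for the mean, the log link for the precision, and all function and variable names are assumptions made for illustration.

```python
import math


def beta_mean_precision(eta_mean, eta_precision):
    """Map two linear predictors to beta-distribution quantities.

    Assumed links (not stated in the abstract): logit for the mean,
    log for the precision.
    """
    mu = 1.0 / (1.0 + math.exp(-eta_mean))   # mean, always inside (0, 1)
    phi = math.exp(eta_precision)            # precision, always positive
    shape_a = mu * phi                       # first beta shape parameter
    shape_b = (1.0 - mu) * phi               # second beta shape parameter
    variance = mu * (1.0 - mu) / (1.0 + phi) # shrinks as precision grows
    return mu, phi, shape_a, shape_b, variance


# Hypothetical linear predictors for one subject.
mu, phi, a, b, var = beta_mean_precision(0.0, math.log(9.0))
```

Because the variance depends on both μ and φ, letting φ vary with covariates is what allows such a model to represent heteroscedasticity while keeping predictions inside the bounded range.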

Partially linear support vector orthogonal quantile regression with measurement errors

  • Hwang, Changha
    • Journal of the Korean Data and Information Science Society / v.26 no.1 / pp.209-216 / 2015
  • Quantile regression models with covariate measurement errors have received a great deal of attention in both the theoretical and the applied statistical literature. Much effort has been devoted to developing effective estimation methods for such quantile regression models. In this paper we propose a partially linear support vector orthogonal quantile regression model for data with covariate measurement errors. We also provide a generalized approximate cross-validation method for choosing the hyperparameters and the ratios of the error variances, which affect the performance of the proposed model. The proposed model is evaluated through simulations.
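
As background for this abstract, quantile regression in general rests on the check (pinball) loss ρ_τ(u) = u(τ − 1[u < 0]). The sketch below shows only that loss and how minimizing it recovers a sample quantile. It does not implement the paper's orthogonal support vector method or its measurement-error correction, and all names are illustrative.

```python
def pinball_loss(residual, tau):
    """Check loss: tau * u for u >= 0, (tau - 1) * u for u < 0."""
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (tau - indicator)


def quantile_by_loss(values, tau):
    """Return the observed value that minimizes the total check loss."""
    def total_loss(candidate):
        return sum(pinball_loss(v - candidate, tau) for v in values)
    return min(values, key=total_loss)


data = [1.0, 2.0, 3.0, 4.0, 100.0]
median = quantile_by_loss(data, 0.5)  # unaffected by the outlier 100.0
```

With τ = 0.5 the loss is proportional to absolute error, which is why the minimizer is the median rather than the mean.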

Local linear regression analysis for interval-valued data

  • Jang, Jungteak;Kang, Kee-Hoon
    • Communications for Statistical Applications and Methods / v.27 no.3 / pp.365-376 / 2020
  • Interval-valued data, a type of symbolic data, record each observation as an interval rather than a single value. Such data often arise when large databases are aggregated into a form that is easy to manage. Various regression methods for interval-valued data have been proposed relatively recently. In this paper, we introduce a nonparametric regression model using a kernel function and a nonlinear regression model for interval-valued data. We also propose applying the local linear regression model, one of the nonparametric methods, to interval-valued data. Simulations based on several distributions of the center point and the range are conducted using each of the methods presented in this paper. Under various conditions, the proposed local linear estimator performs better than the other methods.
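
A minimal sketch of a local linear estimator applied to interval-valued responses follows. It assumes a center and half-range representation of each interval, a Gaussian kernel, and a fixed bandwidth `h`. These choices and all names are assumptions for illustration, since the abstract does not specify the paper's exact estimator.

```python
import math


def gaussian_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def local_linear(x, y, x0, h):
    """Local linear estimate at x0: intercept of a kernel-weighted line."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for xi, yi in zip(x, y):
        d = xi - x0
        w = gaussian_kernel(d / h)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    # Closed-form solution of the 2x2 weighted least-squares system.
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)


def predict_interval(x, lower, upper, x0, h):
    """Smooth centers and half-ranges separately, then rebuild the interval."""
    centers = [(lo + up) / 2.0 for lo, up in zip(lower, upper)]
    half_ranges = [(up - lo) / 2.0 for lo, up in zip(lower, upper)]
    c_hat = local_linear(x, centers, x0, h)
    r_hat = local_linear(x, half_ranges, x0, h)
    return c_hat - r_hat, c_hat + r_hat


# Hypothetical data: intervals [2x, 2x + 2] observed at x = 0..4.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
lower = [2.0 * v for v in x]
upper = [2.0 * v + 2.0 for v in x]
interval = predict_interval(x, lower, upper, 2.5, 1.0)
```

Smoothing the half-range separately does not by itself guarantee a non-negative fitted range. A practical implementation would need to handle that case.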

A Study on Stochastic Estimation of Monthly Runoff by Multiple Regression Analysis (다중회귀분석에 의한 하천 월 유출량의 추계학적 추정에 관한 연구)

  • 김태철;정하우
    • Magazine of the Korean Society of Agricultural Engineers / v.22 no.3 / pp.75-87 / 1980
  • Most hydrologic phenomena are complex, organic products of multiple causes such as climatic and hydro-geological factors. A significant correlation with run-off in a river basin can be expected, and the effect of each of these causal and associated factors (independent variables: present-month rainfall, previous-month run-off, evapotranspiration, relative humidity, etc.) on present-month run-off (the dependent variable) may be determined by multiple regression analysis. Functions relating the independent and dependent variables should be fitted repeatedly until a satisfactory, optimal combination of independent variables is obtained. The reliability of the estimated function should be tested against statistical criteria such as analysis of variance, the coefficient of determination, and significance tests of the regression coefficients before the first estimated multiple regression model in historical sequence is determined. Some error between observed and estimated run-off nevertheless remains. The error arises because the model is an inadequate description of the system and because the data constituting the record represent only a sample from a population of monthly discharge observations, so that estimates of the model parameters are subject to sampling error. Since this error, a deviation from the multiple regression plane, cannot be explained by the first estimated multiple regression equation, it can be considered a random error governed by the laws of chance. The variance left unexplained by the multiple regression equation can be handled by a stochastic approach: the random error can be simulated stochastically by multiplying a random normal variate by the standard error of estimate. Finally, a hybrid model for estimating monthly run-off in nonhistorical sequence can be obtained by combining the deterministic component of the multiple regression equation with the stochastic component of random errors.
Monthly run-off at Naju station in the Yong-San river basin is estimated by the multiple regression model and the hybrid model. Observed and estimated run-off are compared, and the multiple regression model is compared with existing estimation methods such as the Gajiyama formula, the tank model and the Thomas-Fiering model. The results are as follows. (1) The optimal function for estimating monthly run-off in historical sequence is the multiple linear regression equation in overall-month unit, that is, $Q_n = 0.788P_n + 0.130Q_{n-1} - 0.273E_n - 0.10$. About 85% of the total variance of monthly run-off can be explained by the multiple linear regression equation, and its coefficient of determination ($R^2$) is 0.843. This means that monthly run-off in historical sequence can be estimated with high significance from a short observation record using the above equation. (2) The optimal function for estimating monthly run-off in nonhistorical sequence is the hybrid model combining the multiple linear regression equation in overall-month unit with a stochastic component, that is, $Q_n = 0.788P_n + 0.130Q_{n-1} - 0.273E_n - 0.10 + S_y \cdot t$. The remaining 15% of unexplained variance of monthly run-off can be accounted for by adding the stochastic process, and somewhat more reliable statistical characteristics of monthly run-off in nonhistorical sequence are derived. The estimated monthly run-off in nonhistorical sequence shows extreme values (maximum and minimum) that do not appear in the observed run-off, as a result of the random component. (3) The "frequency best fit coefficient" ($R^2_f$) of the multiple linear regression equation is 0.847, the same value as that of the Gajiyama formula. This implies that the multiple linear regression equation and the Gajiyama formula are theoretically rather reasonable functions.
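
The two equations reported in this abstract can be written as a short simulation sketch. The coefficients are those quoted in the abstract. The reading of P_n as present-month rainfall, Q_{n-1} as previous-month run-off, E_n as evapotranspiration, S_y as the standard error of estimate and t as a standard normal variate follows the abstract's description. The input values, the value of `s_y`, the units and all function names are hypothetical.

```python
import random


def deterministic_runoff(p_n, q_prev, e_n):
    """Multiple linear regression equation quoted in the abstract."""
    return 0.788 * p_n + 0.130 * q_prev - 0.273 * e_n - 0.10


def simulate_runoff(rainfall, evapotranspiration, q_initial, s_y, seed=0):
    """Hybrid model: regression component plus s_y times a normal variate."""
    rng = random.Random(seed)
    q_prev = q_initial
    series = []
    for p_n, e_n in zip(rainfall, evapotranspiration):
        t = rng.gauss(0.0, 1.0)  # random standard normal variate
        q_n = deterministic_runoff(p_n, q_prev, e_n) + s_y * t
        series.append(q_n)
        q_prev = q_n             # feeds next month's Q_{n-1} term
    return series


# Hypothetical monthly inputs; s_y is NOT given in the abstract.
rain = [100.0, 50.0, 20.0]
evap = [40.0, 30.0, 25.0]
historical = simulate_runoff(rain, evap, q_initial=50.0, s_y=0.0)
synthetic = simulate_runoff(rain, evap, q_initial=50.0, s_y=5.0, seed=1)
```

Setting `s_y` to zero reduces the hybrid model to the deterministic regression equation, which makes the contribution of the stochastic term easy to isolate.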

A Random Fuzzy Linear Regression Model

  • Changhyuck Oh
    • Communications for Statistical Applications and Methods / v.5 no.2 / pp.287-295 / 1998
  • A random fuzzy linear regression model is introduced, which includes both randomness and fuzziness. Estimators for the parameters are suggested, which are derived mainly using properties of randomness.

Evaluation of the heat island in transition zone of three cities in Kyungpook, Korea (추이대(推移帶)를 중심으로 한 경상북도 3개 도시의 열섬 평가)

  • Park, In Hwan;Jang, Gab Sue;Kim, Jong Yong
    • Journal of Environmental Impact Assessment / v.8 no.2 / pp.73-82 / 1999
  • This study analyzed the relationship between the NDVI (Normalized Difference Vegetation Index) and the urban heat island in three cities, Daegu, Kyungju, and Pohang, to understand the degree of nature conservation, concentrating on their transition zones. Daegu is the third city in Korea and is densely populated. Kyungju is a traditional city with well-preserved nature. Pohang is an industrial city that has characteristics of both Daegu and Kyungju. Landsat TM data from May 17, 1997 were used for the analysis of the heat island. Four theoretical models are available for estimating surface temperature from TM data: the two-point linear model, the linear regression model, the quadratic regression model, and the cubic regression model. In this study, the linear regression model was used to analyze the urban heat island. In the resulting images, the transition zone of Daegu was more heavily urbanized than those of the other two cities. The analysis of the relationship between NDVI and surface temperature used in this study is regarded as an effective methodology for urban-environmental detection from satellite imagery.

Alternative Derivation of Stepwise Multivariate Linear Regression (段階的 多變量 線型回歸에 관하여)

  • 申敏雄;金周成
    • Journal of the Korean Statistical Society / v.7 no.2 / pp.105-108 / 1978
  • Freund, Vail, and Ross; Goldberger and Jochems; and Goldberger have given some results for the stepwise estimation of the parameters of a univariate regression model. D.G. Kabe gave similar results for a multivariate linear regression model. We give here an alternative derivation of some of the results derived by D.G. Kabe.

Design of the optimal inputs for parameter estimation in linear dynamic systems (선형계통의 파라미터 추정을 위한 최적 입력의 설계)

  • 양흥석;이석원;정찬수
    • 제어로봇시스템학회:학술대회논문집 / 1986.10a / pp.73-77 / 1986
  • The optimal input design problem for a linear regression model with constrained output variance is considered. It is shown that the optimal input signal for the linear regression model can also be realized as an ARMA process. Monte-Carlo simulation results show that the optimal stochastic input leads to comparatively better estimation accuracy than a white input signal.

Development of the Index for Estimating the Arc Status in the Short-circuiting Transfer Region of GMA Welding (GMA용접의 단락이행영역에 있어서 아크 상태 평가를 위한 모델 개발)

  • 강문진;이세헌;엄기원
    • Journal of Welding and Joining / v.17 no.4 / pp.85-92 / 1999
  • In GMAW, spatter is generated because of variation in the arc state. If the arc state can be assessed quantitatively, a control method that reduces spatter can be developed. This study attempted to develop an optimal model for estimating the arc state quantitatively. To do this, the generated spatter was captured under limited welding conditions, and the waveforms of the arc voltage and welding current were collected. From the collected waveforms, the waveform factors and their standard deviations were computed, and linear and non-linear regression models built from these factors and their standard deviations are proposed to estimate the arc state. The performance of the proposed models was then tested. The results are as follows. Correlation analysis between the factors and the amount of generated spatter showed that the standard deviations of the waveform factors have larger multiple regression coefficients than the waveform factors themselves. Because the correlation coefficients between T and $T_{a}$, and between s[T] and s[$T_{a}$], were nearly one, these factors were found to have the same effect on spatter generation. In the regression models for estimating the arc state, the linear and non-linear models were found to consist of similar factors. In addition, the linear regression model was assessed to be the optimal model for estimating the arc state because the variance of the data was narrow and its multiple regression coefficient was the highest among the models. However, under welding conditions in which the amount of generated spatter was small, the non-linear regression model estimated spatter generation better than the linear model.

Equivalence in Alpha-Level Linear Regression

  • Yoon, Jin-Hee;Jung, Hye-Young;Choi, Seung-Hoe
    • Communications for Statistical Applications and Methods / v.17 no.4 / pp.611-624 / 2010
  • Several methods have been suggested for constructing a fuzzy relationship between fuzzy independent and dependent variables. This paper reviews the method of minimizing the square of the difference between an observed and a predicted fuzzy number in an ${\alpha}$-level linear regression model. We introduce a new distance between fuzzy numbers based on a mode, a core point and a radius of an ${\alpha}$-level set of a fuzzy number, and construct the fuzzy regression model using the proposed fuzzy distance. We also investigate a sufficient condition for an equivalence in the ${\alpha}$-level regression model.
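
For readers unfamiliar with α-level sets, the sketch below computes the α-level set of a triangular fuzzy number (l, m, r), which is the interval [l + α(m − l), r − α(r − m)], along with that interval's midpoint and half-width. The triangular shape is an assumption, and the midpoint and half-width are illustrative stand-ins that may not match the paper's definitions of core point and radius. The paper's proposed distance is not reproduced here.

```python
def alpha_level_set(fuzzy, alpha):
    """Alpha-level set of a triangular fuzzy number (left, mode, right)."""
    left, mode, right = fuzzy
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    lower = left + alpha * (mode - left)
    upper = right - alpha * (right - mode)
    return lower, upper


def midpoint_and_half_width(interval):
    """Summarize an interval by its midpoint and half-width."""
    lower, upper = interval
    return (lower + upper) / 2.0, (upper - lower) / 2.0


# Hypothetical fuzzy number with support [1, 4] and mode 2.
cut = alpha_level_set((1.0, 2.0, 4.0), 0.5)
mid, half = midpoint_and_half_width(cut)
```

At α = 1 the level set collapses to the mode, and at α = 0 it equals the full support, so the family of α-level sets describes the whole fuzzy number.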