• Title/Abstract/Keywords: Regression model.

Search results: 9,515 items (processing time: 0.04 s)

Evaluation of Regression Models with Various Criteria and Optimization Methods for Pollutant Load Estimations

  • 김종건;임경재;박윤식
    • Proceedings of the Korea Water Resources Association Conference / Korea Water Resources Association 2018 Conference / p. 448 / 2018
  • In this study, two regression models (Load ESTimator (LOADEST) and an eight-parameter model) were evaluated for estimating instantaneous pollutant loads under various criteria and optimization methods. The results show that LOADEST, which is commonly used to interpolate pollutant loads, does not necessarily provide the best results with its automatically selected regression model. This suggests that the various regression models in LOADEST should be considered in order to find the best solution for the characteristics of the watershed in question. The recently developed eight-parameter model, integrated with a Genetic Algorithm (GA) and the Gradient Descent Method (GDM), was also compared with LOADEST. The eight-parameter model performed better than LOADEST, but it behaved differently in calibration and validation. The eight-parameter model with GDM reproduced the nitrogen loads properly outside the calibration period (validation). Furthermore, the accuracy and precision of the model estimates were evaluated using various criteria (e.g., $R^2$ and the gradient and constant of the linear regression line). The results showed higher precision in LOADEST, with $R^2$ values close to 1.0, and better accuracy in the eight-parameter model with GDM, with constants (of the linear regression line) close to 0.0. Based on these findings, we recommend that users evaluate regression models under various criteria and calibration methods to obtain more accurate and precise pollutant load estimates.
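The criteria named in this abstract ($R^2$, and the gradient and constant of the linear regression line between observed and estimated loads) can be computed in a few lines. The following is a minimal sketch, not the authors' evaluation code: the function name `agreement_criteria` and the sample loads are hypothetical, chosen only to show that an $R^2$ of 1.0 (precision) can coexist with a nonzero constant (bias).

```python
from statistics import mean

def agreement_criteria(observed, estimated):
    """Fit estimated = slope * observed + intercept by least squares.

    Returns (r_squared, slope, intercept).
    """
    if len(observed) != len(estimated) or len(observed) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mx, my = mean(observed), mean(estimated)
    sxx = sum((x - mx) ** 2 for x in observed)
    syy = sum((y - my) ** 2 for y in estimated)
    sxy = sum((x - mx) * (y - my) for x, y in zip(observed, estimated))
    if sxx == 0 or syy == 0:
        raise ValueError("series must not be constant")
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)  # squared Pearson correlation
    return r_squared, slope, intercept

# Hypothetical loads in arbitrary units, made up for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
estimated = [1.5, 2.5, 3.5, 4.5, 5.5]  # perfectly correlated, biased by +0.5

r2, slope, intercept = agreement_criteria(observed, estimated)
print(r2, slope, intercept)  # 1.0 1.0 0.5
```

Here the estimates track the observations exactly in shape, so $R^2$ and the gradient are both 1.0, yet the constant of 0.5 reveals a systematic offset. This is why the abstract treats the criteria as complementary.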

Regression Quantile Estimators of a Nonlinear Time Series Regression Model

  • 김태수;허선;김해경
    • Proceedings of the Korean Statistical Society Conference / Korean Statistical Society 2000 Autumn Conference Proceedings / pp. 13-15 / 2000
  • In this paper, we deal with the asymptotic properties of the regression quantile estimators in the nonlinear time series regression model. For the sinusoidal model, which frequently appears in time series analysis, we study the strong consistency and asymptotic normality of the regression quantile estimators.

Nonparametric Estimation in Regression Model

  • Han, Sang Moon
    • Communications for Statistical Applications and Methods / Vol. 8, No. 1 / pp. 15-27 / 2001
  • A proposal is made for constructing a nonparametric estimator of the slope parameters in a regression model under symmetric error distributions. The estimator combines Johns' idea for estimating the center of a symmetric distribution with the ideas of regression quantiles and the regression trimmed mean. This nonparametric estimator and some other L-estimators are studied by Monte Carlo simulation.

Censored varying coefficient regression model using Buckley-James method

  • Shim, Jooyong;Seok, Kyungha
    • Journal of the Korean Data and Information Science Society / Vol. 28, No. 5 / pp. 1167-1177 / 2017
  • Censored regression using the pseudo-response variable proposed by Buckley and James is one of the best-known models. Recently, the varying coefficient regression model has received a great deal of attention as an important modeling tool. In this paper, we propose a censored varying coefficient regression model using the Buckley-James method for situations where the regression coefficients are not constant but change with the smoothing variables. Using the formulation of the least squares support vector machine (LS-SVM), the coefficient estimators of the proposed model can be obtained easily from simple linear equations. Furthermore, a generalized cross validation function can be easily derived. We evaluate the proposed method and demonstrate its adequacy on simulated and real data sets.

Prediction of extreme PM2.5 concentrations via extreme quantile regression

  • Lee, SangHyuk;Park, Seoncheol;Lim, Yaeji
    • Communications for Statistical Applications and Methods / Vol. 29, No. 3 / pp. 319-331 / 2022
  • In this paper, we develop a new statistical model to forecast the PM2.5 level in Seoul, South Korea. The proposed model is based on the extreme quantile regression model with lasso penalty. Various meteorological variables and air pollution variables are considered as predictors in the regression model, and the lasso quantile regression performs variable selection and solves the multicollinearity problem. The final prediction model is obtained by combining various extreme lasso quantile regression estimators and we construct a binary classifier based on the model. Prediction performance is evaluated through the statistical measures of the performance of a binary classification test. We observe that the proposed method works better compared to the other classification methods, and predicts 'very bad' cases of the PM2.5 level well.
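As a rough illustration of the lasso quantile regression mentioned in this abstract, the objective usually associated with that name is the mean check (pinball) loss plus an L1 penalty on the coefficients. The sketch below only evaluates that objective for given coefficients. It is not the authors' estimator, it does not cover the extreme-quantile extrapolation or the combination of estimators, and the names `pinball_loss` and `lasso_quantile_objective` and all numbers are hypothetical.

```python
def pinball_loss(residual, tau):
    """Check loss rho_tau(u) = u * (tau - 1[u < 0]) for a quantile level tau."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def lasso_quantile_objective(X, y, beta, intercept, tau, lam):
    """Mean pinball loss plus lam times the L1 norm of the slope coefficients."""
    total = 0.0
    for row, target in zip(X, y):
        prediction = intercept + sum(b * x for b, x in zip(beta, row))
        total += pinball_loss(target - prediction, tau)
    return total / len(y) + lam * sum(abs(b) for b in beta)

# Hypothetical one-predictor example at a high quantile level.
X = [[1.0], [2.0]]
y = [2.0, 1.0]
value = lasso_quantile_objective(X, y, beta=[0.5], intercept=0.0, tau=0.9, lam=0.1)
```

At `tau = 0.9` an under-prediction costs nine times as much as an over-prediction of the same size, which pulls the fit toward the upper tail. The L1 term shrinks some coefficients to exactly zero when the objective is minimized, which is how variable selection arises.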

Fuzzy Linear Regression with the Weakest t-norm

  • Lee, Sung-Ho;Kim, Kyung-Moo
    • Journal of the Korean Data and Information Science Society / Vol. 9, No. 2 / pp. 105-111 / 1998
  • In this paper, a fuzzy regression model based on the weakest t-norm is introduced. The model is a regression model with fuzzy coefficients and fuzzy variables.

A Study on Stochastic Estimation of Monthly Runoff by Multiple Regression Analysis

  • 김태철;정하우
    • Journal of the Korean Society of Agricultural Engineers / Vol. 22, No. 3 / pp. 75-87 / 1980
  • Most hydrologic phenomena are complex, organic products of multiple causes such as climatic and hydrogeological factors. A significant correlation with run-off in a river basin can therefore be expected in advance, and the effect of each of these causal and associated factors (independent variables: present-month rainfall, previous-month run-off, evapotranspiration, relative humidity, etc.) on present-month run-off (dependent variable) can be determined by multiple regression analysis. Functions relating the independent and dependent variables should be fitted repeatedly until a satisfactory, optimal combination of independent variables is obtained. The reliability of the estimated function should be tested against statistical criteria such as analysis of variance, the coefficient of determination, and significance tests of the regression coefficients before the first estimated multiple regression model in historical sequence is determined. Some error between observed and estimated run-off still remains. The error arises because the model is an inadequate description of the system and because the data constituting the record represent only a sample from a population of monthly discharge observations, so that estimates of the model parameters are subject to sampling error. Since this error, a deviation from the multiple regression plane, cannot be explained by the first estimated multiple regression equation, it can be considered a random error governed by the laws of chance. The variance left unexplained by the multiple regression equation can be handled by a stochastic approach: the random error can be stochastically simulated by multiplying the standard error of estimate by a random normal variate. Finally, a hybrid model for estimating monthly run-off in nonhistorical sequence can be obtained by combining the deterministic component of the multiple regression equation with the stochastic component of random errors.
Monthly run-off at Naju station in the Yong-San river basin is estimated by the multiple regression model and the hybrid model. Observed and estimated run-off are compared, and the multiple regression model is compared with existing estimation methods such as the Gajiyama formula, the tank model, and the Thomas-Fiering model. The results are as follows. (1) The optimal function for estimating monthly run-off in historical sequence is the multiple linear regression equation in overall-month unit, that is, $Q_n = 0.788P_n + 0.130Q_{n-1} - 0.273E_n - 0.10$. About 85% of the total variance of monthly run-off can be explained by this equation, and its coefficient of determination ($R^2$) is 0.843. This means that monthly run-off in historical sequence can be estimated with high significance from a short observation record using the above equation. (2) The optimal function for estimating monthly run-off in nonhistorical sequence is the hybrid model combining the multiple linear regression equation in overall-month unit with a stochastic component, that is, $Q_n = 0.788P_n + 0.130Q_{n-1} - 0.273E_n - 0.10 + S_y \cdot t$. The remaining 15% of unexplained variance of monthly run-off can be explained by adding the stochastic process, and somewhat more reliable statistical characteristics of monthly run-off in nonhistorical sequence are derived. The estimated monthly run-off in nonhistorical sequence exhibits, as a random component, extreme values (maximum and minimum) that do not appear in the observed run-off. (3) The "frequency best fit coefficient" ($R^2_f$) of the multiple linear regression equation is 0.847, the same value as that of the Gajiyama formula. This implies that the multiple linear regression equation and the Gajiyama formula are theoretically rather reasonable functions.
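The hybrid model described in this abstract can be sketched as a simple recursion: the regression equation gives the deterministic part, and a standard normal variate scaled by the standard error of estimate gives the stochastic part. The coefficients below are the ones quoted in the abstract. Everything else is an assumption for illustration: the reading of $P_n$, $Q_{n-1}$, and $E_n$ as present-month rainfall, previous-month run-off, and evapotranspiration, the function names, the input series, and the value of `s_y` (the abstract gives no value or units for $S_y$).

```python
import random

def monthly_runoff(p_n, q_prev, e_n):
    """Deterministic component: the regression equation quoted in the abstract."""
    return 0.788 * p_n + 0.130 * q_prev - 0.273 * e_n - 0.10

def simulate_runoff(rainfall, evapotranspiration, q0, s_y, seed=None):
    """Generate a run-off sequence with the hybrid model Q_n = regression + S_y * t.

    q0 is the run-off of the month before the series starts (hypothetical input).
    s_y is the standard error of estimate (hypothetical value, not in the abstract).
    Negative values are not clipped, since the abstract does not say how they
    are handled.
    """
    rng = random.Random(seed)
    q_prev = q0
    series = []
    for p_n, e_n in zip(rainfall, evapotranspiration):
        t = rng.gauss(0.0, 1.0)  # random normal variate
        q_n = monthly_runoff(p_n, q_prev, e_n) + s_y * t
        series.append(q_n)
        q_prev = q_n  # the simulated value feeds the next month
    return series

# Hypothetical inputs, for illustration only.
rain = [100.0, 80.0]
evap = [40.0, 30.0]
deterministic = simulate_runoff(rain, evap, q0=50.0, s_y=0.0)
stochastic = simulate_runoff(rain, evap, q0=50.0, s_y=5.0, seed=1)
```

Setting `s_y=0.0` recovers the purely deterministic regression estimate, while a positive `s_y` adds the random scatter that lets simulated sequences reach extremes absent from the observed record.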

A comparative study of the Gini coefficient estimators based on the regression approach

  • Mirzaei, Shahryar;Borzadaran, Gholam Reza Mohtashami;Amini, Mohammad;Jabbari, Hadi
    • Communications for Statistical Applications and Methods / Vol. 24, No. 4 / pp. 339-351 / 2017
  • Resampling approaches were the first techniques employed to compute a variance for the Gini coefficient; however, many authors have shown that the Gini coefficient and its corresponding variance can be obtained from a regression model. Despite the simplicity of the regression approach for computing a standard error for the Gini coefficient, the use of the proposed regression model has been challenged in economics. Therefore, in this paper, we focus on a comparative study of the regression approach and resampling techniques. The regression method is shown to overestimate the standard error of the Gini index. The simulations show that the Gini estimator based on the modified regression model is also consistent and asymptotically normal, with less divergence from the normal distribution than the resampling techniques.

COMPARISON OF VARIABLE SELECTION AND STRUCTURAL SPECIFICATION BETWEEN REGRESSION AND NEURAL NETWORK MODELS FOR HOUSEHOLD VEHICULAR TRIP FORECASTING

  • Yi, Jun-Sub
    • Journal of applied mathematics & informatics / Vol. 6, No. 2 / pp. 599-609 / 1999
  • Neural networks are explored as an alternative to a regression model for predicting the number of daily household vehicular trips. This study focuses on contrasting a neural network model with a regression model in terms of variable selection as well as the application of these models to the prediction of extreme observations. The differences between the models regarding data transformation, variable selection, and multicollinearity are considered. The results indicate that the neural network model is a viable alternative to the regression model for addressing both messy data problems and limitations in variable structure specification.