• Title/Summary/Keyword: regression coefficient estimation methods

Search Result 62

Comparison of Different Multiple Linear Regression Models for Real-time Flood Stage Forecasting (실시간 수위 예측을 위한 다중선형회귀 모형의 비교)

  • Choi, Seung Yong;Han, Kun Yeun;Kim, Byung Hyun
    • KSCE Journal of Civil and Environmental Engineering Research, v.32 no.1B, pp.9-20, 2012
  • To overcome the limitations of conceptual, hydrological, and physics-based models for flood stage forecasting, multiple linear regression models, a class of data-driven models, have recently been widely adopted for forecasting flood streamflow (stage). The objectives of this study are to compare the performance of multiple linear regression models built with different regression coefficient estimation methods and to determine the most effective multiple linear regression model for flood stage forecasting. To do this, the time scale was determined through autocorrelation analysis of the input data, and flood stage forecasting models developed with the LS (least squares), WLS (weighted least squares), and SPW (stepwise) coefficient estimation methods were applied to flood events in the Jungrang stream. Four statistical indices were used to evaluate the models: root mean square error (RMSE), Nash-Sutcliffe efficiency coefficient (NSEC), mean absolute error (MAE), and adjusted coefficient of determination ($R^{*2}$). The results show that the model using SPW parameter estimation predicts river flood stage better than the others, and that the model using LS parameter estimation is slightly better than the model using WLS parameter estimation.
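The four indices named in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names are hypothetical, and the adjusted coefficient of determination is computed here from 1 - SSE/SST with the usual degrees-of-freedom correction, which may differ from the exact definition used in the paper.

```python
import math


def rmse(obs, sim):
    """Root mean square error between observed and simulated stages."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mae(obs, sim):
    """Mean absolute error between observed and simulated stages."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / SST, where SST uses the observed mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def adjusted_r2(obs, sim, n_predictors):
    """Adjusted R^2 (assumption: R^2 is taken as 1 - SSE/SST)."""
    n = len(obs)
    r2 = nse(obs, sim)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


# Hypothetical stage values in metres, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.1, 1.9, 3.2, 3.8]
print(rmse(observed, simulated), mae(observed, simulated),
      nse(observed, simulated), adjusted_r2(observed, simulated, 1))
```

Lower RMSE and MAE and higher NSEC and adjusted $R^2$ indicate a better forecast, which is the sense in which the abstract ranks the SPW, LS, and WLS models.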

Efficient estimation and variable selection for partially linear single-index-coefficient regression models

  • Kim, Young-Ju
    • Communications for Statistical Applications and Methods, v.26 no.1, pp.69-78, 2019
  • A structured model with both single-index and varying coefficients is a powerful tool in modeling high dimensional data. It has been widely used because the single-index can overcome the curse of dimensionality and varying coefficients can allow nonlinear interaction effects in the model. For high dimensional index vectors, variable selection becomes an important question in the model building process. In this paper, we propose an efficient estimation and a variable selection method based on a smoothing spline approach in a partially linear single-index-coefficient regression model. We also propose an efficient algorithm for simultaneously estimating the coefficient functions in a data-adaptive lower-dimensional approximation space and selecting significant variables in the index with the adaptive LASSO penalty. The empirical performance of the proposed method is illustrated with simulated and real data examples.

Assessment of slope stability using multiple regression analysis

  • Marrapu, Balendra M.;Jakka, Ravi S.
    • Geomechanics and Engineering, v.13 no.2, pp.237-254, 2017
  • Estimation of slope stability is a very important task in geotechnical engineering. However, its estimation using conventional and soft computing methods has several drawbacks. Conventional limit equilibrium methods for evaluating slope stability are tedious and time consuming, while soft computing approaches such as Artificial Neural Networks and Fuzzy Logic are black box approaches. Multiple Regression (MR) analysis provides an alternative to conventional and soft computing methods for the evaluation of slope stability. MR models provide a simplified equation that can be used to calculate the critical factor of safety of slopes without any iterative procedure, thereby reducing the time and complexity involved in the evaluation of slope stability. In the present study, a multiple regression model has been developed and its accuracy in estimating slope stability has been tested using real field data. Two separate multiple regression models have been developed, for dry and wet slopes. Further, the accuracy of the developed models has been compared and validated against conventional limit equilibrium methods in terms of Mean Square Error (MSE) and coefficient of determination ($R^2$). As the developed MR models are not based on any region-specific data and cover a wide range of parametric variations, they can be directly applied to any real slope.

Efficient Noise Estimation for Speech Enhancement in Wavelet Packet Transform

  • Jung, Sung-Il;Yang, Sung-Il
    • The Journal of the Acoustical Society of Korea, v.25 no.4E, pp.154-158, 2006
  • In this paper, we suggest a noise estimation method for speech enhancement in nonstationary noisy environments. The proposed method consists of two main processes. First, to reduce the influence of fluctuating signals, a best-fitting regression line is used, obtained by applying the least squares method to the coefficient magnitudes in a node of a uniform wavelet packet transform. Next, to update the noise estimate efficiently, a differential forgetting factor and a correlation coefficient per subband are used, where the subband is employed to apply a weight according to the change of the signals. In particular, this method can update the noise estimate using only the estimated noise of the previous frame, without using statistical information from long past frames or explicit nonspeech frames identified by a voice activity detector. In objective assessments, the proposed method performed better than the compared methods (minima controlled recursive averaging and weighted average). Furthermore, the method showed reliable results even at low SNR.
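The first step described above, fitting a regression line to coefficient magnitudes by least squares, can be illustrated with a minimal sketch. The function name `fit_line` and the choice of the coefficient index as the regressor are assumptions made for illustration; the abstract does not specify these details.

```python
def fit_line(magnitudes):
    """Ordinary least-squares line through (index, magnitude) pairs.

    Returns (slope, intercept). Assumes at least two values and uses the
    position 0, 1, 2, ... of each coefficient as the x variable.
    """
    n = len(magnitudes)
    if n < 2:
        raise ValueError("need at least two coefficient magnitudes")
    x_mean = (n - 1) / 2.0
    y_mean = sum(magnitudes) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(magnitudes))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Hypothetical magnitudes of wavelet packet coefficients in one node.
slope, intercept = fit_line([1.0, 3.0, 5.0, 7.0])
smoothed = [intercept + slope * i for i in range(4)]
print(slope, intercept, smoothed)
```

The fitted line replaces the raw magnitudes with a smooth trend, which is one plausible reading of how the method becomes less sensitive to rapidly varying signal components.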

Estimation of Upstream Ungauged Watershed Streamflow using Downstream Discharge Data (하류 유량자료를 이용한 상류유역의 미계측 유출량 추정)

  • Jung, Young Hun;Jung, Chung Gil;Jung, Sung Won;Park, Jong Yoon;Kim, Seong Joon
    • Journal of The Korean Society of Agricultural Engineers, v.54 no.6, pp.169-176, 2012
  • This study describes the estimation of upstream ungauged watershed streamflow using downstream discharge data. For the downstream Dongchon (DC) and upstream Kumho (KH) water level stations in the Kumho river basin (2,087.9 $km^2$), three methods were evaluated: Soil and Water Assessment Tool (SWAT) modeling, the drainage-area ratio method, and a regional regression equation. SWAT was calibrated at DC with a coefficient of determination ($R^2$) of 0.70 and validated at KH with an $R^2$ of 0.60. The drainage-area ratio method showed an $R^2$ of 0.93. For the regional regression, the watershed area, average slope, and stream length were used as variables. The equation derived at DC estimated the flow at KH with a maximum error of 41.2 % relative to the observed streamflow.
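The drainage-area ratio method mentioned above is commonly written as $Q_u = Q_g (A_u / A_g)^b$, where the gauged flow $Q_g$ is scaled by the ratio of the ungauged to the gauged drainage area. The sketch below assumes that general form with an exponent of 1 by default; the function name, the exponent parameter, and the sample areas and flows are hypothetical and are not taken from the paper.

```python
def drainage_area_ratio(q_gauged, area_gauged, area_ungauged, exponent=1.0):
    """Transfer a gauged flow series to an ungauged site by area scaling.

    q_gauged      : list of flows at the gauged station (e.g. m^3/s)
    area_gauged   : drainage area of the gauged station (e.g. km^2)
    area_ungauged : drainage area of the ungauged site (same unit)
    exponent      : scaling exponent b (assumption: 1.0 unless calibrated)
    """
    if area_gauged <= 0 or area_ungauged <= 0:
        raise ValueError("drainage areas must be positive")
    scale = (area_ungauged / area_gauged) ** exponent
    return [q * scale for q in q_gauged]


# Hypothetical values: a downstream gauge draining 2000 km^2 and an
# upstream site draining 1000 km^2.
print(drainage_area_ratio([100.0, 50.0], 2000.0, 1000.0))
```

With an exponent of 1 the method assumes that runoff per unit area is the same at both sites, which is why it tends to work best when the two watersheds are nested and hydrologically similar.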

A STUDY ON THE PARAMETER ESTIMATION OF SNYDER-TYPE SYNTHETIC UNIT-HYDROGRAPH DEVELOPMENT IN KUM RIVER BASIN

  • Jeong, Sang-man;Park, Seok-Chae;Lee, Joo-Heon
    • Water Engineering Research, v.2 no.4, pp.219-229, 2001
  • Synthetic unit hydrograph equations, such as the Snyder and SCS synthetic unit hydrographs, have long and frequently been used for rainfall-runoff analysis and design flood estimation. The major inputs to the Snyder and SCS synthetic unit hydrographs are lag time and peak coefficient. In this study, the methods for estimating lag time and peak coefficient for small watersheds proposed by Zhao and McEnroe (1999) were applied to the Kum river basin in Korea. We investigated the lag times of relatively small watersheds in the basin. For this investigation, recent rainfall and streamflow data were gathered for 10 relatively small watersheds with drainage areas ranging from 134 to 902 square kilometers. In total, 250 flood events were identified, and the lag time of each flood event was determined from the rainfall and streamflow data. Lag time is closely related to basin characteristics of a given drainage area such as channel length, channel slope, and drainage area. A regression analysis was conducted to relate lag time to the watershed characteristics. The resulting regression model is as shown below: ※ see full text (equations) In the model, Tlag is the lag time in hours, Lc is the length of the main river in kilometers, and Se is the equivalent channel slope of the main channel. The coefficient of determination ($r^2$) of the regression equation is 0.846. The peak coefficient is not significantly correlated with any of the watershed characteristics. We recommend a peak coefficient of 0.60 as input to the Snyder unit-hydrograph model for the ungauged Kum river watersheds.

A Study on Stochastic Estimation of Monthly Runoff by Multiple Regression Analysis (다중회귀분석에 의한 하천 월 유출량의 추계학적 추정에 관한 연구)

  • 김태철;정하우
    • Magazine of the Korean Society of Agricultural Engineers, v.22 no.3, pp.75-87, 1980
  • Most hydrologic phenomena are complex, organic products of multiple causes such as climatic and hydro-geological factors. A significant correlation with run-off in a river basin can be expected, and the effect of each causal and associated factor (independent variables: present-month rainfall, previous-month run-off, evapotranspiration, relative humidity, etc.) on present-month run-off (dependent variable) can be determined by multiple regression analysis. The functions relating independent and dependent variables should be refitted repeatedly until a satisfactory, optimal combination of independent variables is obtained. Before the first estimated multiple regression model in historical sequence is adopted, the reliability of the estimated function should be tested with statistical criteria such as analysis of variance, the coefficient of determination, and significance tests of the regression coefficients. Some error between observed and estimated run-off nevertheless remains. The error arises because the model is an inadequate description of the system and because the data constituting the record represent only a sample from a population of monthly discharge observations, so that estimates of the model parameters are subject to sampling errors. Since this error, a deviation from the multiple regression plane, cannot be explained by the first estimated multiple regression equation, it can be considered a random error governed by the laws of chance. The variance left unexplained by the multiple regression equation can be handled by a stochastic approach: the random error can be stochastically simulated by multiplying the standard error of estimate by a random normal variate. Finally, a hybrid model for estimating monthly run-off in nonhistorical sequence can be obtained by combining the deterministic component of the multiple regression equation and the stochastic component of the random errors.
Monthly run-off at Naju station in the Yong-San river basin is estimated by the multiple regression model and the hybrid model. Observed and estimated run-off are compared, and the multiple regression model is compared with existing estimation methods such as the Gajiyama formula, the tank model, and the Thomas-Fiering model. The results are as follows. (1) The optimal function for estimating monthly run-off in historical sequence is the multiple linear regression equation in overall-month unit, that is, $Q_n = 0.788P_n + 0.130Q_{n-1} - 0.273E_n - 0.10$. About 85% of the total variance of monthly run-off can be explained by the multiple linear regression equation, and its coefficient of determination ($R^2$) is 0.843. This means that monthly run-off in historical sequence can be estimated with high significance from a short observation record using the above equation. (2) The optimal function for estimating monthly run-off in nonhistorical sequence is the hybrid model combining the multiple linear regression equation in overall-month unit with a stochastic component, that is, $Q_n = 0.788P_n + 0.130Q_{n-1} - 0.273E_n - 0.10 + S_y \cdot t$. The remaining 15% of unexplained variance of monthly run-off can be accounted for by adding the stochastic process, and somewhat more reliable statistical characteristics of monthly run-off in nonhistorical sequence are derived. The estimated monthly run-off in nonhistorical sequence exhibits extreme values (maximum, minimum) that do not appear in the observed run-off, as a result of the random component. (3) The "frequency best fit coefficient" ($R^2_f$) of the multiple linear regression equation is 0.847, the same value as that of the Gajiyama formula. This implies that the multiple linear regression equation and the Gajiyama formula are theoretically rather reasonable functions.
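The hybrid model in result (2) adds a random term, the standard error of estimate multiplied by a standard normal variate, to the regression equation of result (1). A minimal sketch follows, using the coefficients quoted in the abstract. It assumes that P, Q, and E denote present-month rainfall, run-off, and evapotranspiration (inferred from the variables the abstract lists); the standard error value, the input values, and the function names are hypothetical, since the abstract does not give them.

```python
import random


def deterministic_runoff(p_n, q_prev, e_n):
    """Regression part of the model, with coefficients from the abstract."""
    return 0.788 * p_n + 0.130 * q_prev - 0.273 * e_n - 0.10


def hybrid_runoff(p_n, q_prev, e_n, std_error, rng):
    """Regression estimate plus a stochastic term std_error * t.

    std_error : standard error of estimate (hypothetical value here)
    rng       : random.Random instance, so runs can be reproduced
    """
    t = rng.gauss(0.0, 1.0)  # standard normal variate
    return deterministic_runoff(p_n, q_prev, e_n) + std_error * t


# Hypothetical monthly inputs, in whatever consistent units the paper uses.
rng = random.Random(42)
print(deterministic_runoff(100.0, 50.0, 40.0))
print(hybrid_runoff(100.0, 50.0, 40.0, std_error=5.0, rng=rng))
```

Because the random term is unbounded, a generated series can contain values above or below anything in the observed record, which matches the remark in the abstract about extreme values. This sketch applies no lower bound, so very dry months could yield negative values that a practical implementation would need to handle.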

Estimation on Modified Proportional Hazards Model

  • Lee, Kwang-Ho;Lee, Mi-Sook
    • Journal of the Korean Data and Information Science Society, v.5 no.1, pp.59-66, 1994
  • Heller and Simonoff (1990) compared several methods of estimating the regression coefficient in a modified proportional hazards model when the response variable is subject to censoring. We give another method of estimating the parameters in the model, which also allows the dependent variable to be censored and the error distribution to be unspecified. The proposed method differs from that of Miller (1976) and that of Buckley and James (1979). We also obtain a variance estimator of the coefficient estimator and compare it with the Buckley-James variance estimator studied by Hillis (1993).

A Study on the Maneuvering Hydrodynamic Derivatives Estimation Applied the Stern Shape of a Vessel (선미 형상을 반영한 조종 유체력 미계수 추정에 관한 연구)

  • Yoon, Seung-Bae;Kim, Dong-Young;Kim, Sang-Hyun
    • Journal of the Society of Naval Architects of Korea, v.53 no.1, pp.76-83, 2016
  • Various model tests are carried out to estimate and verify ship performance at the design stage. However, in view of the cost, applying model tests to every project vessel is very inefficient. Therefore, other methods of predicting maneuverability from limited data are required at the initial design stage. The purpose of this study is to estimate the hydrodynamic derivatives using multiple regression analysis and PMM test data. The characteristics of the stern shape, which has an important effect on maneuverability, are included in the regression analysis. A correlation analysis is performed to select the hull form coefficients and stern shape factors used as variables in the regression analysis. The estimated results are compared with model test results for two ships to investigate the effectiveness of estimating the maneuvering hydrodynamic derivatives with the stern shape taken into account. The study verifies that estimation using the stern shape factors as variables is valid when the stern shape factors are located in the center of the database.

A comparison on coefficient estimation methods in single index models (단일지표모형에서 계수 추정방법의 비교)

  • Choi, Young-Woong;Kang, Kee-Hoon
    • Journal of the Korean Data and Information Science Society, v.21 no.6, pp.1171-1180, 2010
  • It is well known that the asymptotic convergence rate of a nonparametric regression estimator gets worse as the dimension of the covariates gets larger. One possible way to overcome this problem is to reduce the dimension of the covariates by using single index models. Two coefficient estimation methods for single index models are introduced. One is the semiparametric least squares estimation method, which finds an approximate solution by iterative computation. The other is the weighted average derivative estimation method, which is non-iterative. Both methods attain the parametric rate of convergence to a normal distribution. However, a practical comparison of the two methods has not yet been made. In this article, we compare these methods by examining the variances of the estimators in various models.