• Title/Summary/Keyword: Multiple regression model

Orographic Precipitation Analysis with Regional Frequency Analysis and Multiple Linear Regression (지역빈도해석 및 다중회귀분석을 이용한 산악형 강수해석)

  • Yun, Hye-Seon;Um, Myoung-Jin;Cho, Won-Cheol;Heo, Jun-Haeng
    • Journal of Korea Water Resources Association / v.42 no.6 / pp.465-480 / 2009
  • In this study, single and multiple linear regression models were used to derive the relationship between precipitation and altitude, latitude, and longitude in Jejudo. The single linear regression analysis examined whether an orographic effect exists in Jejudo, based on annual average precipitation. The multiple linear regression analysis examined whether the orographic effect applies to each duration and return period of the quantiles obtained from regional frequency analysis by the index flood method. The regression results show that the relationship between altitude and precipitation becomes more strongly linear as the duration and return period increase. The multiple linear regression precipitation estimates (which used altitude, latitude, and longitude information) were found to be more reasonable than estimates obtained using altitude only, altitude and latitude, or altitude and longitude. In particular, spatial distribution analysis by the kriging method using GIS also gave realistic precipitation estimates, with precipitation concentrated in the southeast region, consistent with the actual climate of Jejudo. However, the accuracy of the regression model decreased for short-duration precipitation and for precipitation in high-altitude regions, even at long durations. Consequently, other factors that cause the orographic effect would need to be considered to improve the accuracy of precipitation estimates.
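
The comparison described in this abstract (altitude only versus altitude, latitude, and longitude) can be illustrated with a minimal sketch. It assumes ordinary least squares via NumPy and uses synthetic station values invented here for illustration; the function names `fit_mlr` and `r_squared` are hypothetical, and nothing below reproduces the paper's data or results.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef  # [intercept, slope for each column of X]

def r_squared(X, y, coef):
    """Coefficient of determination of the fitted model."""
    A = np.column_stack([np.ones(len(X)), X])
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

# Synthetic stations (hypothetical values, NOT data from the paper).
rng = np.random.default_rng(0)
n = 40
altitude = rng.uniform(0.0, 1900.0, n)      # metres
latitude = rng.uniform(33.2, 33.6, n)       # degrees north
longitude = rng.uniform(126.1, 127.0, n)    # degrees east
precip = (1500.0 + 0.8 * altitude
          - 400.0 * (latitude - 33.4)
          + 300.0 * (longitude - 126.5)
          + rng.normal(0.0, 50.0, n))       # mm, with noise

X_alt = altitude[:, None]                                  # altitude only
X_full = np.column_stack([altitude, latitude, longitude])  # all three predictors

coef_alt = fit_mlr(X_alt, precip)
coef_full = fit_mlr(X_full, precip)
r2_alt = r_squared(X_alt, precip, coef_alt)
r2_full = r_squared(X_full, precip, coef_full)
print(f"R^2 altitude only: {r2_alt:.3f}, altitude+lat+lon: {r2_full:.3f}")
```

Because the altitude-only model is nested in the three-predictor model, the in-sample R² of the larger model can never be lower; judging whether the extra predictors are worthwhile would need an adjusted or out-of-sample criterion.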

CNN-LSTM Coupled Model for Prediction of Waterworks Operation Data

  • Cao, Kerang;Kim, Hangyung;Hwang, Chulhyun;Jung, Hoekyung
    • Journal of Information Processing Systems / v.14 no.6 / pp.1508-1520 / 2018
  • In this paper, we propose an improved model to provide users with better long-term prediction of waterworks operation data. Existing prediction models have taken various forms, such as multiple linear regression models that consider time, day, and seasonal characteristics. However, the existing models show insufficient prediction rates for demand fluctuation and for long-term prediction. Among deep learning models in particular, the long short-term memory (LSTM) model has been applied to predict water purification plant data because its time series prediction is highly reliable. However, it is necessary to reflect the correlation among various related factors, and a supplementary model is needed to improve long-term predictability. In this paper, a convolutional neural network (CNN) model is introduced to select input variables that have the necessary correlation and to improve the long-term prediction rate, thus increasing the prediction rate by combining it with the LSTM predicted values. In addition, a multiple linear regression model is applied to combine the predictions of the CNN and the LSTM, which yields the final predicted outcome.

Wage Determinants Analysis by Quantile Regression Tree

  • Chang, Young-Jae
    • Communications for Statistical Applications and Methods / v.19 no.2 / pp.293-301 / 2012
  • Quantile regression, proposed by Koenker and Bassett (1978), is a statistical technique that estimates conditional quantiles. Its advantage over ordinary least squares (OLS) regression is robustness to large outliers. A regression tree approach has been applied to OLS problems to fit flexible models. Loh (2002) proposed the GUIDE algorithm, which has negligible selection bias and relatively low computational cost. Quantile regression can be regarded as an analogue of OLS; therefore, it can also be applied within the GUIDE regression tree method. Chaudhuri and Loh (2002) proposed a nonparametric quantile regression method that blends key features of piecewise polynomial quantile regression and tree-structured regression based on adaptive recursive partitioning. Lee and Lee (2006) investigated wage determinants in the Korean labor market using the Korean Labor and Income Panel Study (KLIPS). Following Lee and Lee, we fit three kinds of quantile regression tree models to the KLIPS data at the quantiles 0.05, 0.2, 0.5, 0.8, and 0.95. Among the three models, the multiple linear piecewise quantile regression model forms the shortest tree structure, while the piecewise constant quantile regression model generally has a deeper tree structure with more terminal nodes. Age, gender, marriage status, and education appear to be determinants of the wage level across the quantiles; in addition, education experience appears as an important determinant of the wage level in the highly paid group.
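
The robustness claim in this abstract follows from the loss that quantile regression minimizes. The sketch below shows the Koenker-Bassett check (pinball) loss and the constant that minimizes it, which is what a piecewise constant quantile model would fit within a single node. It is an illustration only: the function names are hypothetical, the toy data are invented, and it is not the GUIDE algorithm or the paper's fitting procedure.

```python
import numpy as np

def check_loss(residuals, tau):
    """Check (pinball) loss: sum of r * (tau - 1[r < 0]) over residuals r."""
    r = np.asarray(residuals, dtype=float)
    return float(np.sum(r * (tau - (r < 0))))

def constant_quantile_fit(y, tau):
    """Best constant under the check loss, searched over the observed values.

    A minimizer of the check loss always exists among the data points,
    so scanning the observations is enough for this small illustration.
    """
    y = np.asarray(y, dtype=float)
    losses = [check_loss(y - c, tau) for c in y]
    return float(y[int(np.argmin(losses))])

# Toy sample with one large outlier (hypothetical values).
wages = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = constant_quantile_fit(wages, 0.5)  # unaffected by the outlier
upper_fit = constant_quantile_fit(wages, 0.9)   # tracks the upper tail
mean_fit = float(np.mean(wages))                # OLS constant, pulled upward
print(median_fit, upper_fit, mean_fit)
```

The median fit stays at a typical value while the mean is dragged toward the outlier, which is the sense in which quantile regression is robust compared with OLS.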

Short-term Peak Power Demand Forecasting using Model in Consideration of Weather Variable (기상 변수를 고려한 모델에 의한 단기 최대전력수요예측)

  • 고희석;이충식;최종규;지봉호
    • Journal of the Institute of Convergence Signal Processing / v.2 no.3 / pp.73-78 / 2001
  • A backpropagation (BP) neural network model and a multiple-regression model were constructed to forecast special-day loads. Special-day load was forecast with a neural network model using the pattern conversion ratio and a multiple-regression model using the weekday-change ratio. The method was found suitable: short- and long-term special-day loads were forecast with a weekly average percentage error of 1 to 2% by the weekly peak load forecasting model using the pattern conversion ratio. However, this method had difficulty forecasting special-day loads in summer, so those loads were forecast with the multiple-regression models. These models used the weekday-change ratio, the temperature-humidity, and the discomfort index as explanatory variables. The method was found suitable because the monthly average percentage errors of the weekday load forecasts and the special-day load forecasts were similar. In addition, the fit of the presented forecasting models was confirmed by statistical tests. Because the fit of the presented special-day load forecasting methods was confirmed, a major difficulty in peak load forecasting has been addressed.

Identifying Factors for Corn Yield Prediction Models and Evaluating Model Selection Methods

  • Chang Jiyul;Clay David E.
    • KOREAN JOURNAL OF CROP SCIENCE / v.50 no.4 / pp.268-275 / 2005
  • Early predictions of crop yields can provide information that helps producers take advantage of market opportunities, assess national food security, and give early warning of food shortages. The objectives of this study were to identify the most useful parameters for estimating yields and to compare two model selection methods for finding the 'best' model developed by multiple linear regression. This research was conducted in two 65 ha corn/soybean rotation fields located in east central South Dakota. Data used to develop the models were small temporal variability information (STVI: elevation, apparent electrical conductivity $(EC_a)$, slope), large temporal variability information (LTVI: inorganic N, Olsen P, soil moisture), and remote sensing information (green, red, and NIR bands, normalized difference vegetation index (NDVI), and green normalized difference vegetation index (GDVI)). Second-order Akaike's Information Criterion (AICc) and stepwise multiple regression were used to develop the best-fitting equations in each system (information group). Models with $\Delta_i \leq 2$ were retained, giving 22 and 37 models at Moody and Brookings, respectively. Based on the results, the most useful variables for estimating corn yield differed between fields. Elevation and $EC_a$ were consistently the most useful variables in both fields and in most of the systems. Model selection differed between fields, and different numbers of variables were selected in different fields. These results might be attributed to the different landscapes and management histories of the study fields. The most common variables selected by AICc and by stepwise regression were different. In validation, stepwise regression was slightly better than AICc at Moody, and AICc was slightly better than stepwise regression at Brookings. The results suggest that the AICc approach can be used to identify the most useful information and select the 'best' yield models for production fields.
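
The $\Delta_i \leq 2$ rule mentioned in this abstract can be sketched as follows, assuming the usual least-squares form of AICc, where `k` counts every estimated parameter (regression coefficients, intercept, and error variance). The function names and the example scores are hypothetical and do not come from the paper.

```python
import math

def aicc_least_squares(rss, n, k):
    """Second-order AIC for a least-squares fit.

    AIC  = n * ln(RSS / n) + 2k
    AICc = AIC + 2k(k + 1) / (n - k - 1)
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > k + 1")
    aic = n * math.log(rss / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)

def delta_aicc(scores):
    """Delta_i = AICc_i - min(AICc) for a dict of model name -> AICc."""
    best = min(scores.values())
    return {name: score - best for name, score in scores.items()}

def supported_models(scores, threshold=2.0):
    """Names of models whose Delta_i is at most the threshold."""
    deltas = delta_aicc(scores)
    return sorted(name for name, d in deltas.items() if d <= threshold)

# Hypothetical AICc scores for three candidate yield models.
example_scores = {"elevation": 10.0, "elevation+ECa": 11.5, "all": 13.0}
print(supported_models(example_scores))
```

The correction term grows as `k` approaches `n`, so AICc penalizes large models more heavily than AIC when the sample is small relative to the number of parameters.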

Regression Quantile Estimations on Censored Survival Data

  • Shim, Joo-Yong
    • Journal of the Korean Data and Information Science Society / v.13 no.2 / pp.31-38 / 2002
  • In this paper, we study regression quantile estimation for the case of multiple survival times, possibly censored, at each covariate vector. The estimation is based on the empirical distribution functions of the censored times and the sample quantiles of the observed survival times at each covariate vector, and the weighted least squares method is applied to estimate the regression quantile. The estimators are shown to be asymptotically normally distributed under some regularity conditions.

Detection of Change-Points by Local Linear Regression Fit

  • Kim, Jong Tae;Choi, Hyemi;Huh, Jib
    • Communications for Statistical Applications and Methods / v.10 no.1 / pp.31-38 / 2003
  • A simple method is proposed to detect the number of change points and to test the location and size of multiple change points with jump discontinuities in an otherwise smooth regression model. The proposed estimators are based on local linear regression fits, comparing left and right one-sided kernel smoothers. The proposed methodology is explained and applied to real and simulated data.
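
The idea of comparing left and right one-sided fits can be sketched for a single jump. This is a simplified illustration, not the authors' estimator or test: the triangular kernel, the bandwidth, the grid, the function names, and the simulated data are all assumptions made here.

```python
import numpy as np

def one_sided_local_linear(x, y, x0, h, side):
    """Local linear estimate at x0 using only data on one side of x0."""
    u = (x - x0) / h
    if side == "left":
        mask = (u >= -1) & (u < 0)
    else:
        mask = (u > 0) & (u <= 1)
    if mask.sum() < 3:
        return np.nan
    w = 1.0 - np.abs(u[mask])  # triangular kernel weights (assumed choice)
    A = np.column_stack([np.ones(mask.sum()), x[mask] - x0])
    sw = np.sqrt(w)
    # Weighted least squares; the intercept is the fitted value at x0.
    beta, *_ = np.linalg.lstsq(A * sw[:, None], y[mask] * sw, rcond=None)
    return beta[0]

def jump_estimate(x, y, h, grid):
    """Location and size of the largest right-minus-left difference."""
    diffs = np.array([
        one_sided_local_linear(x, y, g, h, "right")
        - one_sided_local_linear(x, y, g, h, "left")
        for g in grid
    ])
    i = int(np.nanargmax(np.abs(diffs)))
    return float(grid[i]), float(diffs[i])

# Simulated data (hypothetical): smooth curve plus a jump of size 2 at x = 0.6.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 201)
y = np.sin(2 * np.pi * x) + 2.0 * (x > 0.6) + rng.normal(0.0, 0.05, x.size)

loc, size = jump_estimate(x, y, h=0.1, grid=np.linspace(0.1, 0.9, 161))
print(f"estimated jump at x = {loc:.3f}, size = {size:.2f}")
```

Away from a jump, the left and right fits estimate the same smooth value, so their difference is close to zero; at a jump, the difference estimates the jump size. Handling several change points would require repeating the search or thresholding the difference curve, which this sketch does not attempt.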

Prediction of Future Sea Surface Temperature around the Korean Peninsular based on Statistical Downscaling (통계적 축소법을 이용한 한반도 인근해역의 미래 표층수온 추정)

  • Ham, Hee-Jung;Kim, Sang-Su;Yoon, Woo-Seok
    • Journal of Industrial Technology / v.31 no.B / pp.107-112 / 2011
  • Recently, climate change around the world due to global warming has become an important issue, and the damage caused by climate change adversely affects human life. Changes in sea surface temperature (SST) are associated with natural disasters such as typhoons and El Niño. We therefore predicted future daily SST using a statistical downscaling method and the CGCM 3.1 A1B scenario. Nine points around the Korean Peninsula were selected to predict future SST, and a regression model was built using multiple linear regression. CGCM 3.1 was simulated with the regression model, and by comparing probability density functions, box plots, and summary statistics to evaluate the suitability of the regression models, it was confirmed that the regression models were built properly.

Estimating the Total Precipitation Amount with Simulated Precipitation for Ungauged Stations in Jeju Island (미계측 관측 강수 자료 생성을 통한 제주도 지역의 수문총량 추정)

  • Kim, Nam-Won;Um, Myoung-Jin;Chung, Il-Moon;Heo, Jun-Haeng
    • Journal of Korea Water Resources Association / v.45 no.9 / pp.875-885 / 2012
  • In this study, the total precipitation amount in Jeju Island was estimated using spatial precipitation analysis, with simulated precipitation filling in for ungauged stations that lack precipitation data. The missing data were generated through a modified multiple linear regression, and the spatial precipitation analysis was conducted with PRISM (Parameter-elevation Regression on Independent Slope Model). The data generated with the modified multiple linear regression model have a pattern similar to that of the original data; thus, the model shows good applicability for estimating missing data. The difference in annual average precipitation between Case 1 (original data) and Case 2 (modified data) is very small, about 1.5%. However, the difference in annual average precipitation by elevation is large, up to 37.4%. As a result, the method of estimating missing data in this study would be useful for calculating the total precipitation amount in areas with low station density and in places with high spatial variation of precipitation.

Development of Accident Density Model in Korea (국내 교통사고 밀도 모형 개발)

  • Park, Na Young;Kim, Tae Yang;Park, Byung Ho
    • Journal of the Korean Society of Safety / v.32 no.3 / pp.130-135 / 2017
  • This study deals with traffic accidents. Its purpose is to develop accident density models reflecting transportation and socioeconomic characteristics, based on 230 zones in Korea. Statistically significant models were developed through multiple linear regression analysis. The main results are as follows. First, in the transportation-based model, road length, avenue ratio, and the numbers of intersections and tunnels have positive effects, whereas school zones have a negative effect. Second, in the socioeconomic-based model, population density, the transportation-vulnerable ratio, the children ratio, and the truck ratio have positive effects. Finally, in the integrated models, road ratio, population density, the transportation-vulnerable ratio, the children ratio, the truck ratio, and the number of companies have positive effects, whereas school zones have a negative effect. These results are expected to offer useful implications for accident-reduction policy-making.