• Title/Summary/Keyword: multiple regression analysis model (다중회귀분석모형)

Flood risk index optimization using multiple linear regression (다중선형회귀를 이용한 홍수위험지수 최적화)

  • Kim, Myojeong;Kim, Gwangseob
    • Proceedings of the Korea Water Resources Association Conference / 2016.05a / pp.283-283 / 2016
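  • As the regional impacts of climate change increase the intensity and frequency of heavy rainfall, a range of technologies is needed to respond to water-related disasters, and the analysis and assessment of flood vulnerability in particular must come first. In this study, the existing PSR (Pressure-State-Response) model and DPSIR (Driving force-Pressure-State-Impact-Response) model were optimized using multiple linear regression (Fig. 1). The study period is 2008 to 2013. In mod 1, optimal weights were estimated by applying multiple linear regression year by year; in mod 2, optimal weights were estimated by applying multiple linear regression to the entire study period (2008 to 2013).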

Procedure for the Selection of Principal Components in Principal Components Regression (주성분회귀분석에서 주성분선정을 위한 새로운 방법)

  • Kim, Bu-Yong;Shin, Myung-Hee
    • The Korean Journal of Applied Statistics / v.23 no.5 / pp.967-975 / 2010
  • Since least squares estimation is not appropriate when multicollinearity exists among the regressors of the linear regression model, principal components regression is used to deal with the multicollinearity problem. This article suggests a new procedure for the selection of suitable principal components. The procedure is based on the condition index instead of the eigenvalue. The principal components corresponding to the indices are removed from the model if any condition indices are larger than the upper limit of the cutoff value. On the other hand, the corresponding principal components are included if any condition indices are smaller than the lower limit. The forward inclusion method is employed to select proper principal components if any condition indices are between the upper limit and the lower limit. The limits are obtained from a linear model constructed on the basis of conjoint analysis. The procedure is evaluated by Monte Carlo simulation in terms of the mean square error of the estimator. The simulation results indicate that the proposed procedure is superior to the existing methods.
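
The three-way screening rule described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the condition index is taken here as sqrt(largest eigenvalue / eigenvalue j) of the regressors' correlation matrix, the cutoffs `lower=10` and `upper=30` are placeholders (the paper derives its limits from a conjoint-analysis model that is not reproduced here), and the function name `classify_components` is hypothetical. The forward inclusion step for the middle group is left out.

```python
import numpy as np

def classify_components(X, lower=10.0, upper=30.0):
    """Sort principal components into keep / candidate / drop by condition index."""
    # Standardize the regressors so the eigenvalues come from the correlation matrix.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = Xs.T @ Xs / (X.shape[0] - 1)
    eigvals = np.linalg.eigvalsh(corr)[::-1]      # eigenvalues in descending order
    cond_idx = np.sqrt(eigvals[0] / eigvals)      # assumed definition of the index
    groups = {"keep": [], "candidate": [], "drop": []}
    for j, ci in enumerate(cond_idx):
        if ci > upper:
            groups["drop"].append(j)              # above the upper limit: remove
        elif ci < lower:
            groups["keep"].append(j)              # below the lower limit: include
        else:
            groups["candidate"].append(j)         # in between: left for forward inclusion
    return cond_idx, groups

# Synthetic regressors (illustrative only): x3 is almost an exact sum of x1 and x2.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + x2 + 0.01 * rng.normal(size=200)
X = np.column_stack([x1, x2, x3])

cond_idx, groups = classify_components(X)
print(cond_idx)
print(groups)
```

With the near-collinear third column, the last component gets a very large condition index and lands in the "drop" group, while the first two stay in "keep".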

Development of Regression Models Resolving High-Dimensional Data and Multicollinearity Problem for Heavy Rain Damage Data (호우피해자료에서의 고차원 자료 및 다중공선성 문제를 해소한 회귀모형 개발)

  • Kim, Jeonghwan;Park, Jihyun;Choi, Changhyun;Kim, Hung Soo
    • KSCE Journal of Civil and Environmental Engineering Research / v.38 no.6 / pp.801-808 / 2018
  • Fitting a linear regression model is stable under the assumption that the sample size is sufficiently larger than the number of explanatory variables and that there is no serious multicollinearity among the explanatory variables. In this study, we investigated the difficulty of model fitting when this assumption is violated by analyzing real heavy rain damage data, and we proposed using a principal component regression model or a ridge regression model after integrating the data to overcome the difficulty. We evaluated the predictive performance of the proposed models using test data independent of the training data, and confirmed that the proposed methods showed better predictive performance than the linear regression model.
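
To illustrate why ridge regression is a common remedy in the setting this abstract describes, the sketch below fits ordinary least squares and ridge regression to synthetic, strongly collinear data. Everything here is an assumption for illustration: the data are random numbers (not the heavy rain damage data), the penalty `lam=1.0` is arbitrary, no intercept or standardization is used, and `ols_fit` and `ridge_fit` are hypothetical helper names.

```python
import numpy as np

def ols_fit(X, y):
    """Ordinary least squares via numpy's least-squares solver."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: (X'X + lam * I)^(-1) X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Synthetic data: few samples, many nearly identical columns.
rng = np.random.default_rng(1)
n, p = 30, 10
base = rng.normal(size=(n, 1))
X = base + 0.05 * rng.normal(size=(n, p))   # strongly collinear columns
y = X.sum(axis=1) + rng.normal(size=n)

beta_ols = ols_fit(X, y)
beta_ridge = ridge_fit(X, y, lam=1.0)

# The ridge penalty shrinks the coefficient vector relative to OLS.
print("OLS coefficient norm:  ", np.linalg.norm(beta_ols))
print("ridge coefficient norm:", np.linalg.norm(beta_ridge))
```

The sketch only shows the shrinkage effect on the coefficients; whether that improves prediction on a given dataset has to be checked on held-out data, as the abstract's authors did.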

Analysis of AI interview data using unified non-crossing multiple quantile regression tree model (통합 비교차 다중 분위수회귀나무 모형을 활용한 AI 면접체계 자료 분석)

  • Kim, Jaeoh;Bang, Sungwan
    • The Korean Journal of Applied Statistics / v.33 no.6 / pp.753-762 / 2020
  • With increasing interest in integrating artificial intelligence (AI) into interview processes, the Republic of Korea (ROK) Army is trying to lead and analyze an AI-powered interview platform. This study analyzes the AI interview data using a unified non-crossing multiple quantile regression tree (UNQRT) model. Compared to the UNQRT, existing models such as quantile regression and the quantile regression tree (QRT) model are inadequate for the analysis of AI interview data. Specifically, the linearity assumption of quantile regression is overly strong for this application. While the QRT model seems applicable because it relaxes the linearity assumption, it suffers from crossing among the estimated quantile functions, which leads to an uninterpretable model. The UNQRT circumvents the crossing problem by simultaneously estimating multiple quantile functions under a non-crossing constraint and is robust at extreme quantiles. Furthermore, the single tree constructed by the UNQRT leads to a more interpretable model than the QRT model. In this study, using the UNQRT, we explored the relationship between the results of the Army AI interview system and the existing personnel data to derive meaningful results.

Development of Accident Forecasting Models in Freeway Tunnels using Multiple Linear Regression Analysis (다중선형 회귀분석을 이용한 고속도로 터널구간의 교통사고 예측모형 개발)

  • Park, Ju-Hwan;Kim, Sang-Gu
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.11 no.6 / pp.145-154 / 2012
  • This paper analyzed the characteristics of traffic accidents in all tunnels on freeways nationwide and selected various independent variables related to accident occurrence in tunnels. The study aims to develop reliable accident forecasting models using several dependent variables, such as the number of accidents (no.), no./km, and no./MVK. The validity of the developed models was verified through statistics such as $R^2$, F values, multicollinearity, and residual analysis. Accident forecasting models were selected considering the characteristics of tunnel accidents, and two multiple linear regression models were finally proposed according to two groups of tunnel length. In the selected models, the natural logarithm ln(no./MVK) is used as the dependent variable, and AADT, vertical slope, and tunnel height are used as the independent variables. The reliability of the two models was demonstrated by comparing field data with estimated data using RMSE and MAE. These models may be effective not only for evaluating tunnel safety in the design and planning phases but also for reducing traffic accidents in tunnels and managing tunnel traffic flow.

Application of multiple linear regression and artificial neural network models to forecast long-term precipitation in the Geum River basin (다중회귀모형과 인공신경망모형을 이용한 금강권역 강수량 장기예측)

  • Kim, Chul-Gyum;Lee, Jeongwoo;Lee, Jeong Eun;Kim, Hyeonjun
    • Journal of Korea Water Resources Association / v.55 no.10 / pp.723-736 / 2022
  • In this study, monthly precipitation forecasting models that can predict up to 12 months in advance were constructed for the Geum River basin, and two statistical techniques, multiple linear regression (MLR) and artificial neural network (ANN), were applied to the model construction. As predictor candidates, a total of 47 climate indices were used, including 39 global climate patterns provided by the National Oceanic and Atmospheric Administration (NOAA) and 8 meteorological factors for the basin. Forecast models were constructed using climate indices with high correlation, identified by analyzing the teleconnection between the monthly precipitation and each climate index for the past 40 years based on the forecast month. In the goodness-of-fit test results for the average value of forecasts of each month for 1991 to 2021, the MLR models showed -3.3 to -0.1% for the percent bias (PBIAS), 0.45 to 0.50 for the Nash-Sutcliffe efficiency (NSE), and 0.69 to 0.70 for the Pearson correlation coefficient (r), whereas the ANN models showed -5.0 to +0.5% for PBIAS, 0.35 to 0.47 for NSE, and 0.64 to 0.70 for r. The mean values predicted by the MLR models were found to be closer to the observations than those of the ANN models. The probability of including observations within the forecast range for each month was 57.5 to 83.6% (average 72.9%) for the MLR models, and 71.5 to 88.7% (average 81.1%) for the ANN models, indicating that the ANN models showed better results. The tercile probability by month was 25.9 to 41.9% (average 34.6%) for the MLR models, and 30.3 to 39.1% (average 34.7%) for the ANN models. Both models showed long-term predictability of monthly precipitation with an average of 33.3% or more in tercile probability. In conclusion, the difference in predictability between the two models was found to be relatively small. However, when judging from the hit rate for the prediction range or the tercile probability, the monthly deviation in predictability was found to be relatively small for the ANN models.
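
The three goodness-of-fit measures named in this abstract can be computed as in the sketch below. The abstract does not state its formulas, so the definitions here are assumptions: PBIAS is taken as 100 * sum(sim - obs) / sum(obs) (some references use the opposite sign), NSE as one minus the ratio of squared error to the variance of the observations about their mean, and r as the usual Pearson coefficient. The sample values are made up for illustration and are not from the study.

```python
import math

def pbias(obs, sim):
    """Percent bias; negative means the forecasts are lower than observed (assumed sign)."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def pearson_r(obs, sim):
    """Pearson correlation coefficient between observations and forecasts."""
    mo = sum(obs) / len(obs)
    ms = sum(sim) / len(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)

# Made-up monthly precipitation values (mm), for illustration only.
obs = [10.0, 20.0, 30.0, 40.0]
sim = [9.0, 18.0, 33.0, 36.0]

print(f"PBIAS = {pbias(obs, sim):.1f}%")
print(f"NSE   = {nse(obs, sim):.2f}")
print(f"r     = {pearson_r(obs, sim):.3f}")
```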

Forecasting Technique of Downstream Water Level using the Observed Water Level of Upper Stream (수계 상류 관측 수위자료를 이용한 하류 홍수위 예측기법)

  • Kim, Sang Mun;Choi, Byungwoong;Lee, Namjoo
    • Ecology and Resilient Infrastructure / v.7 no.4 / pp.345-352 / 2020
  • Securing the lead time for evacuation is crucial to minimize flood damage. In this study, downstream water levels during heavy rainfall were predicted using observed water level data. Multiple regression analysis and artificial neural networks were applied to the Seom River experimental watershed to predict the water level. Water level observation data for the Seom River experimental watershed from 2002 to 2010 were used to perform the multiple regression analysis and to train the artificial neural networks. The water level was predicted using the trained model. In the simulation results, the coefficients of determination of the artificial neural network water level predictions ranged from 0.991 to 0.999, while those of the multiple regression analysis ranged from 0.945 to 0.990. The water level prediction model developed using an artificial neural network performed better than the multiple regression analysis model. This technique for forecasting downstream water levels is expected to contribute to flood warning systems that secure the lead time for streams.

Predicting a Queue Length Using a Deep Learning Model at Signalized Intersections (딥러닝 모형을 이용한 신호교차로 대기행렬길이 예측)

  • Na, Da-Hyuk;Lee, Sang-Soo;Cho, Keun-Min;Kim, Ho-Yeon
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.20 no.6 / pp.26-36 / 2021
  • In this study, a deep learning model for predicting the queue length was developed using information collected from an image detector. A multiple regression analysis model, a statistical technique, was then derived, and the two models were compared using two indices, mean absolute error (MAE) and root mean square error (RMSE). From the results of the multiple regression analysis, time, day of the week, occupancy, and bus traffic were found to be statistically significant variables. Among these variables, occupancy had the strongest impact on the queue length. For the optimal deep learning model, 4 hidden layers and a lookback of 6 were determined, and MAE and RMSE were 6.34 and 8.99. As a result of evaluating the two models, the MAE of the multiple regression model and the deep learning model were 13.65 and 6.44, respectively, and the RMSE were 19.10 and 9.11, respectively. The deep learning model reduced the MAE by 52.8% and the RMSE by 52.3% compared to the multiple regression model.
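
The reduction percentages quoted in this abstract follow directly from the reported error values. The short check below recomputes them; the helper name `percent_reduction` is hypothetical, and the four numbers are the MAE and RMSE values stated in the abstract.

```python
def percent_reduction(baseline, improved):
    """Relative reduction of an error measure, in percent of the baseline."""
    return (baseline - improved) / baseline * 100.0

# Error values as reported in the abstract (multiple regression vs. deep learning).
mae_reduction = percent_reduction(13.65, 6.44)
rmse_reduction = percent_reduction(19.10, 9.11)

print(f"MAE reduction:  {mae_reduction:.1f}%")   # 52.8%
print(f"RMSE reduction: {rmse_reduction:.1f}%")  # 52.3%
```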

Predicting Financial Success of a Movie Using Multiple Regression Analysis (다중회귀 분석을 이용한 영화 흥행 예측)

  • Jeong, Hoe-Yun;Yang, Hyung-Jeong
    • Proceedings of the Korean Society of Computer Information Conference / 2013.07a / pp.275-278 / 2013
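  • Identifying the factors behind a movie's box-office success and predicting whether a movie will succeed are very important for a movie's profitability. As the movie market has grown compared with the past, various studies on predicting box-office success have been carried out. In this paper, box-office factors are collected, and those that satisfy the significance level are selected through multiple regression analysis. These factors are then used as inputs to prediction methods to predict box-office success. To compare performance, the method proposed in this paper is compared with existing box-office prediction methods (multiple regression, decision tree, artificial neural network) in terms of accuracy and root mean square error. As a result, the accuracy of the prediction method that considers only the significant factors identified by multiple regression analysis improved by an average of 8.2% over the prediction method that considers all factors, and it shows higher predictive performance than the box-office prediction methods developed to date.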

Comparison of Different Multiple Linear Regression Models for Real-time Flood Stage Forecasting (실시간 수위 예측을 위한 다중선형회귀 모형의 비교)

  • Choi, Seung Yong;Han, Kun Yeun;Kim, Byung Hyun
    • KSCE Journal of Civil and Environmental Engineering Research / v.32 no.1B / pp.9-20 / 2012
  • Recently, to overcome the limitations of conceptual, hydrological, and physics-based models for flood stage forecasting, multiple linear regression models, as one type of data-driven model, have been widely adopted for forecasting flood streamflow (stage). The objectives of this study are to compare the performance of different multiple linear regression models according to the regression coefficient estimation method and to determine the most effective multiple linear regression flood stage forecasting model. To do this, the time scale was determined through autocorrelation analysis of the input data, and different flood stage forecasting models developed using the regression coefficient estimation methods LS (least squares), WLS (weighted least squares), and SPW (stepwise) were applied to flood events in Jungrang stream. To evaluate the performance of the established models, four statistical indices were used, namely root mean square error (RMSE), Nash-Sutcliffe efficiency coefficient (NSEC), mean absolute error (MAE), and adjusted coefficient of determination ($R^{*2}$). The results show that the flood stage forecasting model using SPW (stepwise) parameter estimation predicts the river flood stage better than the others, and that the model using LS (least squares) parameter estimation is slightly better than the model using WLS (weighted least squares) parameter estimation.
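
The difference between the LS and WLS estimators compared in this abstract can be sketched as follows. This is a generic illustration, not the paper's models: the data are made up, the weights are hypothetical (the abstract does not say how weights were chosen), and `ls_fit` and `wls_fit` are illustrative helper names. WLS is implemented by scaling each row by the square root of its weight and then solving an ordinary least-squares problem.

```python
import numpy as np

def ls_fit(X, y):
    """Ordinary least squares estimate."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def wls_fit(X, y, w):
    """Weighted least squares: minimize sum_i w_i * (y_i - x_i . beta)^2."""
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

# Made-up example: downstream stage as a linear function of a lagged upstream stage,
# with one disturbed observation.
x = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
y = 0.5 + 0.9 * x
y[2] += 1.0                                   # disturbed observation
X = np.column_stack([np.ones_like(x), x])     # intercept column plus predictor

beta_ls = ls_fit(X, y)
beta_unit = wls_fit(X, y, np.ones(5))                           # unit weights reproduce LS
beta_wls = wls_fit(X, y, np.array([1.0, 1.0, 0.0, 1.0, 1.0]))   # zero weight on the disturbed point

print("LS: ", beta_ls)
print("WLS:", beta_wls)
```

With unit weights the two estimators coincide; giving the disturbed observation zero weight recovers the intercept 0.5 and slope 0.9 used to generate the other points.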