• Title/Abstract/Keywords: Multiple regression model

Search results: 2,523 items (processing time: 0.036 seconds)

Prediction of New Confirmed Cases of COVID-19 based on Multiple Linear Regression and Random Forest

  • 김준수;최병재
    • 대한임베디드공학회논문지 / Vol. 17, No. 4 / pp. 249-255 / 2022
  • The COVID-19 virus appeared in 2019 and is extremely contagious; because it is so infectious, it has had a huge impact on people's mobility. In this paper, multiple linear regression and random forest models are used to predict the number of COVID-19 cases from COVID-19 infection status data (open data provided by the Ministry of Health and Welfare) and Google Mobility Data, which tracks mobility across various categories. The data are divided into two sets. The first dataset combines the COVID-19 infection status data with all six variables of Google Mobility Data. The second dataset combines the COVID-19 infection status data with only two variables of Google Mobility Data: (1) retail stores and leisure facilities and (2) grocery stores and pharmacies. The models' performance is compared using the mean absolute error. We also perform a correlation analysis of the random forest model and the multiple linear regression model.
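The comparison described in this abstract (same model, two feature sets, scored by mean absolute error) can be sketched as follows. This is a minimal illustration on synthetic data, not the paper's data or code: the feature matrix, coefficients, train/test split, and helper names (`fit_ols`, `predict`, `mean_absolute_error`) are all assumptions, and only the linear regression half is shown (the random forest is omitted).

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ intercept + X."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply fitted coefficients (intercept first) to new rows."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

# Synthetic stand-in for six mobility-like variables (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 50 + 8 * X[:, 0] - 5 * X[:, 1] + rng.normal(scale=2.0, size=200)

train, test = slice(0, 150), slice(150, 200)
feature_sets = {"all_six": [0, 1, 2, 3, 4, 5], "two_only": [0, 1]}

mae = {}
for name, cols in feature_sets.items():
    coef = fit_ols(X[train][:, cols], y[train])
    mae[name] = mean_absolute_error(y[test], predict(coef, X[test][:, cols]))

print(mae)  # one MAE per feature set, computed on the held-out rows
```

Scoring both feature sets on the same held-out rows keeps the comparison fair; a time-ordered split like this one is usually preferable to a random split for case-count series.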

A Study on the Derivation of the Unit Hydrograph using Multiple Regression Model

  • 이종남;김채원;황창현
    • 물과 미래 / Vol. 25, No. 1 / pp. 93-100 / 1992
  • The purpose of this study is to derive an optimal unit hydrograph using the multiple regression model, particularly when only a small amount of data is available. The presence of multicollinearity among the input data can cause serious oscillations in the derived unit hydrograph. In this case, the oscillations in the unit hydrograph ordinates are eliminated by combining the data. The data used in this study are based on the collection and arrangement of rainfall-runoff data (1977-1989) at the Soyang River Dam site. With the matrix X as the rainfall series, the condition number and the reciprocal of the minimum eigenvalue of $X^TX$ are calculated by the Jacobi method and compared with the oscillation in the unit hydrograph. The optimal unit hydrograph is derived by combining the numerous rainfall-runoff data. The conclusions are as follows: 1) The oscillations in the derived unit hydrograph are reduced by combining the data from each flood event. 2) The reciprocal of the minimum eigenvalue of $X^TX$, 1/k, and the condition number CN increase when the oscillations are active in the derived unit hydrograph. 3) The parameter estimates are validated by extending the model to the Soyang River Dam site with elimination of the autocorrelation in the disturbances. Finally, this paper illustrates the application of the multiple regression model to derive an optimal unit hydrograph while dealing with the multicollinearity and autocorrelation that cause problems.
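The two collinearity diagnostics named in this abstract can be sketched as below. This is an illustrative reconstruction under stated assumptions: the rainfall values are made up, the helper names are hypothetical, the condition number is taken as the ratio of the largest to the smallest eigenvalue of $X^TX$ (one common definition; the paper's exact definition is not given here), and NumPy's symmetric eigensolver stands in for the Jacobi method.

```python
import numpy as np

def rainfall_matrix(rain, n_ordinates):
    """Convolution matrix X such that runoff is approximately X @ unit_hydrograph."""
    rain = np.asarray(rain, dtype=float)
    X = np.zeros((len(rain) + n_ordinates - 1, n_ordinates))
    for j in range(n_ordinates):
        X[j:j + len(rain), j] = rain  # each column is the rainfall shifted by j steps
    return X

def collinearity_diagnostics(X):
    """Return (1 / smallest eigenvalue, largest / smallest eigenvalue) of X^T X."""
    eig = np.linalg.eigvalsh(X.T @ X)  # eigenvalues in ascending order
    return 1.0 / eig[0], eig[-1] / eig[0]

# Two hypothetical flood events, each with four rainfall intervals.
X1 = rainfall_matrix([5.0, 5.1, 5.2, 5.0], n_ordinates=4)
X2 = rainfall_matrix([10.0, 1.0, 0.5, 6.0], n_ordinates=4)
X_combined = np.vstack([X1, X2])  # "combining the data" = stacking the events

inv_min_1, cn_1 = collinearity_diagnostics(X1)
inv_min_2, cn_2 = collinearity_diagnostics(X2)
inv_min_c, cn_c = collinearity_diagnostics(X_combined)
print(inv_min_1, inv_min_2, inv_min_c)
```

Stacking events adds their $X^TX$ matrices, so the smallest eigenvalue of the combined matrix is never smaller than that of either event alone; the reciprocal therefore cannot grow, which is consistent with conclusion 1).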

DC Motor Control using Regression Equation and PID Controller

  • 서기영;이수흠;문상필;이내일;최종수
    • 융합신호처리학회 학술대회논문집 / 한국신호처리시스템학회 2000년도 하계종합학술대회논문집 / pp. 129-132 / 2000
  • We propose a new method for optimized auto-tuning of the PID controller, which is used for process control in various fields. First, initial values for the DC motor are determined by the Ziegler-Nichols method. Then, after the PID controller parameters are learned by multiple regression analysis on the input vector, new K, L, and T values are given to the multiple regression model, and the optimized PID controller parameters are found by the multiple regression analysis program.
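For orientation, the classic open-loop Ziegler-Nichols rules can be written out as a small function. This sketch assumes that K, L, and T in the abstract denote the process gain, dead time, and time constant of a first-order-plus-dead-time model, which the abstract does not state explicitly; the function name and the example values are hypothetical, and the regression step of the paper is not reproduced.

```python
def ziegler_nichols_pid(K, L, T):
    """Classic open-loop Ziegler-Nichols PID rules.

    K: process gain, L: dead time, T: time constant (assumed meanings).
    Returns proportional, integral, and derivative gains.
    """
    if K <= 0 or L <= 0 or T <= 0:
        raise ValueError("K, L, and T must all be positive")
    Kp = 1.2 * T / (K * L)  # proportional gain
    Ti = 2.0 * L            # integral time
    Td = 0.5 * L            # derivative time
    return {"Kp": Kp, "Ki": Kp / Ti, "Kd": Kp * Td}

# Hypothetical plant parameters, chosen only for illustration.
gains = ziegler_nichols_pid(K=2.0, L=0.5, T=4.0)
print(gains)
```

These rules usually serve as a starting point; the approach in the abstract then refines the parameters with a regression model fitted on K, L, and T.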

Study of estimated model of drift through real ship

  • 이창헌;김광일;유상록;김민선;한승훈
    • 수산해양기술연구 / Vol. 60, No. 1 / pp. 57-70 / 2024
  • To present a predictive drift model, Jeju National University's training ship was tested for about 11 hours and 40 minutes, and 81 samples, selected from the full set at ten-minute intervals, were subjected to regression analysis after checking for outliers and influential points. In the outlier and influential point analysis, although the DFBETAS (difference in betas) value for wind direction exceeded 1 in part, the CV (cumulative variable) value was 6%, close to 1. Therefore, it was judged that there would be no problem in conducting multiple regression analysis on the samples. The standardized regression coefficients showed how much current and wind affect the dependent variables: current speed and direction were the most important variables for drift speed and direction, with values of 47.1% and 58.1%, respectively. The analysis showed that the model fit was statistically significant at the 0.05 level. The multiple correlation coefficients, indicating the degree of influence on the dependent variables, were 83.2% and 89.0%, respectively. The coefficients of determination were 69.3% and 79.3%, and the adjusted coefficients of determination were 67.6% and 78.3%, respectively. Because this study identifies outliers and influential points in the sample data before the multiple regression analysis, it presents a more quantitative prediction model. Therefore, many studies combining these steps are expected in the future.
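The relationship among the three fit statistics reported here can be checked with a few lines. The sample size n = 81 comes from the abstract; the number of predictors p = 4 is an assumption made only for illustration (the abstract does not give it), and the function names are hypothetical.

```python
import math

def multiple_correlation(r2):
    """Multiple correlation coefficient R is the square root of R^2."""
    return math.sqrt(r2)

def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n samples and p predictors (intercept not counted)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

n, p = 81, 4  # n from the abstract; p = 4 is an assumed value
for r2 in (0.693, 0.793):
    print(r2, round(multiple_correlation(r2), 3), round(adjusted_r2(r2, n, p), 3))
```

Under that assumed p, the computed values land within rounding distance of the reported multiple correlation coefficients (83.2%, 89.0%) and adjusted coefficients of determination (67.6%, 78.3%).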

Use of big data for estimation of impacts of meteorological variables on environmental radiation dose on Ulleung Island, Republic of Korea

  • Joo, Han Young;Kim, Jae Wook;Jeong, So Yun;Kim, Young Seo;Moon, Joo Hyun
    • Nuclear Engineering and Technology / Vol. 53, No. 12 / pp. 4189-4200 / 2021
  • In this study, the relationship between the environmental radiation dose rate and meteorological variables was investigated with multiple regression analysis and big data of those variables. The environmental radiation dose rate and 36 different meteorological variables were measured on Ulleung Island, Republic of Korea, from 2011 to 2015. Not all meteorological variables were used in the regression analysis because the different meteorological variables significantly affect the environmental radiation dose rate during different periods, and the degree of influence changes with time. By applying the Pearson correlation analysis and stepwise selection methods to the big dataset, the major meteorological variables influencing the environmental radiation dose rate were identified, which were then used as the independent variables for the regression model. Subsequently, multiple regression models for the monthly datasets and dataset of the entire period were developed.
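A rough sketch of the variable-screening idea in this abstract (Pearson correlation followed by stepwise selection) is given below. It is a simplification under stated assumptions: the data are synthetic, the helper names are hypothetical, and forward selection with a minimum gain in $R^2$ stands in for the significance-test entry and removal rules that stepwise procedures normally use.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(1.0 - resid @ resid / np.sum((y - y.mean()) ** 2))

def forward_select(X, y, min_gain=0.01):
    """Add the variable that raises R^2 most until the gain drops below min_gain."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = {j: r_squared(X[:, selected + [j]], y) for j in remaining}
        j_best = max(scores, key=scores.get)
        if scores[j_best] - best < min_gain:
            break
        selected.append(j_best)
        remaining.remove(j_best)
        best = scores[j_best]
    return selected

# Synthetic data: 8 candidate variables, only columns 2 and 5 drive y.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
y = 3.0 * X[:, 2] - 2.0 * X[:, 5] + rng.normal(scale=0.5, size=300)

pearson = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
selected = forward_select(X, y)
print(np.round(pearson, 2), selected)
```

Running the same selection separately on each monthly subset would mirror the abstract's point that the influential variables change from period to period.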

Multivariate Analysis for Clinicians

  • 오주한;정석원
    • Clinics in Shoulder and Elbow / Vol. 16, No. 1 / pp. 63-72 / 2013
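  • The representative multivariate analysis method used in clinical research is multiple regression analysis, which analyzes the simultaneous influence of several variables on the basis of a causal relationship. Multiple regression analysis must satisfy the basic assumptions of regression analysis and, because it includes several independent variables, also requires attention to how variables are entered into the model and to multicollinearity. The explanatory power of a multiple regression model is expressed by the coefficient of determination $R^2$, with values closer to 1 indicating greater explanatory power, and the influence of each independent variable on the outcome is expressed by the regression coefficient ${\beta}$. Depending on the form of the dependent variable, multiple regression analysis can be divided into multiple linear regression, multiple logistic regression, and Cox regression. Multiple linear regression should be used when the dependent variable is continuous, multiple logistic regression when it is categorical, and Cox regression when it is a status variable that takes the effect of time into account; the influence on the outcome is evaluated by the regression coefficient ${\beta}$, the odds ratio, and the hazard ratio, respectively. An understanding of these multivariate analyses is an essential competency for clinicians who plan studies and analyze results, because it makes research more efficient.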
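The mapping from dependent-variable type to model and effect measure can be summarized in code. This is only an illustrative sketch: the dictionary keys, function name, and example coefficient are hypothetical. The one formula used is standard: in logistic and Cox regression, exponentiating a coefficient gives the odds ratio or hazard ratio per one-unit increase in the predictor.

```python
import math

# Dependent-variable type -> (model, effect measure); labels are illustrative.
MODEL_BY_OUTCOME = {
    "continuous": ("multiple linear regression", "regression coefficient beta"),
    "categorical": ("multiple logistic regression", "odds ratio"),
    "time_to_event": ("Cox regression", "hazard ratio"),
}

def ratio_from_coefficient(beta):
    """Odds ratio (logistic) or hazard ratio (Cox) for a one-unit increase."""
    return math.exp(beta)

model, measure = MODEL_BY_OUTCOME["categorical"]
beta = math.log(2.0)  # hypothetical fitted coefficient
print(model, measure, ratio_from_coefficient(beta))
```

A coefficient of zero corresponds to a ratio of 1, meaning no association; positive coefficients give ratios above 1 and negative coefficients give ratios below 1.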

A Technique to Improve the Fit of Linear Regression Models for Successive Sets of Data

  • Park, Sung H.
    • Journal of the Korean Statistical Society / Vol. 5, No. 1 / pp. 19-28 / 1976
  • In empirical studies that fit a multiple linear regression model to successive cross-sectional data observed on the same set of independent variables over several time periods, one often faces the problem of a poor $R^2$, the multiple coefficient of determination, which provides a standard measure of how well a specified regression line fits the sample data.

Accident Models of Circular Intersections in Korea

  • 이승주;박민규;박병호
    • 한국안전학회지 / Vol. 29, No. 1 / pp. 54-58 / 2014
  • This study deals with accidents at circular intersections in Korea. The goal is to develop accident models for 94 circular intersections. To that end, this study gives particular attention to collecting data on geometric structure and accidents, and to comparatively analyzing Poisson regression, negative binomial (NB) regression, and multiple regression models using SPSS 17.0 and LIMDEP 3.0. The main results are as follows. First, the negative binomial model was found to be the most appropriate among the models. Second, three independent variables were adopted in the model, and these variables were found to have a positive relation to the accident rate. Finally, reducing the width of the circulatory roadway, removing parking lots within the circulatory roadway, and providing an appropriate number of approach lanes are required to improve the safety of circular intersections.
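One common reason a negative binomial model is preferred over a Poisson model for accident counts is overdispersion, that is, a variance larger than the mean. The abstract does not say which criterion the authors used, so the check below is only a general illustration on synthetic counts; the function name and the distribution parameters are assumptions.

```python
import numpy as np

def dispersion_ratio(counts):
    """Sample variance divided by sample mean; about 1 for Poisson-like counts."""
    counts = np.asarray(counts, dtype=float)
    return float(counts.var(ddof=1) / counts.mean())

# Synthetic counts with the same mean (3) but different spread.
rng = np.random.default_rng(1)
poisson_like = rng.poisson(lam=3.0, size=5000)
overdispersed = rng.negative_binomial(n=2, p=0.4, size=5000)

print(dispersion_ratio(poisson_like), dispersion_ratio(overdispersed))
```

A ratio well above 1 suggests that the Poisson assumption of equal mean and variance is violated and that a negative binomial specification may fit better.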

Semiparametric Bayesian Regression Model for Multiple Event Time Data

  • Kim, Yongdai
    • Journal of the Korean Statistical Society / Vol. 31, No. 4 / pp. 509-518 / 2002
  • This paper is concerned with semiparametric Bayesian analysis of the proportional intensity regression model of the Poisson process for multiple event time data. A nonparametric prior distribution is put on the baseline cumulative intensity function, and a usual parametric prior distribution is given to the regression parameter. We also allow heterogeneity among the intensity processes of different subjects by using unobserved random frailty components. A Gibbs sampling approach with the Metropolis-Hastings algorithm is used to explore the posterior distributions. Finally, the results are applied to a real data set.

Development of Neural Network Model for Prediction of Daily Maximum Ozone Concentration in Summer

  • 김용국;이종범
    • 한국대기환경학회지 / Vol. 10, No. 4 / pp. 224-232 / 1994
  • A new neural network model has been developed to predict short-term air pollution concentration. In addition, a multiple regression model widely used in statistical analysis was tested. These models were applied to predict the daily maximum ozone concentration in Seoul during the summer season of 1991. Data from May to September of 1989 and 1990 were used to train the learning patterns of the neural network model and to estimate the multiple regression model. To evaluate the results of the different models, several performance indices were used. The results indicated that the multiple regression model tended to underpredict the daily maximum ozone concentration, with a small r$^{2}$ (0.38). Large errors were also found in this model: 21.1 ppb for RMSE, 0.324 for NMSE, and -0.164 for MRE. On the other hand, the results obtained from the neural network model were very promising, suggesting that this model is highly effective for adaptive control of nonlinear multivariable systems such as photochemical oxidants. In addition, prediction accuracy increased when recent new information was added to the neural network model. For the new model, the values of RMSE, NMSE, MRE, and r$^{2}$ were 13.2 ppb, 0.089, 0.003, and 0.55, respectively.
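The performance indices named in this abstract can be computed as in the sketch below. The definitions are assumptions, since the abstract does not give them and conventions vary: NMSE is taken as the mean squared error divided by the product of the observed and predicted means, and MRE as the mean of (predicted - observed) / observed, so that a negative MRE indicates underprediction. The function name and sample values are hypothetical.

```python
import numpy as np

def evaluate(observed, predicted):
    """RMSE, NMSE, MRE, and squared Pearson correlation under assumed definitions."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    err = pred - obs
    mse = np.mean(err ** 2)
    return {
        "RMSE": float(np.sqrt(mse)),
        "NMSE": float(mse / (obs.mean() * pred.mean())),
        "MRE": float(np.mean(err / obs)),
        "r2": float(np.corrcoef(obs, pred)[0, 1] ** 2),
    }

# Hypothetical ozone values in ppb; the predictions run 10% below the observations.
scores = evaluate([40.0, 60.0, 80.0, 100.0], [36.0, 54.0, 72.0, 90.0])
print(scores)
```

The example also shows why r$^{2}$ alone is not enough: a model that is biased low by a constant fraction still has perfect correlation with the observations, while RMSE and MRE expose the bias.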