• Title/Summary/Keyword: Multiple regression model


Prediction of New Confirmed Cases of COVID-19 based on Multiple Linear Regression and Random Forest (다중 선형 회귀와 랜덤 포레스트 기반의 코로나19 신규 확진자 예측)

  • Kim, Jun Su;Choi, Byung-Jae
    • IEMEK Journal of Embedded Systems and Applications / v.17 no.4 / pp.249-255 / 2022
  • The COVID-19 virus appeared in 2019 and is extremely contagious; because it is so infectious, it has a huge impact on people's mobility. In this paper, multiple linear regression and random forest models are used to predict the number of COVID-19 cases using COVID-19 infection status data (open data provided by the Ministry of Health and Welfare) and Google Mobility Data, which shows mobility across various categories of places. The data are divided into two sets. The first dataset combines the COVID-19 infection status data with all six variables of Google Mobility Data. The second dataset combines the COVID-19 infection status data with only two variables of Google Mobility Data: (1) retail stores and leisure facilities and (2) grocery stores and pharmacies. The models' performance is compared using the mean absolute error. We also perform a correlation analysis of the random forest model and the multiple linear regression model.
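The abstract above compares its two models by mean absolute error (MAE). A minimal sketch of that comparison follows; the case counts and both sets of predictions are hypothetical values made up for illustration, not data or results from the paper.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all observations."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical daily new-case counts and predictions (illustrative only).
actual = [100, 120, 90, 110]
pred_linear = [95, 130, 100, 105]   # stand-in for a linear regression model
pred_forest = [102, 118, 93, 111]   # stand-in for a random forest model

mae_linear = mean_absolute_error(actual, pred_linear)
mae_forest = mean_absolute_error(actual, pred_forest)

# The model with the lower MAE is, on average, closer to the observed counts.
print(f"linear MAE = {mae_linear}, forest MAE = {mae_forest}")
```

MAE is expressed in the same unit as the target (cases per day), which makes it easy to read when comparing models fitted to the same data.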

A Study on the Derivation of the Unit Hydrograph using Multiple Regression Model (다중회귀모형으로 추정된 모수에 의한 최적단위유량도의 유도에 관한 연구)

  • 이종남;김채원;황창현
    • Water for future / v.25 no.1 / pp.93-100 / 1992
  • The purpose of this study is to derive an optimal unit hydrograph using the multiple regression model, particularly when only a small amount of data is available. The presence of multicollinearity among the input data can cause serious oscillations in the derived unit hydrograph. In this case, the oscillations in the unit hydrograph ordinates are eliminated by combining the data. The data used in this study are based on the collection and arrangement of rainfall-runoff data (1977-1989) at the Soyang River Dam site. With the matrix $X$ as the rainfall series, the condition number and the reciprocal of the minimum eigenvalue of $X^TX$ are calculated by the Jacobi method and compared with the oscillation in the unit hydrograph. The optimal unit hydrograph is derived by combining the numerous rainfall-runoff data. The conclusions are as follows: 1) The oscillations in the derived unit hydrograph are reduced by combining the data from each flood event. 2) The reciprocal of the minimum eigenvalue of $X^TX$, 1/k, and the condition number CN increase when the oscillations are active in the derived unit hydrograph. 3) The parameter estimates are validated by extending the model to the Soyang River Dam site with elimination of the autocorrelation in the disturbances. Finally, this paper illustrates the application of the multiple regression model to derive an optimal unit hydrograph while dealing with the multicollinearity and autocorrelation that cause problems.

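The unit hydrograph abstract above uses the condition number and the reciprocal of the minimum eigenvalue of $X^TX$, computed by the Jacobi method, as multicollinearity diagnostics. The sketch below shows one way such diagnostics could be computed with cyclic Jacobi rotations. It is not the authors' program: the function names are hypothetical, and the condition number is assumed here to be the ratio of the largest to the smallest eigenvalue of $X^TX$, which may differ from the paper's definition.

```python
import math


def gram_matrix(X):
    """Return X^T X for a matrix given as a list of rows."""
    n_cols = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(n_cols)]
            for i in range(n_cols)]


def jacobi_eigenvalues(A, tol=1e-12, max_sweeps=100):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    A = [row[:] for row in A]  # work on a copy
    n = len(A)
    for _ in range(max_sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                # Angle that zeroes the (p, q) entry of J^T A J.
                theta = 0.5 * math.atan2(2.0 * A[p][q], A[q][q] - A[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q (A <- A J)
                    akp, akq = A[k][p], A[k][q]
                    A[k][p] = c * akp - s * akq
                    A[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q (A <- J^T A)
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k] = c * apk - s * aqk
                    A[q][k] = s * apk + c * aqk
    return sorted(A[i][i] for i in range(n))


def collinearity_diagnostics(X):
    """Return (condition number, 1 / minimum eigenvalue) of X^T X."""
    eigs = jacobi_eigenvalues(gram_matrix(X))
    return eigs[-1] / eigs[0], 1.0 / eigs[0]


# Hypothetical inputs: the second matrix has nearly collinear columns.
well_conditioned = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
nearly_collinear = [[1.0, 1.0], [2.0, 2.01], [3.0, 2.99]]
print(collinearity_diagnostics(well_conditioned))
print(collinearity_diagnostics(nearly_collinear))
```

Both diagnostics grow as the columns of $X$ approach linear dependence, which matches the abstract's observation that they increase when the derived unit hydrograph oscillates.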

DC Motor Control using Regression Equation and PID Controller (회귀방정식과 PID제어기에 의한 DC모터 제어)

  • 서기영;이수흠;문상필;이내일;최종수
    • Proceedings of the Korea Institute of Convergence Signal Processing / 2000.08a / pp.129-132 / 2000
  • We propose a new method for optimized auto-tuning of the PID controller, which is used for process control in various fields. First, the initial values for the DC motor are determined by the Ziegler-Nichols method. Then, after the parameters of the PID controller are learned from the input vectors by multiple regression analysis, new K, L, and T values are given to the multiple regression model, and the optimized parameters of the PID controller are found by the multiple regression analysis program.


Study of estimated model of drift through real ship (실선에 의한 표류 예측모델에 관한 연구)

  • Chang-Heon LEE;Kwang-Il KIM;Sang-Lok YOO;Min-Son KIM;Seung-Hun HAN
    • Journal of the Korean Society of Fisheries and Ocean Technology / v.60 no.1 / pp.57-70 / 2024
  • In order to present a predictive drift model, Jeju National University's training ship was tested for about 11 hours and 40 minutes, and 81 samples, selected from the full set of samples at ten-minute intervals, were subjected to regression analysis after checking for outliers and influence points. In the outlier and influence point analysis, although the DFBETAS (difference in betas) value for wind direction exceeds 1 in places, the CV (cumulative variable) value is 6%, close to 1. Therefore, it was judged that there would be no problem in conducting multiple regression analyses on the samples. The standardized regression coefficients showed how much current and wind affect the dependent variables. Current speed and direction were the most important variables for drift speed and direction, with values of 47.1% and 58.1%, respectively. The statistics indicated that the model fit was significant at the 0.05 level for the multiple regression analysis. The multiple correlation coefficients, indicating the degree of influence on the dependent variables, were 83.2% and 89.0%, respectively. The coefficients of determination were 69.3% and 79.3%, and the adjusted coefficients of determination were 67.6% and 78.3%, respectively. Because outliers and influence points in the sample data are identified before the multiple regression analysis, this study presents a more quantitative prediction model. Many further studies combining these methods are therefore expected.
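The drift-model abstract above reports multiple correlation coefficients, coefficients of determination, and adjusted coefficients of determination side by side. The sketch below shows how these three quantities relate under the standard textbook formulas. The sample size of 81 comes from the abstract; the number of predictors is not stated there, so the value `p = 4` used below is purely an assumption for illustration.

```python
import math


def multiple_correlation(r_squared):
    """Multiple correlation coefficient R as the square root of R^2."""
    return math.sqrt(r_squared)


def adjusted_r_squared(r_squared, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_samples - 1) / dof


n = 81  # sample size reported in the abstract
p = 4   # ASSUMPTION: number of predictors is not given in the abstract

for r2 in (0.693, 0.793):  # coefficients of determination from the abstract
    print(f"R^2 = {r2}: R = {multiple_correlation(r2):.3f}, "
          f"adjusted R^2 (p = {p}) = {adjusted_r_squared(r2, n, p):.3f}")
```

The adjustment penalizes each added predictor, so the adjusted value is always at or below $R^2$ and is the fairer figure when comparing models with different numbers of variables.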

Use of big data for estimation of impacts of meteorological variables on environmental radiation dose on Ulleung Island, Republic of Korea

  • Joo, Han Young;Kim, Jae Wook;Jeong, So Yun;Kim, Young Seo;Moon, Joo Hyun
    • Nuclear Engineering and Technology / v.53 no.12 / pp.4189-4200 / 2021
  • In this study, the relationship between the environmental radiation dose rate and meteorological variables was investigated with multiple regression analysis and big data of those variables. The environmental radiation dose rate and 36 different meteorological variables were measured on Ulleung Island, Republic of Korea, from 2011 to 2015. Not all meteorological variables were used in the regression analysis because the different meteorological variables significantly affect the environmental radiation dose rate during different periods, and the degree of influence changes with time. By applying the Pearson correlation analysis and stepwise selection methods to the big dataset, the major meteorological variables influencing the environmental radiation dose rate were identified, which were then used as the independent variables for the regression model. Subsequently, multiple regression models for the monthly datasets and dataset of the entire period were developed.

Multivariate Analysis for Clinicians (임상의를 위한 다변량 분석의 실제)

  • Oh, Joo Han;Chung, Seok Won
    • Clinics in Shoulder and Elbow / v.16 no.1 / pp.63-72 / 2013
  • In medical research, multivariate analysis, especially multiple regression analysis, is used to analyze the influence of multiple variables on the result. Multiple regression analysis requires attention to the selection of variables for the model and to the problem of multicollinearity, since many variables are involved, as well as to the basic assumptions of regression analysis. The fit of the multiple regression model is expressed by the coefficient of determination, $R^2$, and the influence of each independent variable on the result by a regression coefficient, $\beta$. Multiple regression analysis can be divided into multiple linear regression analysis, multiple logistic regression analysis, and Cox regression analysis according to the type of dependent variable (continuous variable, categorical variable (binary logit), and state variable, respectively), and the influence of variables on the result is evaluated by the regression coefficient $\beta$, odds ratio, and hazard ratio, respectively. Knowledge of multivariate analysis enables clinicians to analyze results accurately and to design further research efficiently.

A Technique to Improve the Fit of Linear Regression Models for Successive Sets of Data

  • Park, Sung H.
    • Journal of the Korean Statistical Society / v.5 no.1 / pp.19-28 / 1976
  • In empirical studies that fit a multiple linear regression model to successive cross-sectional data observed on the same set of independent variables over several time periods, one often faces the problem of a poor $R^2$, the multiple coefficient of determination, which provides a standard measure of how well a specified regression line fits the sample data.


Accident Models of Circular Intersections in Korea (국내 원형교차로 사고모형)

  • Lee, Seung Ju;Park, Min Kyu;Park, Byung Ho
    • Journal of the Korean Society of Safety / v.29 no.1 / pp.54-58 / 2014
  • This study deals with accidents at circular intersections in Korea. The goal is to develop accident models for 94 circular intersections. To that end, this study gives particular attention to collecting data on geometric structure and accidents, and to comparatively analyzing models such as Poisson regression, negative binomial (NB) regression, and multiple regression using SPSS 17.0 and LIMDEP 3.0. The main results are as follows. First, the negative binomial model was found to be the most appropriate among the models considered. Second, three independent variables were adopted in the model, and these variables were found to have a positive relation to the accident rate. Finally, a reduced width of the circulatory roadway, removal of parking lots within the circulatory roadway, and an appropriate number of approach lanes were required to improve the safety of circular intersections.

Semiparametric Bayesian Regression Model for Multiple Event Time Data

  • Kim, Yongdai
    • Journal of the Korean Statistical Society / v.31 no.4 / pp.509-518 / 2002
  • This paper is concerned with semiparametric Bayesian analysis of the proportional intensity regression model of the Poisson process for multiple event time data. A nonparametric prior distribution is put on the baseline cumulative intensity function, and a usual parametric prior distribution is given to the regression parameter. We also allow heterogeneity among the intensity processes of different subjects by using unobserved random frailty components. A Gibbs sampling approach with the Metropolis-Hastings algorithm is used to explore the posterior distributions. Finally, the results are applied to a real data set.

Development of Neural Network Model for Prediction of Daily Maximum Ozone Concentration in Summer (하계의 일 최고 오존농도 예측을 위한 신경망모델의 개발)

  • 김용국;이종범
    • Journal of Korean Society for Atmospheric Environment / v.10 no.4 / pp.224-232 / 1994
  • A new neural network model has been developed to predict short-term air pollution concentration. In addition, a multiple regression model widely used in statistical analysis was tested. These models were applied to the prediction of daily maximum ozone concentration in Seoul during the summer season of 1991. Data from May to September of 1989 and 1990 were used to train the learning patterns of the neural network model and to estimate the multiple regression model. To evaluate the results of the different models, several performance indices were used. The results indicated that the multiple regression model tended to underpredict the daily maximum ozone concentration, with a small $r^2$ (0.38). Large errors were also found in this model: 21.1 ppb for RMSE, 0.324 for NMSE, and -0.164 for MRE. On the other hand, the results obtained from the neural network model were very promising, which suggests that this model is highly effective for adaptive control of non-linear multivariable systems such as photochemical oxidants. Also, when recent new information was added to the neural network model, prediction accuracy increased. For the new model, the values of RMSE, NMSE, MRE, and $r^2$ were 13.2 ppb, 0.089, 0.003, and 0.55, respectively.

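The ozone abstract above evaluates its models with RMSE, NMSE, and MRE but does not define them. The sketch below uses definitions that are common in air-quality model evaluation; they are assumptions, not necessarily those of the paper, and the observed and predicted values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the unit of the data (e.g. ppb)."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def nmse(observed, predicted):
    """Normalized MSE (ASSUMED form): MSE / (mean(observed) * mean(predicted))."""
    n = len(observed)
    mse = sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n
    return mse / ((sum(observed) / n) * (sum(predicted) / n))


def mre(observed, predicted):
    """Mean relative error (ASSUMED form): mean of (predicted - observed) / observed."""
    n = len(observed)
    return sum((p - o) / o for o, p in zip(observed, predicted)) / n


# Hypothetical daily maximum ozone values in ppb (illustrative only).
observed = [50.0, 60.0, 80.0]
predicted = [45.0, 60.0, 70.0]

print(f"RMSE = {rmse(observed, predicted):.3f} ppb")
print(f"NMSE = {nmse(observed, predicted):.4f}")
print(f"MRE  = {mre(observed, predicted):.3f}")
```

Under these definitions a negative MRE indicates underprediction on average, which is consistent with the abstract pairing a negative MRE with a model that tended to underpredict.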