• Title/Abstract/Keywords: linear regression models

Search results: 937 items (processing time: 0.028 seconds)

Fuzzy Local Linear Regression Analysis

  • Hong, Dug-Hun;Kim, Jong-Tae
    • Journal of the Korean Data and Information Science Society / Vol. 18, No. 2 / pp.515-524 / 2007
  • This paper deals with local linear estimation of fuzzy regression models based on Diamond (1998) as a new class of non-linear fuzzy regression. The purpose of this paper is to introduce the use of smoothing in testing for lack of fit of parametric fuzzy regression models.

An Approach to Applying Multiple Linear Regression Models by Interlacing Data in Classifying Similar Software

  • Lim, Hyun-il
    • Journal of Information Processing Systems / Vol. 18, No. 2 / pp.268-281 / 2022
  • The development of information technology is bringing many changes to everyday life, and machine learning can be used as a technique to solve a wide range of real-world problems. Analysis and utilization of data are essential processes in applying machine learning to real-world problems. As a method of processing data in machine learning, we propose an approach based on applying multiple linear regression models by interlacing data to the task of classifying similar software. Linear regression is widely used in estimation problems to model the relationship between input and output data. In our approach, multiple linear regression models are generated by training on interlaced feature data. A combination of these multiple models is then used as the prediction model for classifying similar software. Experiments are performed to evaluate the proposed approach as compared to conventional linear regression, and the experimental results show that the proposed method classifies similar software more accurately than the conventional model. We anticipate the proposed approach to be applied to various kinds of classification problems to improve the accuracy of conventional linear regression.
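
The idea of combining several linear regression models trained on interlaced feature data can be sketched as follows. The abstract does not define the interlacing scheme, so this sketch assumes it means assigning every k-th feature to the same model; the function names, the averaging rule, the 0.5 threshold, and the toy data are all hypothetical and are not taken from the paper.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Least-squares coefficients (intercept first) via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def interlace_indices(n_features, n_models):
    """Assumed interlacing: model j gets features j, j + n_models, ..."""
    return [list(range(start, n_features, n_models)) for start in range(n_models)]


def fit_interlaced(rows, targets, n_models):
    """Fit one linear regression per interlaced feature subset."""
    models = []
    for idx in interlace_indices(len(rows[0]), n_models):
        subset = [[r[i] for i in idx] for r in rows]
        models.append((idx, fit_ols(subset, targets)))
    return models


def predict_label(models, row, threshold=0.5):
    """Average the per-model regression scores, then threshold to a 0/1 label."""
    scores = [
        coef[0] + sum(c * row[i] for c, i in zip(coef[1:], idx))
        for idx, coef in models
    ]
    return 1 if sum(scores) / len(scores) >= threshold else 0


# Made-up toy features for eight program pairs (1 = similar, 0 = not similar).
rows = [
    [0, 5, 0.3, 1], [0, 5, 0.7, 2], [0, 5, 0.1, 3], [0, 5, 0.9, 4],
    [1, 9, 0.4, 4], [1, 9, 0.8, 3], [1, 9, 0.2, 2], [1, 9, 0.6, 1],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
models = fit_interlaced(rows, labels, n_models=2)
predicted = [predict_label(models, r) for r in rows]
```

Averaging the scores before thresholding is only one possible way to combine the models; the paper may use a different combination rule.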

Robustness of model averaging methods for the violation of standard linear regression assumptions

  • Lee, Yongsu;Song, Juwon
    • Communications for Statistical Applications and Methods / Vol. 28, No. 2 / pp.189-204 / 2021
  • In a regression analysis, a single best model is usually selected among several candidate models. However, it is often useful to combine several candidate models to achieve better performance, especially from a prediction viewpoint. Model combining methods such as stacking and Bayesian model averaging (BMA) have been suggested from the perspective of averaging candidate models. When the candidate models include the true model, BMA is generally expected to perform better than stacking. On the other hand, when the candidate models do not include the true model, stacking is known to outperform BMA. Since stacking and BMA have different properties, it is difficult to determine which method is more appropriate in other situations. In particular, it is not easy to find research papers that compare stacking and BMA when regression model assumptions are violated. Therefore, in this paper, we compare the performance of model averaging methods and of a single best model in linear regression analysis when standard linear regression assumptions are violated. Simulations were conducted to compare model averaging methods with linear regression both when the data include outliers and when they do not. We also compared them when the data include errors from a non-normal distribution. The model averaging methods were applied to water pollution data, which have strong multicollinearity among variables. Simulation studies showed that the stacking method tends to perform better than BMA or standard linear regression analysis (including the stepwise selection method) in terms of risk (see (3.1)) or prediction error (see (3.2)) when typical linear regression assumptions are violated.
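
To make the stacking idea concrete, the following is a minimal sketch for two candidate simple linear regressions. It assumes the common formulation in which the stacking weight minimizes the leave-one-out squared prediction error subject to the weights being non-negative and summing to one; the paper's exact setup is not given in the abstract, and the data and function names are made up for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_predictions(x, y):
    """Leave-one-out predictions: refit without point i, then predict point i."""
    preds = []
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(intercept + slope * x[i])
    return preds


def stacked_sse(weight, y, p1, p2):
    """Squared error of the combination weight * p1 + (1 - weight) * p2."""
    return sum(
        (yi - (weight * a + (1.0 - weight) * b)) ** 2
        for yi, a, b in zip(y, p1, p2)
    )


def stacking_weight(y, p1, p2):
    """Weight on candidate 1 that minimizes stacked_sse, clipped to [0, 1]."""
    num = sum((yi - b) * (a - b) for yi, a, b in zip(y, p1, p2))
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    if den == 0.0:
        return 0.5  # identical candidates: any weight gives the same error
    return min(1.0, max(0.0, num / den))


# Made-up data: two candidate predictors for the same response.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [2.1, 2.9, 5.2, 5.8, 8.1, 8.9]

p1 = loo_predictions(x1, y)  # candidate model 1: y ~ x1
p2 = loo_predictions(x2, y)  # candidate model 2: y ~ x2
weight = stacking_weight(y, p1, p2)
```

Because the weights 0 and 1 are both feasible, the stacked combination can never have a larger leave-one-out squared error than the better single candidate on the same data.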

Performance Comparison Analysis of Artificial Intelligence Models for Estimating Remaining Capacity of Lithium-Ion Batteries

  • Kyu-Ha Kim;Byeong-Soo Jung;Sang-Hyun Lee
    • International Journal of Advanced Culture Technology / Vol. 11, No. 3 / pp.310-314 / 2023
  • The purpose of this study is to predict the remaining capacity of lithium-ion batteries and evaluate prediction performance using five artificial intelligence models: linear regression, decision tree, random forest, neural network, and an ensemble model. In this study, measured data (stored in Excel format) from the CS2 lithium-ion battery were used, and the prediction accuracy of each model was measured using evaluation indicators such as mean squared error, mean absolute error, coefficient of determination, and root mean squared error. The root mean squared error (RMSE) was 0.045 for the linear regression model, 0.038 for the decision tree model, 0.034 for the random forest model, 0.032 for the neural network model, and 0.030 for the ensemble model. The ensemble model had the best prediction performance, with the neural network model taking second place. The decision tree and random forest models also performed quite well, while the linear regression model showed poor prediction performance compared to the other models. The study therefore concluded that ensemble and neural network models are the most suitable for predicting the remaining capacity of lithium-ion batteries and should be prioritized in order to improve the efficiency of battery management and energy systems.
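
The four evaluation indicators named above can be computed as in the following sketch. The capacity values are made up for illustration and are not from the CS2 dataset; the function name is hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired actual/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "r2": r2}


# Made-up remaining-capacity values (actual vs. predicted).
actual = [1.10, 1.05, 1.00, 0.95, 0.90]
predicted = [1.08, 1.06, 0.99, 0.97, 0.88]
metrics = regression_metrics(actual, predicted)
```

RMSE is the square root of MSE, so it is expressed in the same units as the predicted quantity, which is why it is convenient for comparing models as in the abstract.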

Developing Accident Models of Rotary by Accident Occurrence Location

  • 나희;박병호
    • 한국도로학회논문집 / Vol. 14, No. 4 / pp.83-91 / 2012
  • PURPOSES : This study deals with rotary accidents by occurrence location. The purpose of this study is to develop accident models of rotaries by location. METHODS : To this end, the study gives particular attention to developing appropriate models using multiple linear, Poisson, and negative binomial regression models and statistical analysis tools. RESULTS : First, four statistically significant multiple linear regression models (with $R^2$ values of 0.781, 0.300, 0.784, and 0.644, respectively) and four statistically significant Poisson regression models (with $\rho^2$ values of 0.407, 0.306, 0.378, and 0.366, respectively) are developed. Second, goodness-of-fit tests using RMSE, %RMSE, MPB, and MAD show that the Poisson regression model is appropriate for the circulatory roadway, pedestrian crossing, and other locations, and the multiple linear regression model is appropriate for entry/exit sections. Finally, traffic volume is identified as the common variable that affects accidents. CONCLUSIONS : Eight models, all statistically significant, are developed, and the common and specific variables related to the models are derived.

A Comparative Study on the Performance of Bayesian Partially Linear Models

  • Woo, Yoonsung;Choi, Taeryon;Kim, Wooseok
    • Communications for Statistical Applications and Methods / Vol. 19, No. 6 / pp.885-898 / 2012
  • In this paper, we consider Bayesian approaches to partially linear models, in which a regression function is represented by a semiparametric additive form of a parametric linear regression function and a nonparametric regression function. We make a comparative study of the performance of widely used Bayesian partially linear models in terms of empirical analysis. Specifically, we deal with three Bayesian methods to estimate the nonparametric regression function: one using a Fourier series representation, another based on a Gaussian process regression approach, and a third based on the smoothness of the function and differencing. We compare the numerical performance of the three methods by the root mean squared error (RMSE). For empirical analysis, we consider synthetic data with simulation studies and a real data application, fitting each with the three Bayesian methods and comparing the RMSEs.
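
For orientation, a partially linear model of the kind described above is conventionally written as below. The symbols ($y_i$, $x_i$, $z_i$, $\beta$, $f$, $\varepsilon_i$) are notation assumed here rather than taken from the abstract, and the RMSE is shown in one common form (error of the estimated function at the observed points), which may differ from the exact criterion used in the paper.

```latex
y_i = x_i^{\top}\beta + f(z_i) + \varepsilon_i, \qquad i = 1, \dots, n,
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(\hat{f}(z_i) - f(z_i)\bigr)^{2}}
```

Here $x_i^{\top}\beta$ is the parametric linear part and $f$ is the unknown nonparametric function that each of the three Bayesian methods estimates.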

Robustness of Minimum Disparity Estimators in Linear Regression Models

  • Pak, Ro-Jin
    • Journal of the Korean Statistical Society / Vol. 24, No. 2 / pp.349-360 / 1995
  • This paper deals with the robustness properties of minimum disparity estimation in linear regression models. The estimators are defined as statistical quantities which minimize the blended weight Hellinger distance between a weighted kernel density estimator of the residuals and a smoothed model density of the residuals. It is shown that if the weights of the density estimator are appropriately chosen, the estimates of the regression parameters are robust.

Analysis of Characteristics of All Solid-State Batteries Using Linear Regression Models

  • Kyo-Chan Lee;Sang-Hyun Lee
    • International journal of advanced smart convergence / Vol. 13, No. 1 / pp.206-211 / 2024
  • This study used a total of 205,565 datasets of 'voltage', 'current', '℃', and 'time(s)' to systematically analyze the properties and performance of solid electrolytes. To characterize solid electrolytes, a linear regression model, one of the machine learning models, is used to visualize the relationship between 'voltage' and 'current' and to calculate the regression coefficient, mean squared error (MSE), and coefficient of determination ($R^2$). The regression coefficient between 'voltage' and 'current' is about 1.89, indicating that 'voltage' has a positive effect on 'current': the current is expected to increase by about 1.89 units for each unit increase in voltage. The MSE between the model's predicted and actual values was about 0.3; smaller values mean predictions closer to the actual values. The coefficient of determination ($R^2$) is about 0.25, which can be interpreted as the model explaining about 25% of the variance in the data.
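
The quantities reported above (regression coefficient, MSE, and coefficient of determination) can be obtained from a simple linear regression, as in the following sketch. The voltage and current values are made up for illustration and are not from the study's dataset; the function names are hypothetical.

```python
def fit_simple_linear_regression(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse_and_r2(x, y, slope, intercept):
    """Mean squared error and coefficient of determination of the fitted line."""
    n = len(x)
    mean_y = sum(y) / n
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return ss_res / n, 1.0 - ss_res / ss_tot


# Made-up measurements: voltage as the predictor, current as the response.
voltage = [3.0, 3.2, 3.4, 3.6, 3.8]
current = [0.5, 1.0, 1.1, 1.9, 2.0]

slope, intercept = fit_simple_linear_regression(voltage, current)
mse, r2 = mse_and_r2(voltage, current, slope, intercept)
```

The slope is the expected change in current per unit change in voltage, and $R^2$ is the share of the variance in current that the fitted line accounts for.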

LACTATION CURVE OF HOLSTEIN FRIESIAN COWS IN THE KINGDOM OF SAUDI ARABIA

  • Ali, A.K.A.;Al-Jumaah, R.S.;Hayes, E.
    • Asian-Australasian Journal of Animal Sciences / Vol. 9, No. 4 / pp.439-447 / 1996
  • Monthly test-day production data for 12,020 records were collected from six of the largest specialized dairy farms located in the central region of the Kingdom of Saudi Arabia. The records described lactating cows in four parities and two seasons of calving. Monthly test-day records were fitted using Wood's model $At^{b}e^{-ct}$ with multiplicative and additive error terms. Linear and non-linear regression models were used to estimate the parameters necessary to draw the lactation curves. The shape of the lactation curves of different parities showed that the third lactation has the highest peak (43.08 kg for the linear regression model and 42.08 kg for the non-linear regression model). The fourth lactation has the lowest peak (24.00 kg for the linear regression model and 25.64 kg for the non-linear regression model). Cows in their second and third lactations reached the peak at day 58 for both linear and non-linear regression models. Cows in their first lactation were more persistent and had a late peak, at 68 and 67 days for the two models, respectively. In contrast, third-lactation cows were less persistent and had an early peak at day 58 for both models. Cows that calved in winter months have higher starting values (A), a higher ascending slope (b), and a higher descending slope (c). Least-squares means of milk yield for the first four parities and for the overall data were 6,653, 7,659, 7,482, 6,988, and 7,614 kg, respectively. The corresponding lactation periods were 358, 367, 350, 363, and 364 days, respectively.
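
The linear-regression route to Wood's model can be sketched as follows, assuming the usual log transform $\ln y_t = \ln A + b \ln t - ct$ (which corresponds to a multiplicative error term). Setting the derivative of $At^{b}e^{-ct}$ to zero gives a peak at $t = b/c$. The parameter values and test-day schedule below are made up and noise-free; they are not the study's data, and the function names are hypothetical.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def wood_yield(t, a, b, c):
    """Wood's lactation curve: y(t) = A * t**b * exp(-c * t)."""
    return a * t ** b * math.exp(-c * t)


def fit_wood_log_linear(days, yields):
    """Fit ln(y) = ln(A) + b*ln(t) - c*t by ordinary least squares."""
    rows = [[1.0, math.log(t), float(t)] for t in days]
    targets = [math.log(y) for y in yields]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * z for r, z in zip(rows, targets)) for i in range(3)]
    ln_a, b, neg_c = solve_linear_system(xtx, xty)
    return math.exp(ln_a), b, -neg_c


# Synthetic, noise-free test-day yields generated from assumed parameters.
true_a, true_b, true_c = 20.0, 0.2, 0.004
days = [15, 45, 75, 105, 135, 165, 195, 225, 255, 285]
yields = [wood_yield(t, true_a, true_b, true_c) for t in days]

a, b, c = fit_wood_log_linear(days, yields)
peak_day = b / c                      # day of peak yield
peak_yield = wood_yield(peak_day, a, b, c)
```

With an additive error term the log transform no longer applies, so the parameters would instead be estimated by non-linear least squares, which is the second approach the abstract mentions.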