Integrated Search | Korea Science

Robustness of model averaging methods for the violation of standard linear regression assumptions

Lee, Yongsu;Song, Juwon
- Communications for Statistical Applications and Methods / Vol. 28, No. 2 / pp. 189-204 / 2021
In regression analysis, a single best model is usually selected from several candidate models. However, it is often useful to combine several candidate models to achieve better performance, especially from the viewpoint of prediction. Model combining methods such as stacking and Bayesian model averaging (BMA) have been suggested from the perspective of averaging candidate models. When the candidate models include the true model, BMA is generally expected to perform better than stacking. On the other hand, when the candidate models do not include the true model, stacking is known to outperform BMA. Since stacking and BMA have different properties, it is difficult to determine which method is more appropriate in other situations. In particular, it is not easy to find research papers that compare stacking and BMA when regression model assumptions are violated. Therefore, in this paper, we compare the performance of model averaging methods with that of a single best model in linear regression analysis when standard linear regression assumptions are violated. Simulations were conducted to compare model averaging methods with linear regression both when the data include outliers and when they do not. We also compared them when the errors follow a non-normal distribution. The model averaging methods were applied to water pollution data, which have strong multicollinearity among variables. Simulation studies showed that the stacking method tends to perform better than BMA or standard linear regression analysis (including the stepwise selection method) in terms of risk (see (3.1)) or prediction error (see (3.2)) when typical linear regression assumptions are violated.
https://doi.org/10.29220/CSAM.2021.28.2.189
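
To make the contrast between the two averaging schemes concrete, here is a minimal Python sketch. It is not the authors' procedure. It assumes only two candidate models (an intercept-only model and a simple linear model), chooses the stacking weight by minimizing leave-one-out squared error, and approximates BMA posterior model probabilities with BIC weights, which is a common shortcut rather than a full Bayesian computation. All function names and the toy data are hypothetical.

```python
import math
from statistics import mean


def fit_mean(xs, ys):
    """Candidate 1: intercept-only model."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Candidate 2: simple linear regression y = a + b*x by least squares."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x


def loo_predictions(fit, xs, ys):
    """Leave-one-out predictions: refit without point i, then predict it."""
    preds = []
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(model(xs[i]))
    return preds


def stacking_weight(p1, p2, ys):
    """Weight w on model 1 (1 - w on model 2) minimizing LOO squared error,
    clipped to [0, 1] so the weights stay non-negative and sum to one."""
    num = sum((a - b) * (y - b) for a, b, y in zip(p1, p2, ys))
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    return min(1.0, max(0.0, num / den)) if den else 0.5


def bic(fit, xs, ys, n_coef):
    """Gaussian BIC up to a constant: n*log(RSS/n) + n_coef*log(n)."""
    model = fit(xs, ys)
    n = len(xs)
    rss = sum((y - model(x)) ** 2 for x, y in zip(xs, ys))
    return n * math.log(rss / n) + n_coef * math.log(n)


def bic_weights(bics):
    """Approximate posterior model probabilities: w_k proportional to exp(-BIC_k / 2)."""
    best = min(bics)
    raw = [math.exp(-(b - best) / 2) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]


# Toy data (hypothetical): a linear trend plus small fixed perturbations.
xs = [float(i) for i in range(10)]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.1]
ys = [1.0 + 2.0 * x + e for x, e in zip(xs, noise)]

w_line = stacking_weight(loo_predictions(fit_line, xs, ys),
                         loo_predictions(fit_mean, xs, ys), ys)
bma_mean, bma_line = bic_weights([bic(fit_mean, xs, ys, 1),
                                  bic(fit_line, xs, ys, 2)])
print(f"stacking weight on linear model: {w_line:.3f}")
print(f"BIC weight on linear model:      {bma_line:.3f}")
```

Here the linear candidate is close to the data-generating model, so both schemes put almost all weight on it. The abstract's comparison concerns cases where assumptions are violated, which this toy example does not attempt to reproduce.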

MapReduce-based Localized Linear Regression for Electricity Price Forecasting

한진주;이인규;온병원
- 전기학회논문지P / Vol. 67, No. 4 / pp. 183-190 / 2018
Accurately predicting electricity prices is an important task in the electricity trading market. Various approaches to the electricity price forecasting problem have been proposed, and linear regression-based approaches are known to be the best. However, the use of such linear regression-based methods is limited by low accuracy and performance. With traditional linear regression methods, it is not practical to find a nonlinear regression model that explains the training data well. If the training data are complex (i.e., a small number of individual records and a large number of features), it is difficult to find a polynomial function with n terms that fits the training data. On the other hand, when a linear regression model is used to approximate a nonlinear regression model, accuracy drops considerably because the model does not accurately reflect the characteristics of the training data. To cope with this problem, we propose a new electricity price forecasting method that divides the entire dataset into multiple split datasets and finds the best linear regression models, each of which is the optimal model for its dataset. In addition, to improve the performance of the proposed method, we adapt the localized linear regression method to MapReduce, a framework for parallel processing of data stored in the Hadoop Distributed File System. Our experimental results show that the proposed model outperforms the existing linear regression model. Specifically, the accuracy of the proposed method is improved by 45%, and it runs 5 times faster than the existing linear regression-based model.
https://doi.org/10.5370/KIEEP.2018.67.4.183
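
The idea of fitting one linear model per split dataset can be illustrated with a small in-memory Python sketch. This is not the MapReduce implementation from the paper: the abstract does not say how the data are divided, so splitting into contiguous chunks after sorting by a single feature is an assumption, and the function names and toy data are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b


def fit_localized(xs, ys, n_parts):
    """Sort by x, cut into n_parts contiguous chunks, fit one line per chunk."""
    pairs = sorted(zip(xs, ys))
    size = -(-len(pairs) // n_parts)  # ceiling division
    models = []
    for start in range(0, len(pairs), size):
        chunk = pairs[start:start + size]
        cx = [p[0] for p in chunk]
        cy = [p[1] for p in chunk]
        models.append((cx[-1], fit_line(cx, cy)))  # (upper x bound, (a, b))
    return models


def predict_localized(models, x):
    """Use the model of the first chunk whose upper bound covers x."""
    for upper, (a, b) in models:
        if x <= upper:
            return a + b * x
    a, b = models[-1][1]  # beyond the last bound: fall back to the last model
    return a + b * x


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy nonlinear data (hypothetical): y = |x| on a grid from -2 to 2.
xs = [i / 10 for i in range(-20, 21)]
ys = [abs(x) for x in xs]

a, b = fit_line(xs, ys)
global_mse = mse(ys, [a + b * x for x in xs])

models = fit_localized(xs, ys, n_parts=2)
local_mse = mse(ys, [predict_localized(models, x) for x in xs])

print(f"global line MSE:     {global_mse:.4f}")
print(f"localized lines MSE: {local_mse:.4f}")
```

In a MapReduce setting, each chunk's fit would plausibly run as an independent task, since the per-chunk fits share no state. That mapping is a reading of the abstract, not a description of the paper's code.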

Scree Diagram for Detecting Multicollinearity and Estimating Ridge Constant in Linear Regression Model

Jang, Dae-Heung
- Communications for Statistical Applications and Methods / Vol. 5, No. 1 / pp. 19-24 / 1998
When multicollinearity appears in a linear regression model, ridge regression can be used to stabilize the regression coefficient estimates. We propose the scree diagram as a graphical method for detecting multicollinearity and estimating the ridge constant in a linear regression model.
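
As a rough illustration of why ridge regression stabilizes the estimates, the Python sketch below works with two standardized predictors. It does not reproduce the paper's scree diagram. It only computes the eigenvalues of the predictor correlation matrix (a near-zero eigenvalue signals multicollinearity) and the ridge coefficients for a few values of the ridge constant k. The correlation values and function names are hypothetical.

```python
def correlation_eigenvalues(r12):
    """Eigenvalues of the 2x2 predictor correlation matrix [[1, r12], [r12, 1]]."""
    return 1.0 + abs(r12), 1.0 - abs(r12)


def ridge_coefficients(r12, r1y, r2y, k):
    """Solve (R + k*I) b = r for two standardized predictors.

    R is the predictor correlation matrix, r holds the predictor-response
    correlations, and k >= 0 is the ridge constant (k = 0 gives least squares).
    """
    d = 1.0 + k
    det = d * d - r12 * r12
    b1 = (d * r1y - r12 * r2y) / det
    b2 = (d * r2y - r12 * r1y) / det
    return b1, b2


# Hypothetical correlations: two nearly collinear predictors.
r12, r1y, r2y = 0.99, 0.80, 0.78

largest, smallest = correlation_eigenvalues(r12)
print(f"eigenvalues: {largest:.2f}, {smallest:.2f} (ratio {largest / smallest:.0f})")

# A simple ridge trace: coefficients as the ridge constant k grows.
trace = {k: ridge_coefficients(r12, r1y, r2y, k) for k in (0.0, 0.01, 0.05, 0.1, 0.5)}
for k, (b1, b2) in trace.items():
    print(f"k={k:<5} b1={b1:+.3f} b2={b2:+.3f}")
```

With these inputs the least squares solution (k = 0) gives the second predictor a negative coefficient even though it correlates positively with the response, and a small ridge constant removes that sign flip.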

Fuzzy linear regression model and its application

이성호;홍덕헌
- 응용통계연구 / Vol. 10, No. 2 / pp. 403-411 / 1997
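This study examines the fuzzy regression model, which was proposed as an alternative to statistical regression models for cases where the data on the variables governing a system are imprecise or vague, together with the estimation of its parameters. Through a case study, we then examine the strengths and weaknesses of the fuzzy regression model and compare the results.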

FUZZY REGRESSION MODEL WITH MONOTONIC RESPONSE FUNCTION

Choi, Seung Hoe;Jung, Hye-Young;Lee, Woo-Joo;Yoon, Jin Hee
- 대한수학회논문집 / Vol. 33, No. 3 / pp. 973-983 / 2018
The fuzzy linear regression model has been widely studied, with many successful applications, but there have been only a few studies on the fuzzy regression model with a monotonic response function as a generalization of the linear response function. In this paper, we propose a fuzzy regression model with a monotonic response function and an algorithm to construct the proposed model using the $\alpha$-level sets of fuzzy numbers and the resolution identity theorem. To estimate the parameters of the proposed model, the least squares (LS) method and the least absolute deviation (LAD) method are used. In addition, to evaluate the performance of the proposed model, two goodness-of-fit performance measures are introduced. The numerical examples indicate that the fuzzy regression model with the monotonic response function is preferable to the fuzzy linear regression model when the fuzzy data exhibit a non-linear pattern.
https://doi.org/10.4134/CKMS.c170079

A Note on a Fuzzy Linear Regression Model for Fuzzy Input-output Data Using Real Coefficients

Hong, Dug-Hun
- Communications for Statistical Applications and Methods / Vol. 8, No. 2 / pp. 319-325 / 2001
In this note, we propose a simple fuzzy linear regression model for fuzzy input-output data based on Tanaka's approach. We then develop an LP-based method for deriving the satisfying solution of the decision-making problem.

An Approach to Applying Multiple Linear Regression Models by Interlacing Data in Classifying Similar Software

Lim, Hyun-il
- Journal of Information Processing Systems / Vol. 18, No. 2 / pp. 268-281 / 2022
The development of information technology is bringing many changes to everyday life, and machine learning can be used as a technique to solve a wide range of real-world problems. Analysis and utilization of data are essential processes in applying machine learning to real-world problems. As a method of processing data in machine learning, we propose an approach based on applying multiple linear regression models by interlacing data to the task of classifying similar software. Linear regression is widely used in estimation problems to model the relationship between input and output data. In our approach, multiple linear regression models are generated by training on interlaced feature data. A combination of these multiple models is then used as the prediction model for classifying similar software. Experiments are performed to evaluate the proposed approach against conventional linear regression, and the experimental results show that the proposed method classifies similar software more accurately than the conventional model. We expect that the proposed approach can be applied to various kinds of classification problems to improve on the accuracy of conventional linear regression.
https://doi.org/10.3745/JIPS.04.0241

Fuzzy regression using regularization method based on Tanaka's model

Hong Dug-Hun;Kim Kyung-Tae
- 한국지능시스템학회논문지 / Vol. 16, No. 4 / pp. 499-505 / 2006
Regularization approaches to regression are common in the statistics and information science literature. Regularization was introduced as a way of controlling the smoothness properties of the regression function. In this paper, we present a new method for evaluating linear and non-linear fuzzy regression models based on Tanaka's model using the idea of regularization. This method is an especially attractive approach to modeling non-linear fuzzy data.
https://doi.org/10.5391/JKIIS.2006.16.4.499

Performance Comparison Analysis of Artificial Intelligence Models for Estimating Remaining Capacity of Lithium-Ion Batteries

Kyu-Ha Kim;Byeong-Soo Jung;Sang-Hyun Lee
- International Journal of Advanced Culture Technology / Vol. 11, No. 3 / pp. 310-314 / 2023
The purpose of this study is to predict the remaining capacity of lithium-ion batteries and to evaluate the performance of five artificial intelligence models: linear regression, decision tree, random forest, neural network, and an ensemble model. Measured data (in Excel format) from the CS2 lithium-ion battery were used, and the prediction accuracy of each model was measured using evaluation indicators such as mean square error, mean absolute error, coefficient of determination, and root mean square error. The root mean square error (RMSE) was 0.045 for the linear regression model, 0.038 for the decision tree model, 0.034 for the random forest model, 0.032 for the neural network model, and 0.030 for the ensemble model. The ensemble model had the best prediction performance, with the neural network model in second place. The decision tree and random forest models also performed quite well, while the linear regression model showed relatively poor prediction performance compared with the other models. Therefore, it was concluded that ensemble and neural network models should be prioritized in order to improve the efficiency of battery management and energy systems.
https://doi.org/10.17703/IJACT.2023.11.3.310
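
For reference, the four evaluation indicators named in the abstract can be computed as in the following Python sketch. The formulas are the standard definitions. The function names are illustrative, and the only data used are the RMSE values reported in the abstract itself.

```python
import math


def mse(y_true, y_pred):
    """Mean square error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(mse(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# RMSE values as reported in the abstract above.
reported_rmse = {
    "linear regression": 0.045,
    "decision tree": 0.038,
    "random forest": 0.034,
    "neural network": 0.032,
    "ensemble": 0.030,
}

# Lower RMSE is better, so sort in ascending order.
ranking = sorted(reported_rmse, key=reported_rmse.get)
for name in ranking:
    print(f"{name:<18} RMSE = {reported_rmse[name]:.3f}")
```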

Prediction of Pitting Corrosion Characteristics of AL-6XN Steel with Sensitization and Environmental Variables Using Multiple Linear Regression Method

정광후;김성종
- Corrosion Science and Technology / Vol. 19, No. 6 / pp. 302-309 / 2020
This study aimed to predict the pitting corrosion characteristics of AL-6XN super-austenitic steel using multiple linear regression. The variables used in the model are degree of sensitization, temperature, and pH. Experiments were designed and cyclic polarization curve tests were conducted accordingly. The data obtained from the cyclic polarization curve tests were used as training data for the multiple linear regression model. The significance of each factor for the responses (critical pitting potential and repassivation potential) was analyzed. The multiple linear regression model was validated using experimental conditions that were not included in the training data. As a result, the degree of sensitization showed a greater effect than the other variables. Multiple linear regression performed poorly in predicting repassivation potential. On the other hand, the model showed considerable predictive performance for critical pitting potential, with a coefficient of determination ($R^2$) of 0.7745. The feasibility of predicting pitting potential using multiple linear regression was thus confirmed.
https://doi.org/10.14773/cst.2020.19.6.302
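
The workflow described in this abstract (fit a multiple linear regression on designed experiments, then validate on a condition held out of training) can be sketched in Python as follows. The data are synthetic and noise-free, generated from made-up coefficients purely so the fit can be checked. They are not the paper's measurements, and the factor levels and function names are hypothetical.

```python
import itertools


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(rows, ys):
    """Least squares via the normal equations (X'X) beta = X'y; intercept first."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * y for x, y in zip(X, ys)) for i in range(p)]
    return solve(xtx, xty)


def predict(beta, row):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


def r_squared(ys, preds):
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Synthetic two-level design: (degree of sensitization, temperature, pH).
true_beta = [1.0, -0.5, -0.01, 0.05]  # made-up coefficients, not from the paper
design = list(itertools.product((0.0, 1.0), (30.0, 60.0), (2.0, 7.0)))
response = [predict(true_beta, row) for row in design]

beta = fit_mlr(design, response)
r2 = r_squared(response, [predict(beta, row) for row in design])

# Validate on a condition that was not part of the training design.
held_out = (0.5, 45.0, 4.5)
validation_error = abs(predict(beta, held_out) - predict(true_beta, held_out))
print("coefficients:", [round(b, 4) for b in beta])
print("R^2 on training design:", round(r2, 4))
```

Because the synthetic response is exactly linear, the fit recovers the generating coefficients. With real measurements the training $R^2$ and the held-out error would both reflect noise and any nonlinearity in the response.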

Search results: 1,985 items (processing time: 0.027 seconds)
