• Title/Summary/Keyword: Multiple Linear Regression Model

Development of the Algorithm for Optimizing Wavelength Selection in Multiple Linear Regression

  • Hoeil Chung
    • Near Infrared Analysis
    • /
    • v.1 no.1
    • /
    • pp.1-7
    • /
    • 2000
  • A convenient algorithm for optimizing wavelength selection in multiple linear regression (MLR) has been developed. MOP (MLR Optimization Program) tests all possible MLR calibration models in a given spectral range and finds an optimal MLR model with external validation capability. MOP generates calibration models from all possible combinations of wavelengths and simultaneously calculates the SEC (Standard Error of Calibration) and the SEV (Standard Error of Validation), the latter by predicting samples in a validation data set. From the determined SEC and SEV, it then calculates another parameter called SAD, the sum of SEC, SEV, and the absolute difference between them: SAD = SEC + SEV + Abs(SEC - SEV). SAD is a useful parameter for finding an optimal calibration model without over-fitting because it simultaneously evaluates SEC, SEV, and the difference in error between calibration and validation. The calibration model with the smallest SAD value is chosen as the optimum because its calibration and validation errors are both minimal and similar in scale. To evaluate the capability of MOP, the determination of benzene content in unleaded gasoline was examined. MOP successfully found the optimal calibration model and showed better calibration and independent prediction performance than conventional MLR calibration.
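
The exhaustive search described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the MOP program itself: the function names, the fixed number of wavelengths per model, and the degrees-of-freedom conventions for SEC and SEV are all hypothetical choices that the abstract does not specify.

```python
from itertools import combinations

import numpy as np


def standard_error(y_true, y_pred, dof):
    """Root of the residual sum of squares divided by the degrees of freedom."""
    return float(np.sqrt(np.sum((y_true - y_pred) ** 2) / dof))


def select_by_sad(X_cal, y_cal, X_val, y_val, n_wavelengths):
    """Fit an MLR model for every wavelength combination; keep the lowest SAD.

    Assumed conventions (not taken from the abstract):
      SEC uses n_cal - k - 1 degrees of freedom, SEV uses n_val.
    Returns (sad, columns, sec, sev) for the best combination.
    """
    best = None
    for cols in combinations(range(X_cal.shape[1]), n_wavelengths):
        idx = list(cols)
        # Design matrices with an intercept column.
        A_cal = np.column_stack([np.ones(len(y_cal)), X_cal[:, idx]])
        A_val = np.column_stack([np.ones(len(y_val)), X_val[:, idx]])
        coef, *_ = np.linalg.lstsq(A_cal, y_cal, rcond=None)
        sec = standard_error(y_cal, A_cal @ coef, len(y_cal) - len(idx) - 1)
        sev = standard_error(y_val, A_val @ coef, len(y_val))
        # SAD = SEC + SEV + |SEC - SEV|, as defined in the abstract.
        sad = sec + sev + abs(sec - sev)
        if best is None or sad < best[0]:
            best = (sad, cols, sec, sev)
    return best
```

Because a + b + |a - b| = 2·max(a, b), minimizing SAD is equivalent to minimizing the larger of SEC and SEV, which is why a model that fits the calibration set well but validates poorly is penalized. The number of combinations grows quickly with the number of wavelengths, so a sketch like this is practical only for small spectral ranges.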

Motion estimation method using multiple linear regression model (다중선형회귀모델을 이용한 움직임 추정방법)

  • 김학수;임원택;이재철;이규원;박규택
    • Journal of the Korean Institute of Telematics and Electronics S
    • /
    • v.34S no.10
    • /
    • pp.98-103
    • /
    • 1997
  • Given the small bit allocation for motion information in very low bit-rate coding, motion estimation using the block matching algorithm (BMA) fails to maintain an acceptable level of prediction error. The reason is that the motion model, or spatial transformation, assumed in block matching cannot precisely approximate real-world motion with a small number of parameters. To overcome this drawback of the conventional block matching algorithm, several triangle-based methods, which use triangular patches instead of blocks, have been proposed. To estimate the motion in image sequences, these methods have usually been based on a combination of the optical flow equation, affine transforms, and iteration, but their computational cost is high. This paper presents a fast motion estimation algorithm that uses a multiple linear regression model to remedy the defects of the BMA and the triangle-based methods. After describing the basic 2-D triangle-based method, the details of the proposed multiple linear regression model are presented along with motion estimation results from one standard video sequence, representative of MPEG-4 class A data. The simulation results show that the proposed method improves the average PSNR by about 1.24 dB compared with the BMA method and reduces the computational cost by about 25% compared with the 2-D triangle-based method.

Development of a Multiple Linear Regression Model to Analyze Traffic Volume Error Factors in Radar Detectors

  • Kim, Do Hoon;Kim, Eung Cheol
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.39 no.5
    • /
    • pp.253-263
    • /
    • 2021
  • Traffic data collected using advanced equipment are highly valuable for traffic planning and efficient road operation. However, there is a problem regarding the reliability of the analysis results due to equipment defects, errors in the data aggregation process, and missing data. Unlike other detectors installed for each vehicle lane, radar detectors can yield different error types because they detect all traffic volume in multilane two-way roads via a single installation external to the roadway. For the traffic data of a radar detector to be representative of reliable data, the error factors of the radar detector must be analyzed. This study presents a field survey of variables that may cause errors in traffic volume collection by targeting the points where radar detectors are installed. Video traffic data are used to determine the errors in traffic measured by a radar detector. This study establishes three types of radar detector traffic errors, i.e., artificial, mechanical, and complex errors. Among these types, it is difficult to determine the cause of the errors due to several complex factors. To solve this problem, this study developed a radar detector traffic volume error analysis model using a multiple linear regression model. The results indicate that the characteristics of the detector, road facilities, geometry, and other traffic environment factors affect errors in traffic volume detection.

A Study on Square Pore Shape Discrimination Model of Scaffold Using Machine Learning Based Multiple Linear Regression (다중 선형 회귀 기반 기계 학습을 이용한 인공지지체의 사각 기공 형태 진단 모델에 관한 연구)

  • Lee, Song-Yeon;Huh, Yong Jeong
    • Journal of the Semiconductor & Display Technology
    • /
    • v.19 no.4
    • /
    • pp.59-64
    • /
    • 2020
  • In this paper, we use a data-based machine learning regression method to check pore shape, in order to reduce the number of experiments required when producing scaffolds with a 3D printer. Through experiments, we obtained the pore shape for each tested print condition and used these data for training. We built a scaffold pore shape defect prediction model using the multiple linear regression method and used it to predict scaffold pore shapes for untested print conditions. We randomly selected 20 print conditions from the predicted conditions, printed the scaffold five times under each condition, and measured the pore shape. Comparing the average printed pore shape with the predicted pore shape, we confirmed that the precision of the prediction model is 99%.

Bayesian inference for an ordered multiple linear regression with skew normal errors

  • Jeong, Jeongmun;Chung, Younshik
    • Communications for Statistical Applications and Methods
    • /
    • v.27 no.2
    • /
    • pp.189-199
    • /
    • 2020
  • This paper studies a Bayesian ordered multiple linear regression model with skew normal errors. It is reasonable that the kind of inherent information available in an applied regression requires some constraints on the coefficients to be estimated. In addition, the assumption of normally distributed errors is sometimes not appropriate for real data. Therefore, to explain such situations more flexibly, we use the skew-normal distribution given by Sahu et al. (The Canadian Journal of Statistics, 31, 129-150, 2003) for the error terms, which includes the normal distribution as a special case. For the Bayesian methodology, the Markov chain Monte Carlo method is employed to resolve complicated integration problems. Also, under improper priors, the propriety of the associated posterior density is shown. Our proposed Bayesian model is applied to NZAPB's apple data. For model comparison between the skew normal error model and the normal error model, we use the Bayes factor and the deviance information criterion given by Spiegelhalter et al. (Journal of the Royal Statistical Society Series B (Statistical Methodology), 64, 583-639, 2002). We also consider the problem of detecting an influential point concerning skewness using Bayes factors. Finally, concluding remarks are discussed.

A study of Predicting International Gasoline Prices based on Multiple Linear Regression with Economic Indicators (경제지표를 활용한 다중선형회귀 모델 기반 국제 휘발유 가격 예측)

  • Myeongeun Han;Jiyeon Kim;Hyunhee Lee;Sein Kim;Minseo Park
    • The Journal of the Convergence on Culture Technology
    • /
    • v.10 no.1
    • /
    • pp.159-164
    • /
    • 2024
  • The domestic petroleum market is highly sensitive to changes in international oil prices, so it is important to identify and respond to those changes. In particular, it is necessary to clearly understand the factors that cause price fluctuations in gasoline, which is consumed in large quantities. International gasoline prices are influenced by global factors such as gasoline supplies, geopolitical events, and fluctuations in the U.S. dollar. However, previous studies have focused only on gasoline supplies. In this study, we explore the causal relationship between economic indicators and international gasoline prices using various machine learning-based regression models. First, we collect data on various global economic indicators. Second, we perform data preprocessing. Third, we build models using multiple linear regression, Ridge regression, and Lasso (Least Absolute Shrinkage and Selection Operator) regression. The multiple linear regression model showed the highest accuracy, 96.73%, on the test sets. We expect that our proposed model will be helpful for domestic economic stability and energy policy decisions.
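
The three model families compared in this abstract can be sketched with NumPy alone. This is an illustrative sketch under assumptions, not the authors' pipeline: the function names are hypothetical, the Lasso is solved with plain coordinate descent, feature columns are assumed to be non-constant, and R² stands in as the score because the abstract does not say how its "accuracy" figure is defined.

```python
import numpy as np


def fit_ols(X, y):
    """Ordinary least squares; returns [intercept, w_1, ..., w_p]."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def fit_ridge(X, y, alpha):
    """Ridge regression with an unpenalized intercept (closed form)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean  # centering removes the intercept
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return np.concatenate([[y_mean - x_mean @ w], w])


def fit_lasso(X, y, alpha, n_iter=500):
    """Lasso by coordinate descent.

    Minimizes (1 / (2n)) * ||yc - Xc w||^2 + alpha * ||w||_1 on centered data.
    """
    n, p = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    col_sq = (Xc ** 2).sum(axis=0) / n  # assumes no constant columns
    w = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Residual with feature j's current contribution added back.
            partial = yc - Xc @ w + Xc[:, j] * w[j]
            rho = Xc[:, j] @ partial / n
            # Soft-thresholding update for coordinate j.
            w[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
    return np.concatenate([[y_mean - x_mean @ w], w])


def predict(coef, X):
    return coef[0] + X @ coef[1:]


def r2(y, y_hat):
    """Coefficient of determination (assumed metric, see lead-in)."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
```

With `alpha = 0` both penalized fits reduce to ordinary least squares, which gives a quick sanity check; a held-out test split scored with `r2(y_test, predict(coef, X_test))` then allows the three models to be compared on equal terms.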

MULTIPLE OUTLIER DETECTION IN LOGISTIC REGRESSION BY USING INFLUENCE MATRIX

  • Lee, Gwi-Hyun;Park, Sung-Hyun
    • Journal of the Korean Statistical Society
    • /
    • v.36 no.4
    • /
    • pp.457-469
    • /
    • 2007
  • Many procedures are available to identify a single outlier or an isolated influential point in linear regression and logistic regression. However, detecting multiple outliers or influential points is more difficult, owing to masking and swamping problems. Multiple outlier detection methods for logistic regression have not yet been studied from the standpoint of direct procedures. In this paper we consider direct methods for logistic regression by extending the Peña and Yohai (1995) influence matrix algorithm. We define the influence matrix in logistic regression by using Cook's distance in logistic regression, and test for multiple outliers by using the mean shift model. To show the accuracy of the proposed multiple outlier detection algorithm, we simulate artificial data that include multiple outliers with masking and swamping.

A model to characterize the effect of particle size of fly ash on the mechanical properties of concrete by the grey multiple linear regression

  • Cui, Yunpeng;Liu, Jun;Wang, Licheng;Liu, Runqing;Pang, Bo
    • Computers and Concrete
    • /
    • v.26 no.2
    • /
    • pp.175-183
    • /
    • 2020
  • Fly ash has become an important component of concrete as a supplementary cementitious material with the development of concrete technology. To make use of fly ash efficiently, four types of fly ash were prepared with particle size distributions conforming to four functions, namely the S.Tsivilis, Andersen, Normal, and F distributions. The four particle size distributions as functions of the strength and pore structure of concrete were thereafter constructed and investigated. The results showed that the compressive and flexural strength of concrete with fly ash conforming to the S.Tsivilis, Normal, and F distributions increased by 5-10 MPa and 1-2 MPa, respectively, compared to the reference sample at 28 d. The pore structure of the concrete was improved, with the total porosity of the concrete decreasing by 2-5% at 28 d. Fly ash with the Andersen distribution, however, was not conducive to the strength development of concrete. A regression model based on grey multiple linear regression theory proved to be efficient in predicting the strength of concrete from the characteristic parameters of the particle size and pore structure of the fly ash.

A Case Study on the Improvement of Display FAB Production Capacity Prediction (디스플레이 FAB 생산능력 예측 개선 사례 연구)

  • Ghil, Joonpil;Choi, Jin Young
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.43 no.2
    • /
    • pp.137-145
    • /
    • 2020
  • Various activities in fabrication (FAB), such as mass production of existing products, new product development, and process improvement evaluation, can increase the complexity of the production process when they take place at the same time. As a result, complex production operations make it difficult to predict the production capacity of facilities. In this environment, production forecasting provides the basic information used for production planning, preventive maintenance, yield management, and new product development. In this paper, we develop a multiple linear regression analysis model to improve the existing production capacity forecasting method, which estimates production capacity by a simple trend analysis over short time periods. Specifically, we define the overall equipment effectiveness of a facility as the performance measure that represents production capacity. We then consider the production capacities of interrelated facilities in the FAB production process over the past several weeks as independent regression variables, in order to reflect the impact of facility maintenance cycles and production sequences. By applying variable selection methods and keeping only the significant variables, we develop a multiple linear regression forecasting model. Through a numerical experiment, we show the superiority of the proposed method, which obtains a mean residual error of 3.98% and improves on the previous method by 7.9%.
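
The idea of using past weeks of interrelated facilities as regressors can be sketched as a lagged design matrix. This is a minimal illustration under assumptions, not the authors' model: the function names and the fixed lag window are hypothetical, the input is assumed to be a weekly array with one column per facility, and the variable selection step mentioned in the abstract is omitted.

```python
import numpy as np


def lagged_design(series, n_lags):
    """Build regressors from the previous n_lags weeks of every facility.

    series: array of shape (T, F), one row per week, one column per facility.
    Row i of the result holds weeks i .. i + n_lags - 1, oldest first,
    flattened to length n_lags * F, and is used to predict week i + n_lags.
    """
    T = series.shape[0]
    return np.asarray([series[t - n_lags:t].ravel() for t in range(n_lags, T)])


def fit_forecaster(series, target_col, n_lags):
    """Least-squares fit of the target facility on all lagged values."""
    X = lagged_design(series, n_lags)
    y = series[n_lags:, target_col]
    A = np.column_stack([np.ones(len(y)), X])  # intercept column first
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def forecast_next(series, coef, n_lags):
    """One-step-ahead forecast from the most recent n_lags weeks."""
    x = series[-n_lags:].ravel()
    return float(coef[0] + x @ coef[1:])
```

In practice the number of lagged columns (facilities times lags) can approach the number of weekly observations, which is one reason a variable selection step like the one the abstract describes would be applied before fitting.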

Evaluation of applicability of pan coefficient estimation method by multiple linear regression analysis (다변량 선형회귀분석을 이용한 증발접시계수 산정방법 적용성 검토)

  • Rim, Chang-Soo
    • Journal of Korea Water Resources Association
    • /
    • v.55 no.3
    • /
    • pp.229-243
    • /
    • 2022
  • The effects of monthly meteorological data measured at 11 stations in South Korea on the pan coefficient were analyzed to develop four types of multiple linear regression models for estimating pan coefficients. To evaluate the applicability of the developed models, they were compared with six previous models. Pan coefficients were most affected by air temperature in January, February, March, July, November, and December, and by solar radiation in the other months. Over the 12 months of the year as a whole, the effects of wind speed and relative humidity on the pan coefficient were less significant than those of air temperature and solar radiation. For all meteorological stations and months, the model developed by applying 5 independent variables (wind speed, relative humidity, air temperature, ratio of sunshine duration to daylight duration, and solar radiation) for each station was the most effective for evaporation estimation. The model validation results indicate that the multiple linear regression models can be applied to some particular stations and months.