• Title/Abstract/Keywords: Multiple Linear Regression Model

627 search results (processing time: 0.022 s)

Development of the Algorithm for Optimizing Wavelength Selection in Multiple Linear Regression

  • Hoeil Chung
    • Near Infrared Analysis
    • /
    • Vol. 1, No. 1
    • /
    • pp.1-7
    • /
    • 2000
  • A convenient algorithm for optimizing wavelength selection in multiple linear regression (MLR) has been developed. MOP (MLR Optimization Program) tests all possible MLR calibration models in a given spectral range and finds an optimal MLR model with external validation capability. MOP generates calibration models from all possible combinations of wavelengths and simultaneously calculates the SEC (Standard Error of Calibration) and the SEV (Standard Error of Validation), the latter by predicting samples in a validation data set. With SEC and SEV determined, it then calculates another parameter called SAD, the sum of SEC, SEV, and the absolute difference between SEC and SEV: SAD = SEC + SEV + Abs(SEC - SEV). SAD is a useful parameter for finding an optimal calibration model without over-fitting because it simultaneously evaluates SEC, SEV, and the difference in error between calibration and validation. The calibration model with the smallest SAD value is chosen as the optimum because its errors in both calibration and validation are minimal as well as similar in scale. To evaluate the capability of MOP, the determination of benzene content in unleaded gasoline was examined. MOP successfully found the optimal calibration model and showed better calibration and independent prediction performance than conventional MLR calibration.
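
The search described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the MOP program itself: the function names, the synthetic data, and the degrees-of-freedom conventions used for SEC and SEV are assumptions, since the abstract does not specify them.

```python
from itertools import combinations

import numpy as np


def fit_mlr(X, y):
    """Least-squares MLR fit with an intercept; returns [b0, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply the fitted intercept and slopes to new rows."""
    return coef[0] + X @ coef[1:]


def std_error(y_true, y_pred, dof):
    """Square root of the residual sum of squares over the degrees of freedom."""
    return float(np.sqrt(np.sum((y_true - y_pred) ** 2) / dof))


def mop_search(X_cal, y_cal, X_val, y_val, n_wavelengths):
    """Try every combination of n_wavelengths columns; keep the smallest SAD."""
    best = None
    for combo in combinations(range(X_cal.shape[1]), n_wavelengths):
        cols = list(combo)
        coef = fit_mlr(X_cal[:, cols], y_cal)
        # Assumed conventions: SEC uses n - k - 1 degrees of freedom, SEV uses n.
        sec = std_error(y_cal, predict(coef, X_cal[:, cols]),
                        len(y_cal) - n_wavelengths - 1)
        sev = std_error(y_val, predict(coef, X_val[:, cols]), len(y_val))
        sad = sec + sev + abs(sec - sev)
        if best is None or sad < best["sad"]:
            best = {"wavelengths": combo, "sec": sec, "sev": sev, "sad": sad}
    return best


# Synthetic "spectra" (hypothetical): 8 wavelengths, response depends on columns 2 and 5.
rng = np.random.default_rng(0)
X = rng.normal(size=(45, 8))
y = 1.0 + 2.0 * X[:, 2] - 3.0 * X[:, 5] + rng.normal(scale=0.01, size=45)
X_cal, y_cal, X_val, y_val = X[:30], y[:30], X[30:], y[30:]

best = mop_search(X_cal, y_cal, X_val, y_val, n_wavelengths=2)
print(best)
```

Because SEC + SEV + Abs(SEC - SEV) equals twice the larger of SEC and SEV, minimizing SAD amounts to minimizing the worse of the two errors, which is one way to read why the chosen model has errors that are both small and similar in scale.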

Motion estimation method using multiple linear regression model

  • 김학수;임원택;이재철;이규원;박규택
    • 전자공학회논문지S
    • /
    • Vol. 34S, No. 10
    • /
    • pp.98-103
    • /
    • 1997
  • Given the small bit allocation for motion information in very low bit-rate coding, motion estimation using the block matching algorithm (BMA) fails to maintain an acceptable level of prediction error. The reason is that the motion model, or spatial transformation, assumed in block matching cannot approximate real-world motion precisely with a small number of parameters. To overcome this drawback of the conventional block matching algorithm, several triangle-based methods that use triangular patches instead of blocks have been proposed. To estimate the motion in image sequences, these methods have usually been based on a combination of the optical flow equation, affine transforms, and iteration, but their computational cost is high. This paper presents a fast motion estimation algorithm that uses a multiple linear regression model to address the defects of the BMA and the triangle-based methods. After describing the basic 2-D triangle-based method, the details of the proposed multiple linear regression model are presented along with motion estimation results from one standard video sequence, representative of MPEG-4 class A data. The simulation results show that the proposed method improves the average PSNR by about 1.24 dB compared with the BMA method and reduces the computational cost by about 25% compared with the 2-D triangle-based method.
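
The abstract does not give the regression equations. As one illustration of how motion parameters can be obtained by linear regression rather than by iteration, the sketch below fits a six-parameter affine motion model to point correspondences by least squares. The function names and sample points are hypothetical, and this is not claimed to be the authors' formulation.

```python
import numpy as np


def fit_affine_motion(src, dst):
    """Fit dst ~ M @ [x, y, 1] by least squares.

    src, dst: (n, 2) arrays of matching points in two frames.
    Returns the 2x3 affine matrix M (two regressions, one per output coordinate).
    """
    A = np.column_stack([src, np.ones(len(src))])      # (n, 3) design matrix
    params, *_ = np.linalg.lstsq(A, dst, rcond=None)   # (3, 2) solution
    return params.T


def apply_affine(M, pts):
    """Map (n, 2) points through the 2x3 affine matrix M."""
    return pts @ M[:, :2].T + M[:, 2]


# Hypothetical example: six points inside a 16x16 patch and a known motion.
M_true = np.array([[1.02, 0.01, 1.5],
                   [-0.01, 0.98, -0.7]])
src = np.array([[0, 0], [16, 0], [0, 16], [16, 16], [8, 4], [4, 12]], dtype=float)
dst = apply_affine(M_true, src)

M_est = fit_affine_motion(src, dst)
print(np.round(M_est, 3))
```

Three non-collinear correspondences determine the six parameters exactly; with more points the fit becomes an ordinary regression that averages out measurement noise.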

Development of a Multiple Linear Regression Model to Analyze Traffic Volume Error Factors in Radar Detectors

  • Kim, Do Hoon;Kim, Eung Cheol
    • 한국측량학회지
    • /
    • Vol. 39, No. 5
    • /
    • pp.253-263
    • /
    • 2021
  • Traffic data collected using advanced equipment are highly valuable for traffic planning and efficient road operation. However, there is a problem regarding the reliability of the analysis results due to equipment defects, errors in the data aggregation process, and missing data. Unlike other detectors installed for each vehicle lane, radar detectors can yield different error types because they detect all traffic volume in multilane two-way roads via a single installation external to the roadway. For the traffic data of a radar detector to be representative of reliable data, the error factors of the radar detector must be analyzed. This study presents a field survey of variables that may cause errors in traffic volume collection by targeting the points where radar detectors are installed. Video traffic data are used to determine the errors in traffic measured by a radar detector. This study establishes three types of radar detector traffic errors, i.e., artificial, mechanical, and complex errors. Among these types, it is difficult to determine the cause of the errors due to several complex factors. To solve this problem, this study developed a radar detector traffic volume error analysis model using a multiple linear regression model. The results indicate that the characteristics of the detector, road facilities, geometry, and other traffic environment factors affect errors in traffic volume detection.

A Study on Square Pore Shape Discrimination Model of Scaffold Using Machine Learning Based Multiple Linear Regression

  • 이송연;허용정
    • 반도체디스플레이기술학회지
    • /
    • Vol. 19, No. 4
    • /
    • pp.59-64
    • /
    • 2020
  • In this paper, we use a data-based machine learning regression method to check pore shape, in order to reduce the number of experiments required when producing scaffolds with a 3D printer. Through experiments, we obtained a set of print conditions and the corresponding pore shapes and used them as training data. We built a scaffold pore shape defect prediction model using multiple linear regression. Using this model, we predicted scaffold pore shapes for print conditions that had not been tested. We randomly selected 20 of the predicted print conditions, printed the scaffold five times under each condition, and measured the pore shape. We compared the average printed pore shape with the predicted pore shape and confirmed that the precision of the prediction model is 99%.

Bayesian inference for an ordered multiple linear regression with skew normal errors

  • Jeong, Jeongmun;Chung, Younshik
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 27, No. 2
    • /
    • pp.189-199
    • /
    • 2020
  • This paper studies a Bayesian ordered multiple linear regression model with skew normal errors. The inherent information available in an applied regression often requires some constraints on the coefficients to be estimated. In addition, the assumption of normally distributed errors is sometimes not appropriate for real data. Therefore, to explain such situations more flexibly, we use the skew-normal distribution given by Sahu et al. (The Canadian Journal of Statistics, 31, 129-150, 2003) for the error terms, which includes the normal distribution. For the Bayesian methodology, the Markov chain Monte Carlo method is employed to resolve complicated integration problems. Also, under improper priors, the propriety of the associated posterior density is shown. Our proposed Bayesian model is applied to NZAPB's apple data. For model comparison between the skew normal error model and the normal error model, we use the Bayes factor and the deviance information criterion given by Spiegelhalter et al. (Journal of the Royal Statistical Society Series B (Statistical Methodology), 64, 583-639, 2002). We also consider the problem of detecting an influential point concerning skewness using Bayes factors. Finally, concluding remarks are given.

A study of Predicting International Gasoline Prices based on Multiple Linear Regression with Economic Indicators

  • 한명은;김지연;이현희;김세인;박민서
    • 문화기술의 융합
    • /
    • Vol. 10, No. 1
    • /
    • pp.159-164
    • /
    • 2024
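  • The domestic petroleum market is highly sensitive to fluctuations in international oil prices, so understanding and responding to this volatility is important. In particular, it is necessary to clearly identify the factors that drive changes in the price of gasoline, which is consumed in large volumes. International gasoline prices are affected by global factors such as gasoline supply and demand, geopolitical events, and fluctuations in the value of the US dollar. However, existing studies are limited in that they focused only on gasoline supply and demand. This study uses various machine-learning-based regression models to explore the causal relationship between macroeconomic indicators and international gasoline prices. First, data on various world economic indicators are collected. Second, the data are preprocessed. Third, modeling is performed using multiple linear regression, Ridge regression, and Lasso (Least Absolute Shrinkage and Selection Operator) regression models. In the experiments, the multiple linear regression model showed the highest accuracy (97.3%) on the test data set. We expect that predicting international gasoline prices can contribute to domestic economic stability and energy policy decisions.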
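
The three-model comparison named in this abstract can be sketched as below. The data are synthetic stand-ins for economic indicators, the penalty strengths are arbitrary, and R² on a held-out split is an assumed accuracy measure; the abstract does not state the metric or the hyperparameters used.

```python
import numpy as np


def fit_ridge(X, y, alpha):
    """Closed-form ridge on centered data; alpha=0 reduces to ordinary MLR."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w


def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly zero."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def fit_lasso(X, y, alpha, n_iter=200):
    """Coordinate descent for (1/(2n))*||y - Xw - b||^2 + alpha*||w||_1."""
    n, p = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    col_sq = (Xc ** 2).sum(axis=0) / n
    w = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            partial_resid = yc - Xc @ w + Xc[:, j] * w[j]  # residual without feature j
            rho = Xc[:, j] @ partial_resid / n
            w[j] = soft_threshold(rho, alpha) / col_sq[j]
    return w, y_mean - x_mean @ w


def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Synthetic indicators (hypothetical): 5 features, only the first 3 matter.
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=120)
X_train, y_train, X_test, y_test = X[:90], y[:90], X[90:], y[90:]

models = {
    "MLR": fit_ridge(X_train, y_train, alpha=0.0),
    "Ridge": fit_ridge(X_train, y_train, alpha=1.0),
    "Lasso": fit_lasso(X_train, y_train, alpha=0.05),
}
scores = {name: r2_score(y_test, X_test @ w + b) for name, (w, b) in models.items()}
print(scores)
```

When the predictors are informative and not strongly collinear, the unpenalized model can score as well as or better than the penalized ones, since Ridge and Lasso trade a little bias for lower variance.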

MULTIPLE OUTLIER DETECTION IN LOGISTIC REGRESSION BY USING INFLUENCE MATRIX

  • Lee, Gwi-Hyun;Park, Sung-Hyun
    • Journal of the Korean Statistical Society
    • /
    • Vol. 36, No. 4
    • /
    • pp.457-469
    • /
    • 2007
  • Many procedures are available to identify a single outlier or an isolated influential point in linear regression and logistic regression. But the detection of influential points or multiple outliers is more difficult, owing to masking and swamping problems. Multiple outlier detection methods for logistic regression have not yet been studied from the standpoint of direct procedures. In this paper we consider direct methods for logistic regression by extending the Peña and Yohai (1995) influence matrix algorithm. We define the influence matrix in logistic regression by using Cook's distance in logistic regression, and test multiple outliers by using the mean shift model. To show the accuracy of the proposed multiple outlier detection algorithm, we simulate artificial data including multiple outliers with masking and swamping.

A model to characterize the effect of particle size of fly ash on the mechanical properties of concrete by the grey multiple linear regression

  • Cui, Yunpeng;Liu, Jun;Wang, Licheng;Liu, Runqing;Pang, Bo
    • Computers and Concrete
    • /
    • Vol. 26, No. 2
    • /
    • pp.175-183
    • /
    • 2020
  • With the development of concrete technology, fly ash has become an important component of concrete as a supplementary cementitious material. To make efficient use of fly ash, four types of fly ash were prepared with particle size distributions conforming to four functions, namely the S.Tsivilis, Andersen, Normal, and F distributions. The strength and pore structure of concrete made with the four particle size distributions were then investigated. The results showed that the compressive and flexural strength of concrete with fly ash conforming to the S.Tsivilis, Normal, and F distributions increased by 5-10 MPa and 1-2 MPa, respectively, compared to the reference sample at 28 d. The pore structure of the concrete was improved, with the total porosity decreasing by 2-5% at 28 d. Fly ash with the Andersen distribution, however, was not conducive to the strength development of concrete. A regression model based on grey multiple linear regression theory proved efficient for predicting the strength of concrete from the characteristic parameters of the particle size and pore structure of the fly ash.

A Case Study on the Improvement of Display FAB Production Capacity Prediction

  • 길준필;최진영
    • 산업경영시스템학회지
    • /
    • Vol. 43, No. 2
    • /
    • pp.137-145
    • /
    • 2020
  • Various elements of fabrication (FAB), such as mass production of existing products, new product development, and process improvement evaluation, can increase the complexity of the production process when products are produced at the same time. As a result, complex production operations make it difficult to predict the production capacity of facilities. In this environment, production forecasting provides the basic information used for production planning, preventive maintenance, yield management, and new product development. In this paper, we develop a multiple linear regression model to improve the existing production capacity forecasting method, which estimates production capacity using a simple trend analysis over short time periods. Specifically, we define the overall equipment effectiveness of a facility as the performance measure representing production capacity. We then consider the production capacities of interrelated facilities in the FAB production process over the past several weeks as independent regression variables, in order to reflect the impact of facility maintenance cycles and production sequences. By applying variable selection methods and selecting only the significant variables, we develop a multiple linear regression forecasting model. Through a numerical experiment, we show the superiority of the proposed method, which obtained a mean residual error of 3.98% and improved on the previous method by 7.9%.
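
One way to picture the regression variables described here is a design matrix of lagged weekly values from several facilities. The sketch below is only an assumption-laden illustration: the facility count, lag count, synthetic overall equipment effectiveness (OEE) series, and error measure are hypothetical, and the variable selection step mentioned in the abstract is omitted.

```python
import numpy as np


def lagged_design(series, n_lags):
    """Build lagged predictors from a (n_weeks, n_facilities) array.

    Each output row corresponds to one week t >= n_lags and contains all
    facilities at week t-1, then all facilities at week t-2, and so on.
    """
    n_weeks = series.shape[0]
    rows = [series[t - n_lags:t][::-1].ravel() for t in range(n_lags, n_weeks)]
    return np.array(rows)


# Synthetic weekly OEE for 3 facilities (hypothetical): facility 0 depends on
# facility 1 one week earlier and facility 2 two weeks earlier.
rng = np.random.default_rng(2)
n_weeks, n_lags = 60, 2
oee = np.full((n_weeks, 3), 0.75)
oee[:, 1] = rng.uniform(0.6, 0.9, n_weeks)
oee[:, 2] = rng.uniform(0.6, 0.9, n_weeks)
for t in range(2, n_weeks):
    oee[t, 0] = 0.2 + 0.5 * oee[t - 1, 1] + 0.3 * oee[t - 2, 2] + rng.normal(scale=0.005)

X = lagged_design(oee, n_lags)                 # columns: lag 1 (f0, f1, f2), lag 2 (f0, f1, f2)
y = oee[n_lags:, 0]                            # target: facility 0 in the current week
A = np.column_stack([np.ones(len(X)), X])      # add intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# In-sample mean absolute percentage error of the fitted model.
mape = float(np.mean(np.abs(y - A @ coef) / y) * 100)
print(np.round(coef, 3), round(mape, 3))
```

In a real setting the error would be measured on weeks held out from fitting, and a selection step would drop lag columns whose coefficients are not significant.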

Evaluation of applicability of pan coefficient estimation method by multiple linear regression analysis

  • 임창수
    • 한국수자원학회논문집
    • /
    • Vol. 55, No. 3
    • /
    • pp.229-243
    • /
    • 2022
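  • We analyzed the effect of monthly meteorological data from 11 weather observation stations in Korea on the pan coefficient, and examined the applicability of four types of multiple linear regression models for estimating the pan coefficient. To evaluate the applicability of the developed pan coefficient estimation models, they were compared with six models previously proposed by other researchers. At the 11 stations, the pan coefficient was most strongly affected by air temperature in January, February, March, July, November, and December, and by solar radiation in the other months. Overall, in all months, wind speed and relative humidity did not have a large effect on the pan coefficient compared with air temperature or solar radiation. For all stations and months, the model derived for each station using five independent variables (wind speed, relative humidity, air temperature, ratio of sunshine duration to possible sunshine duration, and solar radiation) gave the best evaporation estimates. According to the model verification results, estimating the pan coefficient by multiple linear regression analysis appears to be applicable only to a limited extent, in some regions and months.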