• Title/Summary/Keyword: Multiple regression model

Typhoon Path and Prediction Model Development for Building Damage Ratio Using Multiple Regression Analysis (태풍타입별 피해 분석 및 다중회귀분석을 활용한 태풍피해예측모델 개발 연구)

  • Yang, Seong-Pil;Son, Kiyoung;Lee, Kyoung-Hun;Kim, Ji-Myong
    • Journal of the Korea Institute of Building Construction
    • /
    • v.16 no.5
    • /
    • pp.437-445
    • /
    • 2016
  • Because typhoons are a critical meteorological disaster, some advanced countries have developed typhoon damage prediction models. However, although South Korea is vulnerable to typhoons, there is still a shortage of studies on typhoon damage prediction models that reflect the vulnerability of domestic buildings and the features of the disaster. Moreover, many studies have focused only on typhoon characteristics and regional characteristics, without considering other influencing factors. Therefore, the objective of this study is to analyze typhoon damage by path and to develop a prediction model for the building damage ratio using multiple regression analysis. This study classifies building damage by typhoon path to identify influencing factors, and then conducts a correlation analysis between the building damage ratio and those factors. In addition, multiple regression analysis is applied to develop a typhoon damage prediction model. Four categories of independent variables are used: typhoon information, geography, construction environment, and socio-economy. The results of this study will be used as fundamental material for developing a typhoon damage prediction model for South Korea.

Optimum Model for Analyzing Lifetime Profitability of Holstein Cows

  • Shadparvar, A.A.;Nikbin, S.
    • Asian-Australasian Journal of Animal Sciences
    • /
    • v.21 no.6
    • /
    • pp.769-775
    • /
    • 2008
  • This study was on the relative net income (RNI) for 18,286 Iranian Holstein cows from 799 herds, with first freshening between 1991 and 2000. Two kinds of production system, which differed mainly in milk pricing system and feed cost, were considered. Four different models adopted from the literature were examined to find the optimum model. They differed by the cost of rearing and growth after first calving and they needed different amounts of economic data at the farm level. Results showed that four measures of RNI were highly correlated (>0.96) and could be used equally to measure lifetime profitability of cows. Therefore, in herds without a regular system for recording economic and management data, use of the simplest model is recommended. Multiple regression analysis revealed that RNI was affected by age at first freshening, milk yield and days of productive life (DPL), regardless of production system, and a similar breeding goal could be defined for the two systems. Multiple regression analysis of RNI showed that in order to obtain an unbiased estimate of economic value for DPL, the per day milk yield, not total lifetime milk yield, should be included in the regression model along with DPL. Regression analysis suggested that it is possible to predict RNI using information on age at first freshening along with the length of first lactation and per day milk yield with a coefficient of determination ranging from 0.44 to 0.47.

Simultaneous outlier detection and variable selection via difference-based regression model and stochastic search variable selection

  • Park, Jong Suk;Park, Chun Gun;Lee, Kyeong Eun
    • Communications for Statistical Applications and Methods
    • /
    • v.26 no.2
    • /
    • pp.149-161
    • /
    • 2019
  • In this article, we suggest the following approach to simultaneous variable selection and outlier detection. First, we determine possible candidates for outliers using properties of an intercept estimator in a difference-based regression model, and the information on outliers is reflected in the multiple regression model by adding mean shift parameters. Second, we select the best model from the model that includes the outlier candidates as predictors, using stochastic search variable selection. Finally, we evaluate our method using simulations and real data analysis, which yield promising results. In addition, we need to develop our method further to make robust estimates. We will also extend it to the nonparametric regression model for simultaneous outlier detection and variable selection.

An Incremental Regression Model for Time Series Data Prediction (시계열 데이터 예측을 위한 점진적인 회귀분석 모델)

  • Kim Sung-Hyun;Lee Yong-Mi;Jin Long;Seo Sung-Bo;Ryu Keun-Ho
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2006.05a
    • /
    • pp.23-26
    • /
    • 2006
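  • Among existing data mining prediction techniques, regression analysis applies the model generated in the training phase to new data without modification. However, when the same model is applied unchanged to time series data, its accuracy declines over time. This paper therefore proposes a technique that incrementally updates the regression model, taking into account the time-varying nature of time series data. The technique applies every incoming data point to the regression model and updates the model incrementally. The validity of the proposed technique was measured using RME (Relative Mean Error) and RMSE (Root Mean Square Error). In the accuracy experiments, the proposed IMQR (Incremental Multiple Quadratic Regression) technique outperformed the MLR (Multiple Linear Regression), MQR (Multiple Quadratic Regression), and SVR (Support Vector Regression) techniques by about 2% in RME and about 0.02 in RMSE on average.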
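
The abstract does not say how the model is updated, so the following is only a minimal sketch of one common way to refresh a quadratic regression model one observation at a time: recursive least squares on quadratic features. The class name `IncrementalQuadraticRegression`, the helper `quadratic_features`, and the `delta` parameter are hypothetical names chosen for illustration; this is not the authors' IMQR algorithm.

```python
import numpy as np

def quadratic_features(x):
    """Map an input vector to [1, x_i ..., x_i * x_j for i <= j]."""
    x = np.asarray(x, dtype=float)
    cross = [x[i] * x[j] for i in range(len(x)) for j in range(i, len(x))]
    return np.concatenate(([1.0], x, cross))

class IncrementalQuadraticRegression:
    """Recursive least squares on quadratic features (illustrative only)."""

    def __init__(self, n_inputs, delta=1e6):
        n_feat = 1 + n_inputs + n_inputs * (n_inputs + 1) // 2
        self.w = np.zeros(n_feat)          # current coefficient estimate
        self.P = np.eye(n_feat) * delta    # large value = weak prior on w

    def update(self, x, y):
        """Fold one new observation (x, y) into the model."""
        phi = quadratic_features(x)
        P_phi = self.P @ phi
        gain = P_phi / (1.0 + phi @ P_phi)
        self.w += gain * (y - phi @ self.w)   # correct by the prediction error
        self.P -= np.outer(gain, P_phi)       # shrink uncertainty

    def predict(self, x):
        return float(quadratic_features(x) @ self.w)
```

Each call to `update` costs time proportional to the square of the number of features, so the model can follow a stream without refitting on all past data. A forgetting factor could be added to weight recent observations more heavily, but that is a design choice the abstract does not describe.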

A study of Predicting International Gasoline Prices based on Multiple Linear Regression with Economic Indicators (경제지표를 활용한 다중선형회귀 모델 기반 국제 휘발유 가격 예측)

  • Myeongeun Han;Jiyeon Kim;Hyunhee Lee;Sein Kim;Minseo Park
    • The Journal of the Convergence on Culture Technology
    • /
    • v.10 no.1
    • /
    • pp.159-164
    • /
    • 2024
  • The domestic petroleum market is highly sensitive to changes in international oil prices, so it is important to identify and respond to those changes. In particular, it is necessary to clearly understand the factors behind price fluctuations of gasoline, which is consumed in large volumes. International gasoline prices are influenced by global factors such as gasoline supplies, geopolitical events, and fluctuations in the U.S. dollar. However, previous studies have focused only on gasoline supplies. In this study, we explore the causal relationship between economic indicators and international gasoline prices using various machine learning-based regression models. First, we collect data on various global economic indicators. Second, we perform data preprocessing. Third, we build models using multiple linear regression, Ridge regression, and Lasso (Least Absolute Shrinkage and Selection Operator) regression. The multiple linear regression model showed the highest accuracy, 96.73%, on the test sets. We expect that our proposed model will be helpful for domestic economic stability and energy policy decisions.

Development of the Algorithm for Optimizing Wavelength Selection in Multiple Linear Regression

  • Hoeil Chung
    • Near Infrared Analysis
    • /
    • v.1 no.1
    • /
    • pp.1-7
    • /
    • 2000
  • A convenient algorithm for optimizing wavelength selection in multiple linear regression (MLR) has been developed. MOP (MLR Optimization Program) has been developed to test all possible MLR calibration models in a given spectral range and finally find an optimal MLR model with external validation capability. MOP generates all calibration models from all possible combinations of wavelengths, and simultaneously calculates SEC (Standard Error of Calibration) and SEV (Standard Error of Validation) by predicting samples in a validation data set. Finally, with the determined SEC and SEV, it calculates another parameter called SAD (the sum of SEC, SEV, and the absolute difference between SEC and SEV: SAD = SEC + SEV + Abs(SEC - SEV)). SAD is a useful parameter for finding an optimal calibration model without over-fitting, because it simultaneously evaluates SEC, SEV, and the difference in error between calibration and validation. The calibration model corresponding to the smallest SAD value is chosen as the optimum because its errors in both calibration and validation are minimal as well as similar in scale. To evaluate the capability of MOP, the determination of benzene content in unleaded gasoline has been examined. MOP successfully found the optimal calibration model and showed better calibration and independent prediction performance than conventional MLR calibration.
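
The search described in this abstract can be sketched as follows. This is a hedged illustration, not the MOP program itself: the function names are hypothetical, and the exact SEC and SEV formulas are assumptions (SEC divides by n - k - 1 degrees of freedom, SEV by n), since the abstract does not define them. Only the SAD expression comes from the abstract.

```python
from itertools import combinations
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    return coef[0] + X @ coef[1:]

def best_wavelengths_by_sad(X_cal, y_cal, X_val, y_val, n_select):
    """Try every combination of n_select columns and keep the lowest SAD."""
    best = None
    for cols in combinations(range(X_cal.shape[1]), n_select):
        idx = list(cols)
        coef = fit_mlr(X_cal[:, idx], y_cal)
        res_cal = y_cal - predict_mlr(coef, X_cal[:, idx])
        res_val = y_val - predict_mlr(coef, X_val[:, idx])
        # Assumed definitions: SEC uses n - k - 1 degrees of freedom, SEV uses n.
        sec = np.sqrt(res_cal @ res_cal / (len(y_cal) - n_select - 1))
        sev = np.sqrt(res_val @ res_val / len(y_val))
        sad = sec + sev + abs(sec - sev)   # SAD as given in the abstract
        if best is None or sad < best[0]:
            best = (sad, cols, sec, sev)
    return best
```

Because SEC + SEV + |SEC - SEV| equals twice the larger of the two errors, minimizing SAD amounts to minimizing the worse of the calibration and validation errors. The number of combinations grows quickly with the number of wavelengths, so an exhaustive search like this is practical only for a modest spectral range.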

Motion estimation method using multiple linear regression model (다중선형회귀모델을 이용한 움직임 추정방법)

  • 김학수;임원택;이재철;이규원;박규택
    • Journal of the Korean Institute of Telematics and Electronics S
    • /
    • v.34S no.10
    • /
    • pp.98-103
    • /
    • 1997
  • Given the small bit allocation for motion information in very low bit-rate coding, motion estimation using the block matching algorithm (BMA) fails to maintain an acceptable level of prediction errors. The reason is that the motion model, or spatial transformation, assumed in block matching cannot approximate the motion in the real world precisely with a small number of parameters. In order to overcome the drawback of the conventional block matching algorithm, several triangle-based methods which utilize triangular patches instead of blocks have been proposed. To estimate the motions of image sequences, these methods have usually been based on a combination of the optical flow equation, affine transform, and iteration. But the computational cost of these methods is high. This paper presents a fast motion estimation algorithm using a multiple linear regression model to solve the defects of the BMA and the triangle-based methods. After describing the basic 2-D triangle-based method, the details of the proposed multiple linear regression model are presented along with the motion estimation results from one standard video sequence, representative of MPEG-4 class A data. The simulation results show that in the proposed method, the average PSNR is improved by about 1.24 dB in comparison with the BMA method, and the computational cost is reduced by about 25% in comparison with the 2-D triangle-based method.

Development of a Multiple Linear Regression Model to Analyze Traffic Volume Error Factors in Radar Detectors

  • Kim, Do Hoon;Kim, Eung Cheol
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.39 no.5
    • /
    • pp.253-263
    • /
    • 2021
  • Traffic data collected using advanced equipment are highly valuable for traffic planning and efficient road operation. However, there is a problem regarding the reliability of the analysis results due to equipment defects, errors in the data aggregation process, and missing data. Unlike other detectors installed for each vehicle lane, radar detectors can yield different error types because they detect all traffic volume in multilane two-way roads via a single installation external to the roadway. For the traffic data of a radar detector to be representative of reliable data, the error factors of the radar detector must be analyzed. This study presents a field survey of variables that may cause errors in traffic volume collection by targeting the points where radar detectors are installed. Video traffic data are used to determine the errors in traffic measured by a radar detector. This study establishes three types of radar detector traffic errors, i.e., artificial, mechanical, and complex errors. Among these types, it is difficult to determine the cause of the errors due to several complex factors. To solve this problem, this study developed a radar detector traffic volume error analysis model using a multiple linear regression model. The results indicate that the characteristics of the detector, road facilities, geometry, and other traffic environment factors affect errors in traffic volume detection.

A New Deletion Criterion of Principal Components Regression with Orientations of the Parameters

  • Lee, Won-Woo
    • Journal of the Korean Statistical Society
    • /
    • v.16 no.2
    • /
    • pp.55-70
    • /
    • 1987
  • Principal components regression is one of the substitutes for the least squares method when there exists multicollinearity in the multiple linear regression model. It is observed graphically that the performance of principal components regression is strongly dependent upon the values of the parameters. Accordingly, a new deletion criterion which determines the proper principal components to be deleted from the analysis is developed, and its usefulness is checked by simulations.
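
For readers unfamiliar with the technique, a minimal sketch of principal components regression follows. It uses the simplest deletion rule, keeping the `n_keep` components with the largest singular values, which is an assumption made for illustration; the paper's new deletion criterion is not described in the abstract and is not reproduced here. The function name `pcr_fit` is hypothetical.

```python
import numpy as np

def pcr_fit(X, y, n_keep):
    """Principal components regression keeping the top n_keep components."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean              # center predictors and response
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:n_keep].T                          # directions of retained components
    Z = Xc @ V_k                                 # component scores
    gamma = (Z.T @ yc) / (s[:n_keep] ** 2)       # OLS on mutually orthogonal scores
    beta = V_k @ gamma                           # back to original predictors
    intercept = y_mean - x_mean @ beta
    return intercept, beta
```

When `n_keep` equals the number of predictors, the result coincides with ordinary least squares. Deleting components with small singular values trades some bias for lower variance under multicollinearity, and the choice of which components to delete is exactly what a deletion criterion decides.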

Construction of Urban Crime Prediction Model based on Census Using GWR (GWR을 이용한 센서스 기반 도시범죄 특성 분석 및 예측모델 구축)

  • YOO, Young-Woo;BAEK, Tae-Kyung
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.20 no.4
    • /
    • pp.65-76
    • /
    • 2017
  • The purpose of this study was to present a prediction model that reflects crime risk area analysis, including factors and spatial characteristics, as a precursor to preparing an alternative plan for crime prevention and design. This analysis of criminal cases in high-risk areas revealed clusters in which approximately 25% of the cases within the study area occurred, distributed evenly throughout the region. This means that using a multiple linear regression model might overestimate the crime rate in some regions and underestimate in others. It also suggests that the number of deserted houses in an analyzed region has a negative relationship with the dependent variable, based on the multiple linear regression model results, and can also have different influences depending on the region. These results reveal that closure signs in a study area affect the dependent variable differently, depending on the region, rather than a simple or direct relationship with the dependent variable, as indicated by the results of the multiple linear regression model.