• Title/Summary/Keyword: Multiple regression analysis

A Comparison of Construction Cost Estimation Using Multiple Regression Analysis and Neural Network in Elementary School Project

  • Cho, Hong-Gyu;Kim, Kyong-Gon;Kim, Jang-Young;Kim, Gwang-Hee
    • Journal of the Korea Institute of Building Construction / v.13 no.1 / pp.66-74 / 2013
  • In the early stages of a construction project, the most important task is to predict construction costs rationally. For this reason, many studies have estimated early-stage construction costs for apartment housing and office buildings using artificial intelligence, statistics, and similar methods. In this study, cost data held by a provincial Office of Education on elementary schools constructed from 2004 to 2007 were used to compare a multiple regression model with an artificial neural network model. A total of 96 historical records were divided into 76 for constructing the models and 20 for comparing the regression model with the artificial neural network model. The analysis of predicted construction costs showed that the error rate of the artificial neural network model was lower than that of the multiple regression model.
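The 76/20 hold-out comparison described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the predictors (floor area, number of storeys), the synthetic cost data, and the use of mean absolute percentage error as the "error rate" are all assumptions, and only the multiple regression side is shown.

```python
import random

def fit_ols(X, y):
    """Least-squares coefficients [b0, b1, ...] via the normal equations."""
    rows = [[1.0] + list(x) for x in X]  # prepend an intercept column
    k = len(rows[0])
    # Augmented system (A^T A | A^T y).
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]

def predict(coef, x):
    """Apply the fitted equation to one record."""
    return coef[0] + sum(b * v for b, v in zip(coef[1:], x))

def mape(actual, predicted):
    """Mean absolute percentage error (assumed error-rate definition)."""
    return 100.0 * sum(abs(a - p) / abs(a)
                       for a, p in zip(actual, predicted)) / len(actual)

random.seed(0)
# Synthetic stand-in for the 96 school records (hypothetical predictors).
X = [(random.uniform(3000, 12000), random.randint(2, 5)) for _ in range(96)]
y = [500.0 + 1.2 * area + 300.0 * storeys + random.gauss(0, 200)
     for area, storeys in X]

# 76 records to build the model, 20 held out for comparison.
X_train, y_train = X[:76], y[:76]
X_test, y_test = X[76:], y[76:]

coef = fit_ols(X_train, y_train)
test_error = mape(y_test, [predict(coef, x) for x in X_test])
print(len(X_train), len(X_test), round(test_error, 2))
```

A neural network model would be scored on the same 20 held-out records with the same `mape` function, so that the two error rates are directly comparable.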

Deletion diagnostics in fitting a given regression model to a new observation

  • Kim, Myung Geun
    • Communications for Statistical Applications and Methods / v.23 no.3 / pp.231-239 / 2016
  • A graphical diagnostic method based on multiple case deletions in a regression context is introduced, using the sampling distribution of the difference between the two least squares estimators computed with and without the deleted cases. Principal components analysis plays a key role in deriving this diagnostic method. Multiple case deletions for the test statistic are also considered when a new observation is fitted to a given regression model. The result is useful for detecting influential observations in econometric data analysis, for example in checking whether the consumption pattern at a later time is the same as the one found earlier, as well as for investigating the influence of cases in the usual regression model. An illustrative example is given.

Comparative Study of Age Estimation Accuracy in Gustafson's Method and Prediction Formula by Multiple Regression (다변인회귀분석법과 Gustafson 방법에 의한 연령감정 정확도의 비교연구)

  • 곽경환;김종열
    • Journal of Oral Medicine and Pain / v.10 no.1 / pp.73-89 / 1985
  • This study examined 157 extracted teeth, 73 from males and 84 from females, with an age range of 12-79 years. The correlation coefficient of each of Gustafson's criteria with age was calculated. Age was estimated for the 157 teeth both by Gustafson's method and by multiple regression, as used by Johanson, after evaluating Gustafson's six criteria by computer-based multiple regression analysis. The two prediction formulas and their standard deviations were compared. The results were as follows: 1. Gustafson's six criteria correlated strongly with age, except for root resorption; the correlation coefficients were r = 0.79 (transparent dentin), r = 0.72 (secondary dentin), r = 0.69 (periodontal change), r = 0.63 (attrition), and r = 0.39 (root resorption). 2. The age estimation formula by Gustafson's method was Y = 8.88 + 3.52X (r = 0.87, $r^2$ = 0.76, SD = 8.18, F = 483.56, P < 0.01). The age estimation formula by multiple regression was Y = 8.57 + 6.37T + 4.63P + 2.70S + 2.40C + 3.08A + 1.34R (r = 0.89, $r^2$ = 0.78, SD = 7.82, F = 91.62, P < 0.01, Durbin-Watson coefficient = 1.09). 3. Of the two estimation formulas, the multiple regression formula (Johanson's method, SD = 7.82) was found to be slightly more reliable than Gustafson's method (SD = 8.18). 4. It was reaffirmed that Gustafson's six criteria can serve as independent variables in multiple regression analysis.
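As a worked illustration of the two age estimation formulas reported above, the sketch below evaluates both for one hypothetical tooth. It assumes each of the six criteria enters the multiple regression formula once, that X is the sum of the six criterion scores, and that C denotes cementum apposition (the abstract does not name it). The scores are invented for the example.

```python
def age_gustafson(total_score):
    """Gustafson-type formula from the abstract: Y = 8.88 + 3.52 X."""
    return 8.88 + 3.52 * total_score

def age_multiple_regression(T, P, S, C, A, R):
    """Multiple regression formula from the abstract.

    T: transparent dentin, P: periodontal change, S: secondary dentin,
    C: assumed to be cementum apposition, A: attrition, R: root resorption.
    """
    return (8.57 + 6.37 * T + 4.63 * P + 2.70 * S
            + 2.40 * C + 3.08 * A + 1.34 * R)

# Hypothetical tooth with a score of 1 on every criterion.
scores = {"T": 1, "P": 1, "S": 1, "C": 1, "A": 1, "R": 1}
total = sum(scores.values())  # X = 6

print(round(age_gustafson(total), 2))                 # 30.0
print(round(age_multiple_regression(**scores), 2))    # 29.09
```

With reported standard deviations of 8.18 and 7.82 years, the two estimates for this hypothetical tooth are well within each other's error range.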

Assessment of slope stability using multiple regression analysis

  • Marrapu, Balendra M.;Jakka, Ravi S.
    • Geomechanics and Engineering / v.13 no.2 / pp.237-254 / 2017
  • Estimation of slope stability is a very important task in geotechnical engineering. However, its estimation using conventional and soft computing methods has several drawbacks. Conventional limit equilibrium methods for evaluating slope stability are tedious and time consuming, while soft computing approaches such as Artificial Neural Networks and Fuzzy Logic are black box approaches. Multiple Regression (MR) analysis provides an alternative to conventional and soft computing methods for the evaluation of slope stability. MR models provide a simplified equation that can be used to calculate the critical factor of safety of slopes without any iterative procedure, thereby reducing the time and complexity involved in the evaluation of slope stability. In the present study, a multiple regression model has been developed and its accuracy in estimating slope stability tested using real field data. Two separate multiple regression models have been developed, for dry and wet slopes. Further, the accuracy of these models has been compared and validated against conventional limit equilibrium methods in terms of Mean Square Error (MSE) and coefficient of determination ($R^2$). As the developed MR models are not based on any region-specific data and cover a wide range of parametric variations, they can be directly applied to any real slopes.
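The two validation metrics named above, MSE and $R^2$, can be computed as in the following minimal sketch. The factor-of-safety values are invented for illustration and do not come from the paper.

```python
def mse(observed, predicted):
    """Mean square error between reference and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical factors of safety: limit equilibrium (reference) vs. MR model.
fos_limit_equilibrium = [1.0, 1.2, 1.5, 2.0]
fos_regression = [1.1, 1.2, 1.4, 2.1]

print(round(mse(fos_limit_equilibrium, fos_regression), 4))        # 0.0075
print(round(r_squared(fos_limit_equilibrium, fos_regression), 4))  # 0.9471
```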

A Study on Factors Affecting the Use of Ambulatory Physician Services (의사방문수 결정요인 분석)

  • 박현애;송건용
    • Health Policy and Management / v.4 no.2 / pp.58-76 / 1994
  • To study factors affecting the use of ambulatory physician services, Andersen's model of health utilization was modified by adding a health behavior component and examined with three different approaches: a multiple regression model, a logistic regression model, and a LISREL model. For the multiple regression, the dependent variable was the reported number of illness-related visits to a physician during the past year, and the independent variables were various measures of the predisposing factor, enabling factor, need factor, and health behavior. For the logistic regression, the dependent variable was visit or no visit to a physician during the past year, and the independent variables were the same as in the multiple regression analysis. For the LISREL model, five endogenous variables (health utilization, predisposing factor, enabling factor, need factor, and health behavior) and 20 exogenous variables measuring the five endogenous variables were used. According to the multiple regression analysis, chronic illness, health status, and perceived health status (need factor); residence, sex, age, marital status, and education (predisposing factor); and health insurance and usual source of medical care (enabling factor) were the significant explanatory variables for health utilization. In the logistic regression analysis, health status, chronic illness, residence, marital status, education, drinking, and use of health aids were found to be significant explanatory variables. In the LISREL model, the need factor affected utilization most, followed by the predisposing factor, enabling factor, and health behavior. The significant observed variables for each theoretical variable in the LISREL model were age, education, and residence for the predisposing factor; health status, chronic illness, and perceived health status for the need factor; medical insurance for the enabling factor; and engaging in any kind of health behavior for health behavior.

Traffic Accident Models of 3-Legged Signalized Intersections in the Case of Cheongju (3지 신호교차로의 교통사고 발생모형 - 청주시를 사례로 -)

  • Park, Byung-Ho;Han, Sang-Uk;Kim, Tae-Young
    • Journal of the Korean Society of Safety / v.24 no.2 / pp.94-99 / 2009
  • This study deals with traffic accidents at 3-legged signalized intersections in Cheongju. The goals are to analyze the geometric, traffic, and operational conditions of the intersections and to develop various functional forms that predict accidents. The models are developed through correlation analysis and multiple linear, multiple nonlinear, Poisson, and negative binomial regression analysis. In this study, two multiple linear, two multiple nonlinear, and two negative binomial regression models were calibrated. All of these models were found to be statistically significant. All the models include two common variables (traffic volume and lane width) as well as model-specific variables. These variables are therefore judged to be critical to accident reduction in Cheongju.

Multivariate Statistical Analysis and Prediction for the Flash Points of Binary Systems Using Physical Properties of Pure Substances (순수 성분의 물성 자료를 이용한 2성분계 혼합물의 인화점에 대한 다변량 통계 분석 및 예측)

  • Lee, Bom-Sock;Kim, Sung-Young
    • Journal of the Korean Institute of Gas / v.11 no.3 / pp.13-18 / 2007
  • Multivariate statistical analysis using multiple linear regression (MLR) has been applied to analyze and predict the flash points of binary systems. Predicting the flash points of flammable substances is important for examining fire and explosion hazards in chemical process design. In this paper, the flash points are predicted by MLR based on the physical properties of pure substances and experimental flash point data. The results of regression and prediction by MLR are compared with the values calculated by Raoult's law and the Van Laar equation.

A Study on Forecast of Oyster Production using Time Series Models (시계열모형을 이용한 굴 생산량 예측 가능성에 관한 연구)

  • Nam, Jong-Oh;Noh, Seung-Guk
    • Ocean and Polar Research / v.34 no.2 / pp.185-195 / 2012
  • This paper focuses on forecasting the short-term production of oysters farmed in Korea, which shows distinct yearly periodicity and different production levels by month. To forecast short-term oyster production, the paper uses monthly data (260 observations) from January 1990 to August 2011 and adopts several econometric methods: the Multiple Regression Analysis Model (MRAM), the Seasonal Autoregressive Integrated Moving Average (SARIMA) model, and the Vector Error Correction Model (VECM). First, the short-term oyster production forecast by the multiple regression analysis model was 1,337 tons, with a prediction error of 246 tons. Second, the oyster production forecast by the SARIMA I and II models was 12,423 tons and 12,442 tons, with prediction errors of 11,404 tons and 11,423 tons, respectively. Third, the oyster production estimated by the VECM was 10,425 tons, with a prediction error of 9,406 tons. In conclusion, based on the Theil inequality coefficient criterion, the short-term prediction of oyster production by the VECM exhibited a better fit than those by the SARIMA I and II models and the Multiple Regression Analysis Model.
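The Theil inequality coefficient used as the comparison criterion above can be sketched as follows. The abstract does not say which form of the coefficient was used; this sketch assumes the common form bounded between 0 (perfect forecast) and 1, and the production figures are invented.

```python
import math

def theil_u(actual, forecast):
    """Theil inequality coefficient (assumed bounded form).

    U = RMSE(forecast, actual) / (RMS(forecast) + RMS(actual)),
    where 0 means a perfect forecast and values near 1 mean a poor one.
    """
    n = len(actual)
    rmse = math.sqrt(sum((f - a) ** 2 for f, a in zip(forecast, actual)) / n)
    rms_forecast = math.sqrt(sum(f ** 2 for f in forecast) / n)
    rms_actual = math.sqrt(sum(a ** 2 for a in actual) / n)
    return rmse / (rms_forecast + rms_actual)

# Hypothetical monthly production (tons) and two competing forecasts.
actual = [1000.0, 1200.0, 900.0, 1100.0]
forecast_a = [1050.0, 1150.0, 950.0, 1080.0]
forecast_b = [1400.0, 800.0, 1300.0, 700.0]

u_a = theil_u(actual, forecast_a)
u_b = theil_u(actual, forecast_b)
print(u_a < u_b)  # the forecast with the smaller U fits better
```

Under this criterion, models are ranked by U alone, so a model with the lowest coefficient is preferred regardless of how its point forecast for a single month compares.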

Flash Point Measurement of n-Propanol+n-Hexanol and n-Butanol+n-Hexanol Systems Using Seta Flash Closed Cup Tester (Seta Flash 밀폐식 장치를 이용한 n-Propanol+n-Hexanol계와 n-Butanol+n-Hexanol계의 인화점 측정)

  • Ha, Dong-Myeong;Lee, Sungjin
    • Journal of the Korean Society of Safety / v.34 no.1 / pp.34-39 / 2019
  • Flash point is an important indicator for determining the fire and explosion hazards of liquid solutions. In this study, flash points of the n-propanol+n-hexanol and n-butanol+n-hexanol systems were measured with a Seta flash tester. Methods based on the UNIFAC equation and on multiple regression analysis were used to calculate the flash points, and the calculated flash points were compared with the experimental ones. The absolute average errors (AAE) of flash points calculated by the UNIFAC equation are $2.9^{\circ}C$ and $0.6^{\circ}C$ for n-propanol+n-hexanol and n-butanol+n-hexanol, respectively. The absolute average errors of flash points calculated by multiple regression analysis are $0.5^{\circ}C$ and $0.2^{\circ}C$ for n-propanol+n-hexanol and n-butanol+n-hexanol, respectively. As the AAE values show, the values calculated by multiple regression analysis are closer to the experimental flash points than those calculated by the method based on the UNIFAC equation.
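The absolute average error used to compare the two calculation methods above is assumed here to be the mean of the absolute differences between experimental and calculated flash points. A minimal sketch with invented temperatures:

```python
def absolute_average_error(experimental, calculated):
    """AAE = (1/N) * sum(|T_exp - T_calc|), in the same unit as the inputs."""
    return sum(abs(e - c) for e, c in zip(experimental, calculated)) / len(experimental)

# Hypothetical flash points (deg C) at three mixture compositions.
t_experimental = [30.0, 35.0, 41.0]
t_calculated = [30.5, 34.5, 41.5]

print(absolute_average_error(t_experimental, t_calculated))  # 0.5
```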

Multivariate statistical analysis of the comparative antioxidant activity of the total phenolics and tannins in the water and ethanol extracts of dried goji berry (Lycium chinense) fruits

  • Kim, Joo-Shin;Kimm, Haklin Alex
    • Korean Journal of Food Science and Technology / v.51 no.3 / pp.227-236 / 2019
  • Antioxidant activity in water and ethanol extracts of dried Lycium chinense fruit, attributed to the total phenolic and tannin content, was measured using a number of chemical and biochemical assays for radical scavenging and inhibition of lipid peroxidation, and the analysis was extended by applying a bootstrapping statistical method. Previous statistical analyses mostly provided linear correlation and regression analyses between antioxidant activity and increasing concentrations of phenolics and tannins in a concentration-dependent mode. The present study showed that multiple component or multivariate analysis, applying multiple regression analysis or regression planes, was more informative than linear regression analysis of the relationship between the concentration of individual components and antioxidant activity. In this paper, we present a multivariate analysis of the antioxidant activities of phenolic and tannin contents combined in the water and ethanol extracts, which revealed hidden observations that were not evident from linear statistical analysis.