• Title/Summary/Keyword: regression models

Assessment of slope stability using multiple regression analysis

  • Marrapu, Balendra M.;Jakka, Ravi S.
    • Geomechanics and Engineering / v.13 no.2 / pp.237-254 / 2017
  • Estimating slope stability is an important task in geotechnical engineering, but both conventional and soft computing methods have drawbacks. Conventional limit equilibrium methods are tedious and time consuming, while soft computing approaches such as Artificial Neural Networks and Fuzzy Logic are black-box approaches. Multiple Regression (MR) analysis provides an alternative to both for the evaluation of slope stability. MR models provide a simplified equation that can be used to calculate the critical factor of safety of a slope without any iterative procedure, thereby reducing the time and complexity involved. In the present study, multiple regression models were developed and their accuracy in estimating slope stability was tested using real field data. Two separate models were developed, one for dry slopes and one for wet slopes. The accuracy of the developed models was compared and validated against conventional limit equilibrium methods in terms of Mean Square Error (MSE) and coefficient of determination ($R^2$). Because the developed MR models are not based on region-specific data and cover a wide range of parametric variations, they can be directly applied to any real slopes.
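The two validation measures named in this abstract can be sketched as follows. This is a minimal illustration, assuming the limit equilibrium results serve as the reference values; the function names and the factor-of-safety numbers are hypothetical and are not taken from the paper.

```python
def mse(reference, predicted):
    """Mean square error between reference and predicted values."""
    return sum((r - p) ** 2 for r, p in zip(reference, predicted)) / len(reference)


def r_squared(reference, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


# Made-up factor-of-safety values, for illustration only.
fos_limit_equilibrium = [1.0, 1.5, 2.0, 2.5]  # reference method
fos_regression = [1.1, 1.4, 2.1, 2.4]         # hypothetical MR predictions

print("MSE:", mse(fos_limit_equilibrium, fos_regression))
print("R^2:", r_squared(fos_limit_equilibrium, fos_regression))
```

A lower MSE and an $R^2$ closer to 1 indicate closer agreement with the reference method.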

Introduction to variational Bayes for high-dimensional linear and logistic regression models (고차원 선형 및 로지스틱 회귀모형에 대한 변분 베이즈 방법 소개)

  • Jang, Insong;Lee, Kyoungjae
    • The Korean Journal of Applied Statistics / v.35 no.3 / pp.445-455 / 2022
  • In this paper, we introduce existing Bayesian methods for high-dimensional sparse regression models and compare their performance in various simulation scenarios. In particular, we focus on the variational Bayes approach proposed by Ray and Szabó (2021), which enables scalable and accurate Bayesian inference. Using data sets simulated from sparse high-dimensional linear regression models, we compare the variational Bayes approach with other Bayesian and frequentist methods. To check the practical performance of variational Bayes in logistic regression models, a real data analysis is conducted using a leukemia data set.

Developing the Pedestrian Accident Models of Intersections using Tobit Model (토빗모형을 이용한 교차로 보행자 사고모형 개발)

  • Lee, Seung Ju;Lim, Jin Kang;Park, Byung Ho
    • Journal of the Korean Society of Safety / v.29 no.5 / pp.154-159 / 2014
  • This study deals with pedestrian accidents at intersections in Cheongju. The objective is to develop pedestrian accident models using the Tobit regression model. To this end, pedestrian accident data from 2007 to 2011 were collected from the TAAS data set of the Road Traffic Authority. Poisson, negative binomial, and Tobit regression models were used to analyze the accidents. The dependent variable was the number of accidents per intersection. The independent variables were traffic volume, intersection geometric structure, and transportation facilities. The main results are as follows. First, the Tobit model was judged to be more appropriate than the other models, and the fitted models were found to be statistically significant. Second, the main accident-related variables adopted in the model were traffic volume, pedestrian volume, number of traffic islands, crossing length, and pedestrian countdown signal systems.
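For readers unfamiliar with the Tobit model, the sketch below shows the standard log-likelihood for a response that is left-censored at zero. It assumes a single predictor and normal errors; the function names and arguments are hypothetical, and this is not the specification fitted in the study.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_log_likelihood(xs, ys, intercept, slope, sigma):
    """Log-likelihood of a Tobit model left-censored at zero (one predictor).

    Latent variable: y* = intercept + slope * x + error, error ~ N(0, sigma^2).
    Observed value:  y = max(0, y*).
    """
    total = 0.0
    for x, y in zip(xs, ys):
        mu = intercept + slope * x
        if y > 0:
            # Uncensored observation: normal density of the residual.
            total += math.log(normal_pdf((y - mu) / sigma) / sigma)
        else:
            # Censored observation: probability that y* falls at or below zero.
            total += math.log(normal_cdf(-mu / sigma))
    return total
```

Maximizing this function over the intercept, slope, and sigma gives the Tobit estimates. The censored term is what separates the model from ordinary least squares.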

Bayesian Analysis for the Zero-inflated Regression Models (영과잉 회귀모형에 대한 베이지안 분석)

  • Jang, Hak-Jin;Kang, Yun-Hee;Lee, S.;Kim, Seong-W.
    • The Korean Journal of Applied Statistics / v.21 no.4 / pp.603-613 / 2008
  • We often encounter discrete count data with a large proportion of zeros. In this case, it is not appropriate to analyze the data with standard regression models such as the Poisson or negative binomial regression models. In this article, we consider Bayesian analysis for two commonly used models: the zero-inflated Poisson and zero-inflated negative binomial regression models. We use the Bayes factor as a model selection tool, and computation is carried out via Markov chain Monte Carlo methods. Crash count data are analyzed to support the theoretical results.
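The zero-inflated Poisson model mentioned in this abstract mixes a point mass at zero with an ordinary Poisson distribution. The sketch below shows only that likelihood, without covariates; the priors, Bayes factors, and MCMC steps of the article are not reproduced, and the function names and counts are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of observing count k with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: a structural zero occurs with probability pi."""
    if k == 0:
        return pi + (1.0 - pi) * math.exp(-lam)
    return (1.0 - pi) * poisson_pmf(k, lam)


def zip_log_likelihood(counts, lam, pi):
    """Log-likelihood of independent counts; pi = 0 gives the plain Poisson."""
    return sum(math.log(zip_pmf(k, lam, pi)) for k in counts)


# Made-up counts with many zeros, for illustration only.
counts = [0, 0, 0, 0, 0, 0, 1, 2, 3, 4]
print("Poisson:", zip_log_likelihood(counts, 1.0, 0.0))
print("ZIP:    ", zip_log_likelihood(counts, 2.5, 0.55))
```

Setting `pi` to zero recovers the standard Poisson model, so the two can be compared on the same data.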

Tree-Structured Nonlinear Regression

  • Chang, Young-Jae;Kim, Hyeon-Soo
    • The Korean Journal of Applied Statistics / v.24 no.5 / pp.759-768 / 2011
  • Tree algorithms have been widely developed for regression problems. One attractive feature of a regression tree is its flexibility in fitting, because it can capture nonlinearity in the data. In particular, data with sudden structural breaks, such as oil prices and exchange rates, can be fitted well with a simple mixture of a few piecewise linear regression models. Because split points are determined by chi-squared statistics based on the residuals from fitting piecewise linear models, and the split variable is chosen by an objective criterion, the resulting fit is reasonable and agrees with a visual interpretation of the data. Piecewise linear regression by a regression tree is a good fitting method and can be applied to data sets with large fluctuations.
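The idea of fitting separate lines on either side of a structural break can be sketched with a single split. Note the substitution: this sketch picks the split by an exhaustive search for the smallest total squared error, which stands in for the chi-squared residual statistics described in the abstract. All names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares line; returns (intercept, slope, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx if s_xx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, sse


def best_split(xs, ys, min_size=2):
    """Find the split index that minimizes the total SSE of two line fits.

    xs must be sorted in ascending order. Returns
    (total_sse, split_index, left_fit, right_fit).
    """
    best = None
    for i in range(min_size, len(xs) - min_size + 1):
        left = fit_line(xs[:i], ys[:i])
        right = fit_line(xs[i:], ys[i:])
        total = left[2] + right[2]
        if best is None or total < best[0]:
            best = (total, i, left, right)
    return best


# Made-up series with a structural break between x = 4 and x = 5.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [0, 1, 2, 3, 4, 15, 14, 13, 12, 11]
total_sse, split_index, left_fit, right_fit = best_split(xs, ys)
print("split before index", split_index)
```

A full regression tree would apply the same search recursively within each segment and across candidate split variables.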

Estimation methods and interpretation of competing risk regression models (경쟁 위험 회귀 모형의 이해와 추정 방법)

  • Kim, Mijeong
    • The Korean Journal of Applied Statistics / v.29 no.7 / pp.1231-1246 / 2016
  • The cause-specific hazard model (Prentice et al., 1978) and the subdistribution hazard model (Fine and Gray, 1999) are the models most commonly used for right-censored survival data with competing risks. Other models for survival data with competing risks have subsequently been introduced; however, they have not been widely used, either because they do not provide reliable statistical estimation methods or because they are overly difficult to compute. We introduce simple and reliable competing risk regression models that have recently been proposed and compare their methodologies. We show how to use SAS and R to analyze data with competing risks. In addition, we analyze survival data with two competing risks using five different models.

Algorithm for Finding the Best Principal Component Regression Models for Quantitative Analysis using NIR Spectra (근적외 스펙트럼을 이용한 정량분석용 최적 주성분회귀모델을 얻기 위한 알고리듬)

  • Cho, Jung-Hwan
    • Journal of Pharmaceutical Investigation / v.37 no.6 / pp.377-395 / 2007
  • Near-infrared (NIR) spectral data have been used for the noninvasive analysis of various biological samples. However, absorption bands in the NIR region overlap extensively, so it is very difficult to select the wavelengths that give the best principal component regression (PCR) models for analyzing the constituents of biological samples. The NIR data were used after polynomial smoothing and first-order differentiation with Savitzky-Golay filters. To find the best PCR models, all possible combinations of the available principal components from the given NIR spectral data were generated by in-house programs written in MATLAB. All of the generated PCR models were compared in terms of SEC (standard error of calibration), $R^2$, SEP (standard error of prediction), and SECP (standard error of calibration and prediction) to find the best combination of principal components of the initial PCR models. The initial PCR models were found by SEC or Malinowski's indicator function, and a priori selection of spectral points was examined in terms of the correlation coefficients between the NIR data at each wavelength and the corresponding concentrations. To test the developed program, aqueous solutions of BSA (bovine serum albumin) and glucose were prepared and analyzed. The best PCR models were found using a priori selection of spectral points and final model selection by SEP or SECP.

Design models for predicting shear resistance of studs in solid concrete slabs based on symbolic regression with genetic programming

  • Degtyarev, Vitaliy V.;Hicks, Stephen J.;Hajjar, Jerome F.
    • Steel and Composite Structures / v.43 no.3 / pp.293-309 / 2022
  • Accurate design models for predicting the shear resistance of headed studs in solid concrete slabs are essential for obtaining economical and safe steel-concrete composite structures. In this study, symbolic regression with genetic programming (GPSR) was applied to experimental data to formulate new descriptive equations for predicting the shear resistance of studs in solid slabs using both normal and lightweight concrete. The obtained GPSR-based nominal resistance equations demonstrated good agreement with the test results. The equations indicate that the stud shear resistance is insensitive to the secant modulus of elasticity of concrete, which has been included in many international standards following the pioneering work of Ollgaard et al. In contrast, it increases when the stud height-to-diameter ratio increases, which is not reflected by the design models in the current international standards. The nominal resistance equations were subsequently refined for use in design from reliability analyses to ensure that the target reliability index required by the Eurocodes was achieved. Resistance factors for the developed equations were also determined following US design practice. The stud shear resistance predicted by the proposed models was compared with the predictions from 13 existing models. The accuracy of the developed models exceeds the accuracy of the existing equations. The proposed models produce predictions that can be used with confidence in design, while providing significantly higher stud resistances for certain combinations of variables than those computed with the existing equations given by many standards.

Comparison of Different Multiple Linear Regression Models for Real-time Flood Stage Forecasting (실시간 수위 예측을 위한 다중선형회귀 모형의 비교)

  • Choi, Seung Yong;Han, Kun Yeun;Kim, Byung Hyun
    • KSCE Journal of Civil and Environmental Engineering Research / v.32 no.1B / pp.9-20 / 2012
  • Recently, to overcome the limitations of conceptual, hydrological, and physics-based models for flood stage forecasting, multiple linear regression models, a type of data-driven model, have been widely adopted for forecasting flood streamflow (stage). The objectives of this study are to compare the performance of multiple linear regression models built with different regression coefficient estimation methods and to determine the most effective multiple linear regression model for flood stage forecasting. To this end, the time scale was determined through autocorrelation analysis of the input data, and flood stage forecasting models developed using the LS (least squares), WLS (weighted least squares), and SPW (stepwise) coefficient estimation methods were applied to flood events in the Jungrang stream. To evaluate the performance of the models, four statistical indices were used: root mean square error (RMSE), Nash-Sutcliffe efficiency coefficient (NSEC), mean absolute error (MAE), and adjusted coefficient of determination ($R^{*2}$). The results show that the model using SPW parameter estimation predicts the river flood stage better than the others, and that the model using LS parameter estimation is slightly better than the model using WLS parameter estimation.
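The difference between the LS and WLS estimation methods compared in this abstract can be sketched for a single predictor. This is a simplified illustration, not the study's multi-input models; the function name, the stage values, and the choice of weights are all hypothetical.

```python
def weighted_line_fit(xs, ys, weights=None):
    """Fit y = intercept + slope * x by weighted least squares.

    With weights=None every point gets weight 1, which is ordinary
    least squares (LS). Returns (intercept, slope).
    """
    if weights is None:
        weights = [1.0] * len(xs)
    w_sum = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, ys)) / w_sum
    s_xy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(weights, xs, ys))
    s_xx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Made-up river stages (m): previous time step vs. next time step.
stage_now = [1.0, 1.5, 2.2, 3.0, 3.6]
stage_next = [1.2, 1.8, 2.6, 3.3, 3.7]

ls_fit = weighted_line_fit(stage_now, stage_next)
# Hypothetical weighting that emphasizes the higher stages.
wls_fit = weighted_line_fit(stage_now, stage_next, weights=stage_now)
print("LS :", ls_fit)
print("WLS:", wls_fit)
```

The two fits coincide when the data lie exactly on a line. They differ when the residuals vary across the range of stages, which is the situation in which the choice of weights matters.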

LACTATION CURVE OF HOLSTEIN FRIESIAN COWS IN THE KINGDOM OF SAUDI ARABIA

  • Ali, A.K.A.;Al-Jumaah, R.S.;Hayes, E.
    • Asian-Australasian Journal of Animal Sciences / v.9 no.4 / pp.439-447 / 1996
  • Monthly test-day production data, comprising 12,020 records, were collected from six of the largest specialized dairy farms in the central region of the Kingdom of Saudi Arabia. The records covered lactating cows in four parities and two seasons of calving. Monthly test-day records were fitted using Wood's model, $At^{b}e^{-ct}$, with multiplicative and additive error terms. Linear and non-linear regression models were used to estimate the parameters needed to draw the lactation curves. The lactation curves for the different parities showed that the third lactation had the highest peak (43.08 kg for the linear regression model and 42.08 kg for the non-linear regression model). The fourth lactation had the lowest peak (24.00 kg for the linear regression model and 25.64 kg for the non-linear regression model). Cows in their second and third lactations reached the peak at 58 days under both the linear and non-linear regression models. Cows in their first lactation were more persistent and peaked late, at 68 and 67 days for the two models, respectively. In contrast, third-lactation cows were less persistent and peaked early, at 58 days for both models. Cows that calved in winter months had higher starting values (A), a higher ascending slope (b), and a higher descending slope (c). Least squares means of milk yield for the first four parities and for the overall data were 6,653, 7,659, 7,482, 6,988, and 7,614 kg, respectively. The corresponding lactation periods were 358, 367, 350, 363, and 364 days, respectively.
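Wood's model becomes linear in its parameters after taking logarithms, $\ln y = \ln A + b \ln t - ct$, which is why a linear regression can be used when the error is multiplicative. Setting the derivative to zero gives the time of peak yield, $t = b/c$. The sketch below illustrates both points, assuming noise-free synthetic data; the parameter values and function names are hypothetical and are not estimates from the study.

```python
import math


def wood(t, A, b, c):
    """Wood's lactation curve: yield at time t."""
    return A * t ** b * math.exp(-c * t)


def peak_time(b, c):
    """Time of peak yield, from dy/dt = 0 (requires b > 0 and c > 0)."""
    return b / c


def solve(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [rhs] for row, rhs in zip(M, v)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_wood_loglinear(days, yields):
    """Least squares on ln y = ln A + b ln t - c t; returns (A, b, c)."""
    rows = [[1.0, math.log(t), -t] for t in days]
    target = [math.log(y) for y in yields]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * z for r, z in zip(rows, target)) for i in range(3)]
    ln_a, b, c = solve(xtx, xty)
    return math.exp(ln_a), b, c


# Synthetic monthly test days generated from assumed parameters.
days = [5 + 30 * m for m in range(10)]
yields = [wood(t, 20.0, 0.2, 0.004) for t in days]
A_hat, b_hat, c_hat = fit_wood_loglinear(days, yields)
print("peak at day", peak_time(b_hat, c_hat))
```

With an additive error term the log transform no longer applies, and the parameters have to be estimated by non-linear least squares instead.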