• Title/Summary/Keyword: linear regression equation

Estimation of Air Temperature Changes due to Future Urban Growth in the Seoul Metropolitan Area (수도권지역 미래 도시성장에 따른 기온변화 추정)

  • Kim, Yoo-Keun;Kim, Hyun-Su;Jeong, Ju-Hee;Song, Sang-Keun
    • Journal of Environmental Science International / v.19 no.2 / pp.237-245 / 2010
  • The relationship between air temperature and the fraction of urban areas (FUA), together with the corresponding linear regression equation, was estimated for the Seoul metropolitan area (SMA) for 1975 through 2000, using land-use data from the water management information system (WAMIS) and air temperatures from the Korea Meteorological Administration (KMA). The future FUA in the SMA (from 2000 to 2030) was predicted with an urban growth model (SLEUTH) in conjunction with several datasets (e.g., urban areas, roads) from WAMIS. The estimated future FUA was then used as input to the linear regression equation to estimate the future annual mean minimum air temperature (e.g., in 2025 and 2030). The FUA in the SMA in 2000 simulated by SLEUTH agreed well with the observations (73% accuracy). Urban growth in the SMA was predicted to increase by 16% of the total area by 2025 and by 24% by 2030. According to the linear regression equation, the annual mean minimum air temperature in the SMA increased by about $0.02^{\circ}C$/yr and was expected to rise to $8.3^{\circ}C$ in 2025 and $8.7^{\circ}C$ in 2030.

Physiological Cost Index of Walking in Healthy Children (건강한 아동이 걸을 때에 생리학적 소비지수)

  • Lee, Hyang-Sook;Kim, Bong-Ok
    • Physical Therapy Korea / v.9 no.1 / pp.43-51 / 2002
  • The Physiological Cost Index (PCI) of walking has been widely used to predict oxygen consumption in healthy subjects and in patients. The purpose of this study was to evaluate how well the PCI of walking predicts the amount of exercise and cardiac function. A walking exercise was conducted with 67 healthy children (ages 4-12) at a self-selected comfortable walking speed on a level surface. Walking speed was calculated, and heart rate was measured before and immediately after walking. PCI was calculated for statistical analysis. The results were as follows: 1) Walking speed tended to increase and the PCI of walking tended to decrease with age. There were significant differences in walking speed and PCI of walking among the three age groups (p<.05). The change in walking heart rate tended to decrease with age; however, there was no significant difference among the three age groups. 2) The linear regression equation between walking speed and age was 'Y (walking speed) = 2.124X (age) + 48.286' ($R^2$=.337), (p=.00). 3) The walking heart rate tended to decrease with age. The linear regression equation between walking heart rate and age was 'Y (walking heart rate) = 143.346 - 2.63X (age)' ($R^2$=.3425), (p=.00). 4) The walking heart rate decreased as body surface area (BSA) increased. The linear regression equation between walking heart rate and BSA was 'Y (walking heart rate) = 149.830 - 27.115X (BSA)' ($R^2$=.3066), (p=.00). In conclusion, these equations and the PCI could be useful for quantifying the variation in energy expenditure of children with pathological gait compared with age-matched healthy children.
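The three regression equations quoted in this abstract can be evaluated directly. The following is a minimal sketch that copies the coefficients verbatim; the function names are illustrative and not from the paper, and because the abstract does not state units for walking speed, heart rate, or BSA, the outputs are left unitless.

```python
# Regression equations as quoted in the abstract (coefficients copied verbatim).
# Function names are hypothetical; units are not given in the abstract.

def walking_speed_from_age(age_years: float) -> float:
    """Y (walking speed) = 2.124 X (age) + 48.286."""
    return 2.124 * age_years + 48.286


def walking_hr_from_age(age_years: float) -> float:
    """Y (walking heart rate) = 143.346 - 2.63 X (age)."""
    return 143.346 - 2.63 * age_years


def walking_hr_from_bsa(bsa: float) -> float:
    """Y (walking heart rate) = 149.830 - 27.115 X (BSA)."""
    return 149.830 - 27.115 * bsa


# Evaluate at the ends and middle of the studied age range (4-12 years).
for age in (4, 8, 12):
    print(age, round(walking_speed_from_age(age), 3), round(walking_hr_from_age(age), 3))
```

With $R^2$ values near .3, each equation explains only about a third of the variance, so these point predictions should be read as rough central tendencies rather than individual estimates.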

Unified methods for variable selection and outlier detection in a linear regression

  • Seo, Han Son
    • Communications for Statistical Applications and Methods / v.26 no.6 / pp.575-582 / 2019
  • The problem of selecting variables in the presence of outliers is considered. Variable selection and outlier detection are not separable problems because each observation affects the fitted regression equation differently and has a different influence on each variable. We suggest a simultaneous method for variable selection and outlier detection in a linear regression model. The suggested procedure uses a sequential method to detect outliers and uses all possible subset regressions for model selection. A simplified version of the procedure is also proposed to reduce the computational burden. The procedures are compared with other variable selection methods using real data sets known to contain outliers. The examples show that the proposed procedures are effective and superior to robust algorithms in selecting the best model.

Interrelation of Retention Factor of Amino-Acids by QSPR and Linear Regression

  • Lee, Seung-Ki;Polyakova, Yulia;Row, Kyung-Ho
    • Bulletin of the Korean Chemical Society / v.24 no.12 / pp.1757-1762 / 2003
  • The interrelation between the retention factors of several L-amino acids and their physico-chemical and structural properties can be determined in chromatographic research. In this paper we describe a predictor of retention factors based on various properties of the L-amino acids. Eighteen L-amino acids are included in this study, and their retention factors are measured experimentally by RP-HPLC. Binding energy ($E_b$), hydrophobicity (log P), molecular refractivity (MR), polarizability (${\alpha}$), total energy ($E_t$), water solubility (log S), connectivity indices (${\chi}$) of different orders and the Wiener index (w) are calculated theoretically. Relationships between these properties and the retention factors are established, and their predictive and interpretive abilities are evaluated. The relationship between the retention factors and the various descriptors of the L-amino acids is proposed in linear and multiple linear forms, and the estimated correlation coefficients are higher than 0.90.

Multivariate Statistical Analysis and Prediction for the Flash Points of Binary Systems Using Physical Properties of Pure Substances (순수 성분의 물성 자료를 이용한 2성분계 혼합물의 인화점에 대한 다변량 통계 분석 및 예측)

  • Lee, Bom-Sock;Kim, Sung-Young
    • Journal of the Korean Institute of Gas / v.11 no.3 / pp.13-18 / 2007
  • Multivariate statistical analysis using multiple linear regression (MLR) has been applied to analyze and predict the flash points of binary systems. Predicting the flash points of flammable substances is important for examining fire and explosion hazards in chemical process design. In this paper, the flash points are predicted by MLR based on the physical properties of pure substances and experimental flash point data. The regression and prediction results from MLR are compared with the values calculated by Raoult's law and the Van Laar equation.

A Bayesian Regression Model to Estimate the Deterioration Rate of Track Irregularities (궤도틀림 진전율 추정을 위한 베이지안 회귀분석 모형 연구)

  • Park, Bum Hwan
    • Journal of the Korean Society for Railway / v.19 no.4 / pp.547-554 / 2016
  • This study considered how to estimate the deterioration rate of the track quality index, which represents track geometric irregularity. Most existing studies have used simple linear regression and regarded the slope of the regression equation as the progress rate. In this paper, we present a Bayesian approach to estimating the progress of track irregularity. The biggest of its advantages is that it can formally include a prior distribution for the parameters, derived from historical data or from expert experience, so that the rate can be expressed as a probability distribution. We investigated the possibility of applying the Bayesian method to the estimation of the deterioration rate by comparing our Bayesian approach with the conventional linear regression approach.
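To illustrate how a prior turns the regression slope into a distribution, here is a minimal sketch of a conjugate normal update for the slope. It is not the paper's model: it assumes a known noise standard deviation, a flat prior on the intercept, and a normal prior on the slope, and all data values and prior settings below are hypothetical.

```python
import math

# Minimal sketch (not the paper's model): normal prior on the slope, known noise SD,
# flat prior on the intercept. With centered x, the slope posterior is again normal.

def ols_slope(t, y):
    """Return the least-squares slope and Sxx (sum of squared centered t)."""
    n = len(t)
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    return sxy / sxx, sxx


def posterior_slope(t, y, prior_mean, prior_sd, noise_sd):
    """Combine the prior and the data by precision weighting; return (mean, sd)."""
    b_ols, sxx = ols_slope(t, y)
    prior_prec = 1.0 / prior_sd ** 2      # precision contributed by the prior
    data_prec = sxx / noise_sd ** 2       # precision contributed by the data
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + data_prec * b_ols) / post_prec
    return post_mean, math.sqrt(1.0 / post_prec)


# Hypothetical inspection times and track quality index values (illustration only).
t = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 1.5, 2.0, 2.5, 3.0]
mean, sd = posterior_slope(t, y, prior_mean=0.3, prior_sd=0.1, noise_sd=0.2)
print(f"posterior slope: mean={mean:.4f}, sd={sd:.4f}")
```

The posterior mean lands between the prior mean and the least-squares slope, weighted by their precisions, which is the sense in which the rate is "expressed as a probability distribution" rather than as a single fitted slope.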

Bootstrap Estimation for GEE Models (일반화추정방정식(GEE)에 대한 부스트랩의 적용)

  • Park, Chong-Sun;Jeon, Yong-Moon
    • The Korean Journal of Applied Statistics / v.24 no.1 / pp.207-216 / 2011
  • The bootstrap is a resampling technique for estimating parameters or for evaluating an estimate. It has been used to estimate parameters in the linear model (LM) and the generalized linear model (GLM). In this paper, we explore the possibility of applying the three bootstrap methods most widely used in LM and GLM (bootstrapping residuals, bootstrapping pairs, and bootstrapping an estimating equation) to the generalized estimating equation (GEE) algorithm for modelling repeatedly measured regression data sets. We compared the three bootstrapping methods with the coefficient and standard error estimates of GEE models from one simulated and one real data set. Overall, the estimates obtained from the bootstrap methods are quite comparable, except that the estimates from bootstrapping pairs differ somewhat from the others. We conjecture that the strange behavior of the estimates from bootstrapping pairs comes from the inconsistency of those estimates. However, a more thorough simulation study is needed to generalize this, since the results come from only two small data sets.
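The difference between bootstrapping pairs and bootstrapping residuals is easiest to see in the plain linear model. The sketch below applies both to a simple least-squares line; it does not implement GEE or the estimating-equation bootstrap, and the data are simulated with hypothetical parameters purely for illustration.

```python
import random
import statistics

# Sketch for the ordinary linear model only (not GEE); data are simulated.

def fit_line(x, y):
    """Least-squares fit; returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def bootstrap_pairs(x, y, n_boot, rng):
    """Resample (x, y) pairs with replacement and refit the slope each time."""
    n = len(x)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # skip degenerate resamples with a single distinct x
        ys = [y[i] for i in idx]
        slopes.append(fit_line(xs, ys)[1])
    return slopes


def bootstrap_residuals(x, y, n_boot, rng):
    """Keep x fixed, resample fitted residuals, and refit the slope each time."""
    intercept, slope = fit_line(x, y)
    fitted = [intercept + slope * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    slopes = []
    for _ in range(n_boot):
        y_star = [fi + rng.choice(resid) for fi in fitted]
        slopes.append(fit_line(x, y_star)[1])
    return slopes


rng = random.Random(0)  # fixed seed so the run is reproducible
x = [float(i) for i in range(20)]
y = [2.0 + 0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]  # hypothetical true line

pairs = bootstrap_pairs(x, y, 500, rng)
resid = bootstrap_residuals(x, y, 500, rng)
print("pairs bootstrap SE of slope:   ", statistics.stdev(pairs))
print("residual bootstrap SE of slope:", statistics.stdev(resid))
```

Pairs resampling treats the design as random, while residual resampling holds it fixed and relies on the fitted model being correct; with repeated measurements, as in GEE, the resampling unit would normally be the subject or cluster rather than the single observation.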

Optimal Solution of Classification (Prediction) Problem

  • Mohammad S. Khrisat
    • International Journal of Computer Science & Network Security / v.23 no.9 / pp.129-133 / 2023
  • A classification (prediction) problem asks how to use specific features to obtain a predicted class. Wheat seed specifications for 3 classes of seeds will be used in the prediction process. A multiple linear regression model will be built, and its prediction error ratio will be calculated. To improve the prediction ratio, an ANN model will be built and trained. The results will be examined to show how to build a prediction tool capable of computing a predicted class number very close to the target class number.

An Filtering Automatic Technique of LiDAR Data by Multiple Linear Regression Analysis (다중선형 회귀분석에 의한 LiDAR 자료의 필터링 자동화 기법)

  • Choi, Seung-Pil;Cho, Ji-Hyun;Kim, Jun-Seong
    • Journal of Korean Society for Geospatial Information Science / v.19 no.4 / pp.109-118 / 2011
  • This research estimated the accuracy of whole-area filtering, which used a plane equation derived from the whole data set, and of regional filtering, which used a plane equation derived for each virtual grid. All estimates were evaluated against whole-area filtering in which the plane equation was derived by multiple linear regression analysis of the ground data set. Relative to that ground-data baseline, the average accuracy of whole-area filtering using the whole data set dropped by about 2~3%, while the accuracy of regional filtering based on the virtual grid dropped by 2~4%. Moreover, a virtual grid set to 3~4 cm differed in accuracy from the standard data by about 2%. This leads to the conclusion that, in virtual grid filtering, a grid size 3~4 times larger than the LiDAR scan gap is more appropriate. Because the average accuracy differed between the approaches applied, further research on real topography is needed to improve filtering accuracy.

Sequential prediction of TBM penetration rate using a gradient boosted regression tree during tunneling

  • Lee, Hang-Lo;Song, Ki-Il;Qi, Chongchong;Kim, Kyoung-Yul
    • Geomechanics and Engineering / v.29 no.5 / pp.523-533 / 2022
  • Several prediction models for the penetration rate (PR) of tunnel boring machines (TBMs) have focused on application at the design stage. In the construction stage, however, the expected PR and its trends change during tunneling owing to TBM excavation skills and the gap between the investigated and actual geological conditions. Monitoring the PR during tunneling is crucial for rescheduling the excavation plan in real time. This study proposes a sequential prediction method applicable in the construction stage. Geological and TBM operating data are collected from the Gunpo cable tunnel in Korea and preprocessed through normalization and augmentation. The results show that sequential prediction with a unit prediction distance (UPD) of 1 ring achieves $R^2$≥0.79, whereas one-step prediction achieves $R^2$≤0.30. As the modeling algorithm, a gradient boosted regression tree (GBRT) outperformed least-squares linear regression in the sequential prediction method. For practical use, a simple equation relating $R^2$ to UPD is proposed. As UPD increases, $R^2$ decreases exponentially; in particular, the UPD at $R^2$=0.60 is calculated as 28 rings using the equation. Such a time interval will provide enough time for decision-making. The UPD can be adjusted depending on the project and the $R^2$ value targeted by an operator. Therefore, the calculation process for the equation relating $R^2$ to UPD is also presented.
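The abstract does not state the proposed equation, only that $R^2$ decays exponentially with UPD. As a sketch under an assumed two-parameter form, with hypothetical fitted constants $a$ and $b$ that are not given in the abstract, the UPD for a target $R^2$ follows by inverting the decay:

```latex
R^2 = a\,e^{-b\,\mathrm{UPD}}, \quad a > 0,\ b > 0
\quad\Longrightarrow\quad
\mathrm{UPD} = \frac{1}{b}\,\ln\!\frac{a}{R^2_{\mathrm{target}}}
```

Under this assumed form, accepting a lower target $R^2$ lengthens the usable prediction distance logarithmically, which is consistent with the abstract's remark that the UPD can be tuned to the $R^2$ value an operator targets.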