• Title/Abstract/Keywords: multivariate linear models

68 search results (processing time: 0.02 s)

EPB-TBM performance prediction using statistical and neural intelligence methods

  • Ghodrat Barzegari;Esmaeil Sedghi;Ata Allah Nadiri
    • Geomechanics and Engineering
    • /
    • Vol. 37, No. 3
    • /
    • pp.197-211
    • /
    • 2024
  • This research studies the effect of geotechnical factors on EPB-TBM performance parameters. The modeling was performed using simple and multivariate linear regression methods, artificial neural networks (ANNs), and the Sugeno fuzzy logic (SFL) algorithm. In the ANN, 80% of the data were randomly allocated to training and 20% to network testing. Meanwhile, in the SFL algorithm, 75% of the data were used for training and 25% for testing. The coefficient of determination ($R^2$) obtained between the observed and estimated values in this model for the thrust force and cutterhead torque was 0.19 and 0.52, respectively. The results showed that the SFL outperformed the other models in predicting the target parameters. In this method, the $R^2$ obtained between observed and predicted values for thrust force and cutterhead torque is 0.73 and 0.63, respectively. The sensitivity analysis results show that the internal friction angle ($\varphi$) and standard penetration number (SPT) have the greatest impact on thrust force. Also, earth pressure and overburden thickness have the highest effect on cutterhead torque.
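
The random train/test allocation and the coefficient of determination mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `train_test_split` and `r_squared` are hypothetical, and the sketch assumes the definition $R^2 = 1 - SS_{res}/SS_{tot}$, whereas the paper may instead report a squared correlation between observed and predicted values.

```python
import random


def train_test_split(rows, train_fraction, seed=0):
    """Randomly split rows into (train, test) using the given training fraction."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# 80/20 split as described for the ANN; 0.75 would give the SFL split.
train, test = train_test_split(range(100), 0.8)
```

With this definition, a model that always predicts the mean of the observed values scores 0, and a perfect model scores 1.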

On a Bayesian Estimation of Multivariate Regression Models with Constrained Coefficient Matrix

  • Kim, Hea-Jung
    • Journal of the Korean Society for Quality Management
    • /
    • Vol. 26, No. 4
    • /
    • pp.151-165
    • /
    • 1998
  • Consider the linear multivariate regression model $Y=X_1B_1+X_2B_2+U$, where $\mathrm{Vec}(U) \sim N(0, \Sigma \otimes I_N)$. This paper is concerned with Bayesian inference for the model when it is suspected that the elements of $B_2$ are constrained in the form of intervals. The use of the Gibbs sampler as a method for calculating Bayesian marginal posterior densities of the parameters under a generalized conjugate prior is developed. It is shown that the approach is straightforward to specify distributionally and to implement computationally, with output readily adapted for required inference summaries. The method developed is applied to a real problem.

The Relationship between Hospital Specialization and Operational Performance: Focusing on Diseases of the Musculoskeletal System and Connective Tissue

  • 서슬기;김양균
    • Korea Journal of Hospital Management
    • /
    • Vol. 25, No. 3
    • /
    • pp.53-66
    • /
    • 2020
  • This study investigated and compared differences in the effect of hospital specialization by hospital size, using 2018 claims data from the Health Insurance Review and Assessment Service National Inpatient Sample for diseases of the musculoskeletal system and connective tissue. To this end, we used multivariate hierarchical linear models (also known as multi-level models) on two-level data from 106,599 patients discharged from 734 hospitals after treatment for diseases of the musculoskeletal system and connective tissue. The multivariate results indicate that patients discharged from specialized hospitals with 200 beds or fewer had shorter stays and lower inpatient charges than those discharged from less specialized hospitals. For hospitals with 201-300 beds, however, no positive relationship was found between hospital specialization and operational performance. This finding may be limited evidence that the effect of a hospital's specialization strategy varies with the size of the hospital. We discuss several managerial and health policy implications.

Development and Validation of Generalized Linear Regression Models to Predict Vessel Enhancement on Coronary CT Angiography

  • Masuda, Takanori;Nakaura, Takeshi;Funama, Yoshinori;Sato, Tomoyasu;Higaki, Toru;Kiguchi, Masao;Matsumoto, Yoriaki;Yamashita, Yukari;Imada, Naoyuki;Awai, Kazuo
    • Korean Journal of Radiology
    • /
    • Vol. 19, No. 6
    • /
    • pp.1021-1030
    • /
    • 2018
  • Objective: We evaluated the effect of various patient characteristics and time-density curve (TDC)-factors on the test bolus-affected vessel enhancement on coronary computed tomography angiography (CCTA). We also assessed the value of generalized linear regression models (GLMs) for predicting enhancement on CCTA. Materials and Methods: We performed univariate and multivariate regression analysis to evaluate the effect of patient characteristics and to compare contrast enhancement per gram of iodine on test bolus ($\Delta HU_{TEST}$) and CCTA ($\Delta HU_{CCTA}$). We developed GLMs to predict $\Delta HU_{CCTA}$. GLMs including independent variables were validated with 6-fold cross-validation using the correlation coefficient and Bland-Altman analysis. Results: In multivariate analysis, only total body weight (TBW) and $\Delta HU_{TEST}$ maintained their independent predictive value (p < 0.001). In validation analysis, the highest correlation coefficient between $\Delta HU_{CCTA}$ and the prediction values was seen in the GLM (r = 0.75), followed by TDC (r = 0.69) and TBW (r = 0.62). The lowest Bland-Altman limit of agreement was observed with GLM-3 (mean difference, $-0.0 \pm 5.1$ Hounsfield units/grams of iodine [HU/gI]; 95% confidence interval [CI], -10.1, 10.1), followed by $\Delta HU_{CCTA}$ ($-0.0 \pm 5.9$ HU/gI; 95% CI, -11.9, 11.9) and TBW ($1.1 \pm 6.2$ HU/gI; 95% CI, -11.2, 13.4). Conclusion: We demonstrated that the patient's TBW and $\Delta HU_{TEST}$ significantly affected contrast enhancement on CCTA images and that the combined use of clinical information and test bolus results is useful for predicting aortic enhancement.
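
The Bland-Altman analysis used for validation in this abstract summarizes agreement as a mean difference with limits of agreement. A minimal sketch follows, assuming the common convention of mean ± 1.96 standard deviations of the differences; the function name `bland_altman` and the multiplier are assumptions of this illustration, and the abstract does not state exactly how its intervals were computed.

```python
import statistics


def bland_altman(measured, predicted, z=1.96):
    """Return (mean difference, SD of differences, (lower, upper) limits of agreement).

    Assumes limits of agreement = mean difference +/- z * sample SD of the differences.
    """
    diffs = [m - p for m, p in zip(measured, predicted)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)      # sample standard deviation (n - 1)
    limits = (mean_diff - z * sd_diff, mean_diff + z * sd_diff)
    return mean_diff, sd_diff, limits


# Toy values only; these are not data from the study.
mean_diff, sd_diff, limits = bland_altman([10, 12, 14, 16], [9, 13, 13, 17])
```

A narrower interval between the two limits indicates closer agreement between measured and predicted enhancement.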

Comparison study of modeling covariance matrix for multivariate longitudinal data

  • 곽나영;이근백
    • The Korean Journal of Applied Statistics
    • /
    • Vol. 33, No. 3
    • /
    • pp.281-296
    • /
    • 2020
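  • Data measured repeatedly on the same subjects are called longitudinal data. Analyzing such data requires methods different from those commonly used for cross-sectional data. Specifically, when estimating covariate effects in longitudinal data, the correlation among repeated outcomes must be taken into account, so modeling the covariance matrix is very important. However, covariance matrix modeling is not easy, because there are many parameters to estimate and the estimated covariance matrix must be positive definite. Modeling the covariance matrix for multivariate longitudinal data in particular calls for more elaborate methodology. This paper reviews two approaches to modeling the covariance matrix for multivariate longitudinal data. Both use the modified Cholesky decomposition to describe the correlation of the responses over time, but they differ in how they describe the correlation among responses observed at the same time. The first approach uses enhanced linear covariance models so that the covariance matrix is positive definite. The second approach models the covariance matrix using the variance-correlation decomposition and the hypersphere decomposition. A simulation study is conducted to compare the performance of the two approaches.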
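
The modified Cholesky decomposition named in this abstract factors a covariance matrix $\Sigma$ as $T \Sigma T^\top = D$, with $T$ unit lower triangular and $D$ diagonal with positive entries. A minimal sketch for the univariate case follows; it assumes NumPy is available, the function name `modified_cholesky` is hypothetical, and the multivariate block version studied in the paper is not reproduced here.

```python
import numpy as np


def modified_cholesky(sigma):
    """Return (T, D) with T unit lower triangular and D diagonal such that T @ sigma @ T.T == D."""
    sigma = np.asarray(sigma, dtype=float)
    L = np.linalg.cholesky(sigma)      # sigma = L @ L.T, L lower triangular
    d_sqrt = np.diag(L)                # square roots of the innovation variances
    C = L / d_sqrt                     # scale column j by 1 / L[j, j] -> unit diagonal
    T = np.linalg.inv(C)               # inverse of a unit lower triangular matrix
    D = np.diag(d_sqrt ** 2)
    return T, D


# AR(1)-type covariance with correlation 0.5, used purely as an illustration.
sigma = np.array([[1.0, 0.5, 0.25],
                  [0.5, 1.0, 0.5],
                  [0.25, 0.5, 1.0]])
T, D = modified_cholesky(sigma)
```

Because any unit lower triangular $T$ combined with positive diagonal entries in $D$ yields a positive definite $\Sigma$, modeling the entries of $T$ and $\log D$ without constraints keeps the estimated covariance matrix positive definite.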

A Study on the Quantitative Evaluation of Outdoor-Recreational Function and User Satisfaction with Urban Park and Open Space

  • 박승범
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • Vol. 18, No. 4
    • /
    • pp.127-140
    • /
    • 1991
  • The primary purpose of this study is to investigate factors and variables that have significant effects on user satisfaction with recreational facilities in the Taejong-Dae recreational complex, thereby establishing indices for the planning and development of urban parks and open space. To test the causal models of this research, the data were gathered by self-administered questionnaires from 967 households in Pusan City, which were selected by the multi-stage probability sampling method. The analysis consists primarily of two phases: the first used exploratory factor analysis to identify the major factors involved in satisfaction with recreational activities and facilities in the Taejong-Dae recreational complex, and the second tested the fit of the causal models of this research by employing LISREL methodology. There are three advantages of using LISREL over other multivariate analysis methods: first, measurement error is allowed for and calculated in LISREL, whereas otherwise there is a risk of seriously misleading coefficient estimates; second, LISREL deals with latent or unmeasured variables; third, it makes it possible to test causal relations among variables. The factor analysis identified five factors involved in satisfaction with recreational facilities: space for repose and relaxation, active recreation facilities such as a pool and zoo, physical exercise facilities, convenience and maintenance facilities, and linear facilities for walking. The second-phase analysis tested the fit of the causal models for satisfaction with recreational facilities to the data and identified statistically significant causal linkages among overall satisfaction with the Taejong-Dae recreational complex, other endogenous factors, and exogenous variables. The overall fits of both causal models were very good. Among the endogenous factors, the facility for repose and relaxation, the linear facility for walking, the active recreation facility, and the facility for convenience and maintenance were identified as having significant effects on overall satisfaction. Exogenous variables that have significant effects on endogenous variables were also identified. These significant relationships indicate important factors and variables that should be considered in the planning and development of the recreational complex. On the basis of these significant causal relationships, implications for the planning and development of the Taejong-Dae recreational complex were suggested.

Continuity Simulation and Trend Analysis of Water Qualities in Incoming Flows to Lake Paldang by Log Linear Models

  • 나은혜;박석순
    • Korean Journal of Ecology and Environment
    • /
    • Vol. 36, No. 3 (Serial No. 104)
    • /
    • pp.336-343
    • /
    • 2003
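  • In this study, simple and multivariate log-linear models were both applied to predict, as continuous series, the concentrations of organic matter and nutrients entering Lake Paldang from the South Han River, the North Han River, and Gyeongan Stream, and the significance and validity of the fitted models were examined using the F-test and the coefficient of determination. Only four of the nine fitted simple log-linear models were statistically significant, whereas all nine multivariate log-linear models were significant. The coefficient of determination, which measures model validity, was also higher for the multivariate log-linear models. The multivariate log-linear model was therefore judged more suitable for continuous prediction and trend identification of the water quality concentrations entering Lake Paldang. Based on the multivariate log-linear model results, the flow dependence, trend, and seasonality of the inflow water quality were analyzed. In all tributaries, the BOD concentration entering Lake Paldang decreased as flow increased, whereas the TN and TP concentrations, unlike BOD, neither increased nor decreased as flow increased. This suggests that the major sources of organic matter in the three tributary watersheds are point sources, while for nutrients nonpoint sources as well as point sources act as major sources. Trend analysis showed no increasing or decreasing trend in the BOD concentration entering Lake Paldang from any tributary between 1995 and 2000. TP entering from the South and North Han Rivers has been reported to have increased gradually from 1988 to 1994, but no such increase was observed after 1995, the period covered by this study; in contrast, the TP concentration entering from Gyeongan Stream was predicted to be increasing by about 10% per year. The TN concentration entering from the North Han River decreased by about 10% per year, while the inflow concentrations from the South Han River and Gyeongan Stream increased by about 3% and 7% per year, respectively. Seasonal analysis showed that the organic matter and nutrient concentrations of all three tributaries were seasonal, and that total nitrogen had the smallest range of variation among the water quality variables.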
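
A multivariate log-linear model of the kind described in this abstract can be sketched as a regression of log concentration on log flow, time, and seasonal terms. The specific form below, $\ln C = b_0 + b_1 \ln Q + b_2 t + b_3 \sin 2\pi t + b_4 \cos 2\pi t$, is an assumption of this illustration rather than the model stated in the paper, and the function names are hypothetical. The sketch assumes NumPy is available.

```python
import numpy as np


def fit_log_linear(conc, flow, t_years):
    """Least-squares fit of ln(conc) on [1, ln(flow), t, sin(2*pi*t), cos(2*pi*t)].

    Returns the coefficients (b0, b1, b2, b3, b4) of the assumed model form.
    """
    conc = np.asarray(conc, dtype=float)
    flow = np.asarray(flow, dtype=float)
    t = np.asarray(t_years, dtype=float)
    X = np.column_stack([
        np.ones_like(t),          # intercept
        np.log(flow),             # flow dependence
        t,                        # linear trend in log concentration
        np.sin(2 * np.pi * t),    # seasonality (annual cycle)
        np.cos(2 * np.pi * t),
    ])
    coef, *_ = np.linalg.lstsq(X, np.log(conc), rcond=None)
    return coef


def annual_percent_change(trend_coef):
    """Convert the trend coefficient b2 (per year, log scale) to percent change per year."""
    return (np.exp(trend_coef) - 1.0) * 100.0
```

Under this form, a negative $b_1$ corresponds to concentration falling as flow rises, and a trend coefficient of $\ln 1.10 \approx 0.095$ corresponds to an increase of about 10% per year.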

A Comparative Study of Estimation by Analogy using Data Mining Techniques

  • Nagpal, Geeta;Uddin, Moin;Kaur, Arvinder
    • Journal of Information Processing Systems
    • /
    • Vol. 8, No. 4
    • /
    • pp.621-652
    • /
    • 2012
  • Software estimations provide an inclusive set of directives for software project developers, project managers, and management in order to produce more realistic estimates based on deficient, uncertain, and noisy data. A range of estimation models is being explored in industry, as well as in academia for research purposes, but choosing the best model is difficult. Estimation by Analogy (EbA) is a form of case-based reasoning that uses fuzzy logic, grey system theory, machine-learning techniques, and similar methods for optimization. This research compares the estimation accuracy of some conventional data mining models with that of a hybrid model. The data mining models under consideration include linear regression models such as ordinary least squares and ridge regression, and nonlinear models such as neural networks, support vector machines, and multivariate adaptive regression splines. A precise and comprehensible predictive model based on the integration of grey relational analysis (GRA) and regression has been introduced and compared. Empirical results have shown that regression used with GRA gives outstanding results, indicating that the methodology has great potential and can be used as a candidate approach for software effort estimation.
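
The two linear baselines named in this abstract, ordinary least squares and ridge regression, differ only in a penalty on the slope coefficients. A minimal closed-form sketch follows; it assumes NumPy is available, the function name `ridge_fit` and the choice to leave the intercept unpenalized are assumptions of this illustration, and the GRA-based hybrid model of the paper is not reproduced.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge regression; lam = 0 reduces to ordinary least squares.

    Returns [intercept, slope_1, ..., slope_p]. The intercept is not penalized,
    and features are used as given (no standardization is applied here).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = np.column_stack([np.ones(len(X)), X])   # prepend intercept column
    penalty = lam * np.eye(Xc.shape[1])
    penalty[0, 0] = 0.0                          # leave the intercept unpenalized
    return np.linalg.solve(Xc.T @ Xc + penalty, Xc.T @ y)


# Toy data only: y = 1 + 2*x1 - x2 exactly.
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]], dtype=float)
y = 1 + 2 * X[:, 0] - X[:, 1]
ols = ridge_fit(X, y, 0.0)
ridge = ridge_fit(X, y, 10.0)
```

Increasing `lam` shrinks the slope coefficients toward zero, which trades some bias for lower variance on noisy effort data.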

Bayesian Analysis of Multivariate Threshold Animal Models Using Gibbs Sampling

  • Lee, Seung-Chun;Lee, Deukhwan
    • Journal of the Korean Statistical Society
    • /
    • Vol. 31, No. 2
    • /
    • pp.177-198
    • /
    • 2002
  • The estimation of variance components or variance ratios in linear models is an important issue in plant and animal breeding, and various methods have been devised to estimate them. However, many traits of economic importance in those fields are observed as dichotomous or polychotomous outcomes, and the usual estimation methods might not be appropriate for these cases. Recently, the threshold linear model has come to be considered an important tool for analyzing discrete traits, especially in animal breeding. In this note, we consider a hierarchical Bayesian method for the threshold animal model. A Gibbs sampler for making full Bayesian inferences about random effects as well as fixed effects is described for the joint analysis of discrete and continuous traits. A numerical example of the model is provided with two discrete ordered categorical traits, calving ease of calves born to heifers and calving ease of calves born to cows, and one normally distributed trait, birth weight.

Comparison of National Occupational Accident Fatality Rates using Statistical Analysis on Economic and Social Indicators

  • 김경훈;이수동
    • Journal of the Korean Society of Safety
    • /
    • Vol. 37, No. 6
    • /
    • pp.128-135
    • /
    • 2022
  • The comparative evaluation of occupational accident fatality rates (OAFRs) of different countries is complicated owing to the differences in their level of socio-economic development. However, such evaluation is necessary to assess the national occupational safety and health system of a country. This study proposes a statistical method to compare the OAFRs of countries taking into consideration the difference in their level of socio-economic development. We first collected data on the socio-economic indicators and OAFRs of 11 countries over a 30-year period. Next, based on literature survey and statistical correlation analysis, we selected the significant independent variables and built multiple linear regression models to predict OAFR. We also determined the groups of countries having heterogeneous relationships between the independent variables and OAFRs, which are represented by the regression models. The proposed method is demonstrated by comparing the OAFR of Korea with the OAFRs of 10 other developed countries.