• Title/Summary/Keyword: robust regression

An ensemble learning based Bayesian model updating approach for structural damage identification

  • Guangwei Lin;Yi Zhang;Enjian Cai;Taisen Zhao;Zhaoyan Li
    • Smart Structures and Systems / v.32 no.1 / pp.61-81 / 2023
  • This study presents an ensemble learning based Bayesian model updating approach for structural damage diagnosis. In the developed framework, the structure is initially decomposed into a set of substructures. An autoregressive moving average model with exogenous inputs (ARMAX) is first established for structural damage localization based on the structural equation of motion. Wavelet packet decomposition is used to extract the damage-sensitive node energy in different frequency bands for constructing structural surrogate models. Four methods, namely the Kriging predictor (KRG), radial basis function neural network (RBFNN), support vector regression (SVR), and multivariate adaptive regression splines (MARS), are selected as candidate structural surrogate models. These models are then resampled by bootstrapping and combined into an ensemble model by probabilistic ensemble. Meanwhile, the maximum entropy principle is adopted to search for new design points for sample space updating, yielding a more robust ensemble model. Through the iterations, a framework of surrogate ensemble learning based model updating with high model construction efficiency and accuracy is proposed. The specificities of the method are discussed and investigated in a case study.

Assessment of wall convergence for tunnels using machine learning techniques

  • Mahmoodzadeh, Arsalan;Nejati, Hamid Reza;Mohammadi, Mokhtar;Ibrahim, Hawkar Hashim;Mohammed, Adil Hussein;Rashidi, Shima
    • Geomechanics and Engineering / v.31 no.3 / pp.265-279 / 2022
  • Tunnel convergence prediction is essential for the safe construction and design of tunnels. This study proposes five machine learning models, namely deep neural network (DNN), K-nearest neighbors (KNN), Gaussian process regression (GPR), support vector regression (SVR), and decision trees (DT), to predict convergence during or shortly after tunnel excavation. To this end, a database of 650 datasets (440 for training, 110 for validation, and 100 for testing) was gathered from previously constructed tunnels. The database contains 12 parameters affecting tunnel convergence and one target, the tunnel wall convergence. Both 5-fold cross-validation and hold-out validation were used to analyze the predicted outcomes of the ML models. Finally, the DNN method was proposed as the most robust model. In addition, the backward selection method was used to assess each parameter's contribution to the prediction problem. The results showed that the parameters with the highest and lowest impact on tunnel convergence are tunnel depth and tunnel width, respectively.

Robust ridge regression for nonlinear mixed effects models with applications to quantitative high throughput screening assay data (비선형 혼합효과모형에서의 로버스트 능형회귀 방법과 정량적 고속 대량 스크리닝 자료에의 응용)

  • Yoo, Jiseon;Lim, Changwon
    • The Korean Journal of Applied Statistics / v.31 no.1 / pp.123-137 / 2018
  • Nonlinear mixed effects models are mainly used to analyze repeated measurement data in various fields. Such a model consists of two stages: the first-stage individual-level model accounts for intra-individual variation, and the second-stage population model accounts for inter-individual variation. The first-stage individual-level model estimates the parameters of a nonlinear regression model. It is the same as the general nonlinear regression model, and its parameters are usually estimated by least squares. However, least squares estimation can produce extremely large parameter estimates and standard errors when the assumed nonlinear function is not clearly revealed by the data. In this paper, a new estimation method is proposed to solve this problem by introducing the ridge regression method, recently proposed for nonlinear regression models, into the first-stage individual-level model of the nonlinear mixed effects model. The performance of the proposed estimator is compared with that of the standard estimator through a simulation study. The proposed methodology is also illustrated using quantitative high throughput screening data obtained from the US National Toxicology Program.

Analysis of AI interview data using unified non-crossing multiple quantile regression tree model (통합 비교차 다중 분위수회귀나무 모형을 활용한 AI 면접체계 자료 분석)

  • Kim, Jaeoh;Bang, Sungwan
    • The Korean Journal of Applied Statistics / v.33 no.6 / pp.753-762 / 2020
  • With increasing interest in integrating artificial intelligence (AI) into interview processes, the Republic of Korea (ROK) army is trying to lead and analyze an AI-powered interview platform. This study analyzes the AI interview data using a unified non-crossing multiple quantile regression tree (UNQRT) model. Compared to the UNQRT, existing models such as quantile regression and the quantile regression tree (QRT) model are inadequate for the analysis of AI interview data. Specifically, the linearity assumption of quantile regression is overly strong for this application. While the QRT model seems applicable because it relaxes the linearity assumption, it suffers from crossing among the estimated quantile functions, which leads to an uninterpretable model. The UNQRT circumvents the crossing problem by simultaneously estimating multiple quantile functions under a non-crossing constraint, and it is robust at extreme quantiles. Furthermore, the single tree constructed by the UNQRT leads to a more interpretable model than the QRT model. In this study, using the UNQRT, we explored the relationship between the results of the Army AI interview system and the existing personnel data to derive meaningful results.
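
The crossing problem mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the UNQRT method: it only shows the pinball (check) loss that quantile regression minimizes, and a check for whether quantile predictions, ordered by increasing quantile level, cross. The function names and the sample numbers are hypothetical.

```python
def pinball_loss(y, q, tau):
    """Check loss for predicting q as the tau-quantile when y is observed."""
    diff = y - q
    return tau * diff if diff >= 0 else (tau - 1) * diff


def best_constant_quantile(values, tau):
    """Pick the observed value that minimizes the total pinball loss."""
    return min(values, key=lambda q: sum(pinball_loss(y, q, tau) for y in values))


def has_crossing(quantile_preds):
    """True if predictions ordered by increasing tau are not non-decreasing."""
    return any(lo > hi for lo, hi in zip(quantile_preds, quantile_preds[1:]))


# Hypothetical sample with one extreme value.
values = [1, 2, 3, 4, 100]
median_fit = best_constant_quantile(values, 0.5)
upper_fit = best_constant_quantile(values, 0.9)

# Quantiles fitted on the same data are ordered; separately fitted
# quantile functions (made-up numbers below) need not be.
ordered = has_crossing([median_fit, upper_fit])
crossed = has_crossing([5.0, 4.2, 6.1])
```

A non-crossing constraint, as described in the abstract, rules out the second situation by construction instead of detecting it afterwards.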

Eye Gaze Tracking System Under Natural Head Movements (머리 움직임이 자유로운 안구 응시 추정 시스템)

  • ;Matthew, Sked;Qiang, Ji
    • Journal of the Institute of Electronics Engineers of Korea SC / v.41 no.5 / pp.57-64 / 2004
  • We propose an eye gaze tracking system that works under natural head movements. It consists of one narrow-field-of-view CCD camera, two mirrors whose reflective angles are controlled, and active infra-red illumination. The mirrors' angles were computed by geometric and linear algebra calculations to put the pupil images on the optical axis of the camera. Our system allowed the subject's head to move 90cm horizontally and 60cm vertically, and the spatial resolutions were about 6$^{\circ}$ and 7$^{\circ}$, respectively. The frame rate for estimating gaze points was 10~15 frames/sec. As the gaze mapping function, we used hierarchical generalized regression neural networks (H-GRNN) based on the two-pass GRNN. Gaze accuracy was 94% with H-GRNN, an improvement of 9 percentage points over the 85% of GRNN, even when the head or face was slightly rotated. Our system does not have a high spatial gaze resolution, but it allows natural head movements and provides robust and accurate gaze tracking. In addition, there is no need to re-calibrate the system when the subject changes.

The Decline of Health-Related Quality of Life Associated with Some Diseases in Korean Adults (우리나라 성인에서 일부 질환과 연관된 건강관련 삶의 질 감소)

  • Kil, Seol-Ryoung;Lee, Sang-Il;Yun, Sung-Cheol;An, Hyung-Mi;Jo, Min-Woo
    • Journal of Preventive Medicine and Public Health / v.41 no.6 / pp.434-441 / 2008
  • Objectives: This study was conducted to measure the decline in the health-related quality of life (HRQoL) associated with some diseases in South Korean adults. Methods: The EQ-5D health states in the 2005 National Health and Nutrition Examination Survey (NHNES) and the Korean EQ-5D valuation set were used to obtain the EQ-5D indexes of the study subjects. A subject was assigned to a disease group if they reported to the NHNES that a physician had diagnosed them with the corresponding disease during the previous year. Since the distributions of the EQ-5D indexes in each subgroup were negatively skewed, median regression analysis was used to estimate the effects of specific diseases on the HRQoL. Median regression analysis produces estimates that approximate the median of the EQ-5D indexes, and they are more robust for analyzing data with many outliers. Results: A total of 16,692 subjects (6,667 patients and 10,025 people without any disease) were included in the analysis. In the median regression analysis, stroke had the strongest impact on the HRQoL for both males and females, followed by osteoporosis, osteoarthritis, rheumatic arthritis, and herniation of an intervertebral disc. While asthma had a significant impact on the HRQoL only in men, cataract, temporo-mandibular dysfunction, and peptic ulcer significantly affected the HRQoL only in women. Conclusions: Stroke and musculoskeletal diseases were associated with the largest losses of the HRQoL in Korean adults.
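
The robustness argument in this abstract can be sketched in Python under a strong simplification: with a single binary disease indicator and no other covariates, one median regression solution is the difference between the two group medians. The index values below are invented for illustration and are not taken from the study, which presumably adjusted for more variables.

```python
from statistics import mean, median

# Hypothetical EQ-5D index values (illustrative only, not study data).
# Most values sit near the top of the scale, with one low outlier per
# group, which gives the negative skew described in the abstract.
no_disease = [1.0, 1.0, 0.95, 0.95, 0.9, 0.9, 0.85, 0.2]
with_disease = [0.9, 0.85, 0.8, 0.8, 0.75, 0.7, 0.7, -0.1]


def group_effect(center, exposed, reference):
    """Estimated disease effect as a difference of group centers."""
    return center(exposed) - center(reference)


mean_effect = group_effect(mean, with_disease, no_disease)
median_effect = group_effect(median, with_disease, no_disease)

# Make the single outlier in the disease group more extreme.
shifted = with_disease[:-1] + [-0.5]
mean_effect_shifted = group_effect(mean, shifted, no_disease)
median_effect_shifted = group_effect(median, shifted, no_disease)
```

In this sketch the median-based effect is unchanged when the outlier moves, while the mean-based effect shifts with it, which is the sense in which median regression is described as more robust.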

Estimation of Moisture Content in Comminuted Miscanthus based on the Intensity of Reflected Light

  • Cho, Yongjin;Lee, Dong Hoon
    • Journal of Biosystems Engineering / v.40 no.3 / pp.296-304 / 2015
  • Purpose: The balance between miscanthus production and its cost effectiveness depends greatly on its moisture content during post processing. The objective of this research was to measure the moisture content using a non-destructive and non-contact methodology for in situ applications. Methods: The moisture content of comminuted miscanthus was controlled using a closed chamber, a humidifier, a precision weigher, and real-time monitoring software developed in this research. A CMOS sensor equipped with a $50{\times}$ magnifier lens was used to capture magnified images of the conditioned materials with moisture content levels from 5 to 30%. The hypothesis is that when light is incident on the comminuted particles at an incline, higher moisture content results in light being reflected with a higher intensity. Results: A linear regression analysis for an initial hypothesis based on general histogram analysis yielded insufficient correlations with low significance level (<0.31) for the determination coefficient. A significant relationship (94% confidence level) was determined at level 108 in a reverse accumulative histogram proposed based on a revised hypothesis. A linear regression model with the value at level 108 in the reverse accumulative histogram for a magnified image as the independent variable and the moisture content of comminuted miscanthus as the dependent variable was proposed as the estimation model. The calibrated linear regression model with a slope of 92.054 and an offset of 32.752 yielded 0.94 for the determination coefficient (RMSE = 0.2%). The validation test showed a significant relationship at the 74% confidence level with RMSE 6.4% (n = 36). Conclusions: To compensate for the inconsistent significance between calibration and validation, an estimation model robust against various systematic interferences is necessary. The economic efficiency of miscanthus, which is a promising energy resource, can be improved by the real-time measurement of its crucial material properties.

Robust selection rules of k in ridge regression (능형회귀에서의 로버스트한 k의 선택 방법)

  • 임용빈
    • The Korean Journal of Applied Statistics / v.6 no.2 / pp.371-381 / 1993
  • When multicollinearity is present in the standard linear regression model, ridge regression might be used to mitigate the effects of collinearity. As a prediction-oriented criterion, the integrated mean square error criterion $J_w(k)$ was introduced by Lim, Choi & Park (1980). By noting the equivalence between the $C_k$ criterion and $J_w(k)$ with a special choice of weight function $W(x)$, we propose a more reasonable selection rule of k w.r.t. the $C_k$ criterion than that given in Myers (1986). Next, to find the $\beta(k)$ which behaves reasonably well w.r.t. competing criteria, we adopt the minimax principle in the sense of maximizing the worst relative efficiency of k among competing criteria.
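
To make the role of k concrete, here is a minimal Python sketch of the textbook ridge estimator $\beta(k) = (X'X + kI)^{-1}X'y$ for two nearly collinear predictors without an intercept. It does not implement the $C_k$ or $J_w(k)$ selection rules discussed in the abstract. The data and the function name are hypothetical.

```python
def ridge_two_predictors(rows, y, k):
    """Return beta(k) = (X'X + kI)^{-1} X'y for two predictors, no intercept."""
    # Entries of the symmetric 2x2 matrix X'X + kI.
    a = sum(x1 * x1 for x1, _ in rows) + k
    b = sum(x1 * x2 for x1, x2 in rows)
    d = sum(x2 * x2 for _, x2 in rows) + k
    # Entries of X'y.
    g1 = sum(x1 * yi for (x1, _), yi in zip(rows, y))
    g2 = sum(x2 * yi for (_, x2), yi in zip(rows, y))
    # Closed-form inverse of a 2x2 matrix applied to X'y.
    det = a * d - b * b
    return ((d * g1 - b * g2) / det, (a * g2 - b * g1) / det)


# Hypothetical data in which the second predictor nearly copies the first.
rows = [(1.0, 1.01), (2.0, 1.98), (3.0, 3.02), (4.0, 3.99)]
y = [2.1, 3.9, 6.2, 7.9]

ks = (0.0, 0.1, 1.0, 10.0)
betas = [ridge_two_predictors(rows, y, k) for k in ks]
norms = [(b1 * b1 + b2 * b2) ** 0.5 for b1, b2 in betas]
```

As k grows, the length of the coefficient vector shrinks. Choosing k trades this stabilization against bias, which is why a selection rule for k is needed.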

A Study on the Development of Stress Testing Model for Korean Banks: Optimal Design of Monte Carlo Simulation and BIS Forecasting (국내은행 스트레스테스트 모형개선에 관한 연구: 최적 몬테카를로 시뮬레이션 탐색과 BIS예측을 중심으로)

  • Chaehwan Won;Jinyul Yang
    • Asia-Pacific Journal of Business / v.14 no.1 / pp.149-169 / 2023
  • Purpose - The main purpose of this study is to develop a stress test model for Korean banks by exploring the optimal Monte Carlo simulation and BIS forecasting model. Design/methodology/approach - This study selects 15 Korean banks as sample financial firms and collects 76 quarters of relevant data for the period between 2000 and 2018 from KRX (Korea Exchange), the Bank of Korea, and FnGuide. Regression analysis, unit-root tests, and Monte Carlo simulation are employed to analyze the data. Findings - First, most of the sample banks failed to keep an 8% BIS ratio under the adverse and severely adverse scenarios, implying that Korean banks must make every effort to achieve better BIS ratios under adverse market conditions. Second, we suggest a better Monte Carlo simulation model for Korean banks, finding that the appropriate volatility should differ by variable rather than being the simple two-sigma used in previous studies. Third, we find that the stepwise regression model fits better than the simple regression model in forecasting macro-economic variables for the BIS variables. Fourth, we find that, for more robust and significant statistical results in designing stress tests, Korean banks need to construct a more valid time-series and cross-sectional database. Research implications or Originality - Taken together, the above results show that the optimal volatility in designing an optimal Monte Carlo simulation varies depending on the country, and that many Korean banks fail to pass the stress test under the adverse and severely adverse scenarios, implying that Korean banks need to improve their BIS ratios.

Using Bayesian tree-based model integrated with genetic algorithm for streamflow forecasting in an urban basin

  • Nguyen, Duc Hai;Bae, Deg-Hyo
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.140-140 / 2021
  • Urban flood management is a crucial and challenging task, particularly in developed cities. Accurate prediction of urban flooding under heavy precipitation is therefore critically important to address this challenge. In recent years, machine learning techniques have received considerable attention for their strong learning ability and suitability for modeling complex and nonlinear hydrological processes. Moreover, a survey of the published literature finds that hybrid computational intelligent methods using nature-inspired algorithms have been increasingly employed to predict or simulate streamflow with high reliability. The present study aims to propose a novel approach: an ensemble tree model, Bayesian Additive Regression Trees (BART), incorporating a nature-inspired algorithm to predict hourly multi-step-ahead streamflow. To this end, a hybrid intelligent model, GA-BART, was developed by integrating the BART model with a genetic algorithm (GA). The Jungrang urban basin located in Seoul, South Korea, was selected as a case study. A database was established based on 39 heavy rainfall events between 2003 and 2020 that were collected from the rain gauges and monitoring station system in the basin. For the goal of this study, different step-ahead models will be developed based on these methods, including 1-hour, 2-hour, 3-hour, 4-hour, 5-hour, and 6-hour step-ahead streamflow predictions. In addition, the hybrid BART model is compared with a baseline model such as support vector regression. It is expected that the hybrid BART model has a robust performance and can be an optional choice in streamflow forecasting for urban basins.
