• Title/Summary/Keyword: multiple regression techniques


Application of Multiple Linear Regression Analysis and Tree-Based Machine Learning Techniques for Cutter Life Index(CLI) Prediction (커터수명지수 예측을 위한 다중선형회귀분석과 트리 기반 머신러닝 기법 적용)

  • Ju-Pyo Hong;Tae Young Ko
    • Tunnel and Underground Space
    • /
    • v.33 no.6
    • /
    • pp.594-609
    • /
    • 2023
  • The TBM (Tunnel Boring Machine) method is gaining popularity in urban and underwater tunneling projects because it ensures excavation face stability and minimizes environmental impact. Among the prominent models for predicting disc cutter life, the NTNU model uses the Cutter Life Index (CLI) as a key parameter, but the complexity of the testing procedures and the rarity of the equipment make it difficult to measure. In this study, CLI was predicted from rock properties using multiple linear regression analysis and tree-based machine learning techniques. Through a literature review, a database including rock uniaxial compressive strength, Brazilian tensile strength, equivalent quartz content, and Cerchar abrasivity index was built, and derived variables were added. Input variables for the multiple linear regression analysis were selected on the basis of statistical significance and multicollinearity, while those for the machine learning models were chosen by variable importance. The data were divided into 80% for training and 20% for testing, the predictive performance of the models was compared, and XGBoost was identified as the optimal model. The validity of the multiple linear regression and XGBoost models derived in this study was confirmed by comparing their predictive performance with prior research.
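
    A minimal sketch of two steps mentioned in this abstract: screening predictors for multicollinearity and splitting the data 80/20. The abstract does not say which multicollinearity diagnostic was used; the variance inflation factor (VIF) is assumed here as a common choice. The function names `vif` and `train_test_split` and the synthetic columns are illustrative only and are not the paper's data.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X (n_samples x n_features)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        target = X[:, j]
        # Regress column j on an intercept plus all other columns.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)

def train_test_split(n, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices) from a random permutation of 0..n-1."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_test = int(round(n * test_fraction))
    return idx[n_test:], idx[:n_test]

# Synthetic predictors (assumption): x3 is nearly a copy of x1, x2 is independent.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.05 * rng.normal(size=200)
vifs = vif(np.column_stack([x1, x2, x3]))
train_idx, test_idx = train_test_split(200)
```

    Columns with a large VIF (x1 and x3 here) carry nearly the same information, so one of them would typically be dropped before fitting the regression.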

Fault Detection in Semiconductor Manufacturing Using Statistical Method

  • Lim, Woo-Yup;Jeon, Sung-Ik;Han, Seung-Soo;Soh, Dae-Wha;Hong, Sang-Jeen
    • Proceedings of the Korean Institute of Electrical and Electronic Material Engineers Conference
    • /
    • 2009.11a
    • /
    • pp.44-44
    • /
    • 2009
  • Fault detection is necessary for yield enhancement and cost reduction in semiconductor manufacturing. The sensor data acquired from a semiconductor processing tool are too voluminous to analyze directly for fault detection and classification (FDC). We studied fault detection techniques based on statistical methods. Multiple regression analysis detected faults without difficulty and made it easy to build a model. To achieve real-time operation and fast computation, the large data set was analyzed step by step. We also considered interactions and critical factors among the tool parameters and the process.


QSPR Study of the Absorption Maxima of Azobenzene Dyes

  • Xu, Jie;Wang, Lei;Liu, Li;Bai, Zikui;Wang, Luoxin
    • Bulletin of the Korean Chemical Society
    • /
    • v.32 no.11
    • /
    • pp.3865-3872
    • /
    • 2011
  • A quantitative structure-property relationship (QSPR) study was performed to predict the absorption maxima of azobenzene dyes. The entire set of 191 azobenzenes was divided into a training set of 150 azobenzenes and a test set of 41 azobenzenes according to the Kennard-Stone algorithm. A seven-descriptor model, with a squared correlation coefficient ($R^2$) of 0.8755 and a standard error of estimation (s) of 14.476, was developed by applying stepwise multiple linear regression (MLR) analysis to the training set. The reliability of the proposed model was further illustrated using various evaluation techniques: a leave-many-out cross-validation procedure, randomization tests, and validation through the test set.
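
    The Kennard-Stone split named in this abstract can be sketched as follows. This is a plain illustration of the usual rule (start from the two most distant samples, then repeatedly add the sample whose nearest already-selected neighbor is farthest away), using Euclidean distance. The function name `kennard_stone` and the toy descriptor values are assumptions, not the paper's data.

```python
import math

def kennard_stone(points, n_train):
    """Return indices of n_train points chosen by the Kennard-Stone rule."""
    n = len(points)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and len(points)")
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Start with the two points that are farthest apart.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: dist[ij[0]][ij[1]],
    )
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in (i0, j0)]
    while len(selected) < n_train:
        # Add the candidate whose nearest selected point is farthest away.
        best = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy two-descriptor data (hypothetical values).
descriptors = [(0.0, 0.0), (1.0, 0.0), (0.9, 0.1), (5.0, 5.0), (2.5, 2.5), (0.1, 0.0)]
train_idx = kennard_stone(descriptors, 3)
test_idx = [k for k in range(len(descriptors)) if k not in train_idx]
```

    The selected training samples span the descriptor space, and the remaining samples form the test set.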

Parent's Supportive Parenting and Adolescent Sexual Values (부모의 지지적 양육행동과 청소년의 성가치관)

  • Min, Ha-Yeoung;Kim, Koung-Hwa
    • Korean Journal of Child Studies
    • /
    • v.26 no.6
    • /
    • pp.59-71
    • /
    • 2005
  • The purpose of this study was to investigate the relationship between parents' supportive parenting and adolescent sexual values. The subjects were 137 adolescents attending high school in Keoungbok. The statistical techniques used were factor analysis, crosstabs, two-way ANOVA, the Scheffé test, and multiple regression. The results were as follows. First, adolescents who perceived more supportive parenting were more likely to consult their parents about their own sexual problems. Second, adolescent sexual values differed significantly by level of supportive parenting and by gender: adolescents who perceived more supportive parenting, and boys, were more likely to have positive sexual values. However, there was no significant interaction effect of supportive parenting level and gender on adolescent sexual values. Finally, the multiple regression analysis showed that gender was a stronger predictor of adolescent sexual values than supportive parenting.


A Hybrid Algorithm for Identifying Multiple Outliers in Linear Regression

  • Kim, Bu-yong;Kim, Hee-young
    • Communications for Statistical Applications and Methods
    • /
    • v.9 no.1
    • /
    • pp.291-304
    • /
    • 2002
  • This article is concerned with an effective algorithm for the identification of multiple outliers in linear regression. It proposes a hybrid algorithm which employs the least median of squares estimator, instead of the least squares estimator, to construct an initial clean subset in the stepwise forward search scheme. The performance of the proposed algorithm is evaluated and compared with the existing competitor via an extensive Monte Carlo simulation. The algorithm appears to be superior to the competitor in most of the scenarios explored in the simulation study. In particular, it copes with the masking problem quite well. In addition, the orthogonal decomposition and its updating techniques are considered to improve the computational efficiency and numerical stability of the algorithm.
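
    A rough sketch of how a least median of squares (LMS) fit can seed a clean subset, for the single-predictor case only. This is not the authors' hybrid algorithm: the LMS line is approximated by trying every line through two data points, and the names `lms_line` and `initial_clean_subset` and the toy data are assumptions made for illustration.

```python
from itertools import combinations
from statistics import median

def lms_line(x, y):
    """Approximate the LMS line by searching all two-point fits."""
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line cannot be written as y = a + b*x
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        # LMS criterion: median of the squared residuals.
        crit = median((yk - intercept - slope * xk) ** 2 for xk, yk in zip(x, y))
        if best is None or crit < best[0]:
            best = (crit, intercept, slope)
    if best is None:
        raise ValueError("need at least two points with distinct x values")
    return best[1], best[2]

def initial_clean_subset(x, y, size):
    """Indices of the `size` observations closest to the LMS line."""
    intercept, slope = lms_line(x, y)
    order = sorted(range(len(x)), key=lambda k: (y[k] - intercept - slope * x[k]) ** 2)
    return sorted(order[:size])

# Toy data (hypothetical): y = 1 + 2x with two gross outliers at the end.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [1, 3, 5, 7, 9, 11, 40, 45]
clean = initial_clean_subset(x, y, 6)
```

    Because the criterion is a median rather than a sum, the two outliers do not pull the fitted line toward themselves, so they are left out of the starting subset that a forward search would then grow.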

A prediction method of ice breaking resistance using a multiple regression analysis

  • Cho, Seong-Rak;Lee, Sungsu
    • International Journal of Naval Architecture and Ocean Engineering
    • /
    • v.7 no.4
    • /
    • pp.708-719
    • /
    • 2015
  • The two most important tasks of icebreakers are first to secure a sailing route by breaking thick sea ice and second to sail efficiently for purposes of exploration and transportation in the polar seas. The resistance of icebreakers is a priority factor at the preliminary design stage; not only must their sailing efficiency be satisfied, but the design of the propulsion system will be directly affected. Therefore, the performance of icebreakers must be accurately calculated and evaluated through model tests in an ice tank before construction starts. In this paper, a new procedure is developed, based on model tests, to estimate a ship's ice-breaking resistance during continuous ice-breaking. Some of the factors associated with crushing failures are systematically considered in order to correctly estimate the ice-breaking resistance. This study is intended to contribute to the improvement of ice resistance prediction techniques for ice-breaking ships.

A Comparative Study of the Results of the Regression Analysis by Linear Programming (선형계획법을 이용한 회귀분석 결과의 비교 연구)

  • Kim, Gwang-Su;Jeong, Ji-An;Lee, Jin-Gyu
    • Journal of Korean Society for Quality Management
    • /
    • v.21 no.1
    • /
    • pp.161-170
    • /
    • 1993
  • This study presents linear regression analysis involving more than one regressor variable, since regression analysis is the most widely used statistical technique for describing, predicting, and estimating relationships in given data. The multiple linear regression model can be solved directly by two linear programming methods: minimizing the sum of absolute deviations (MSD) and minimizing the maximum deviation (MMD). In addition, the results of the techniques were compared for accuracy and tested for statistical validity.
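
    The two linear programming formulations named in this abstract can be written out in their standard textbook form. The notation ($y_i$, $x_{ij}$, $\beta_j$, $u_i$, $v_i$, $t$) is assumed here for illustration and is not taken from the paper.

```latex
% Model: y_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} + e_i, \qquad i = 1, \dots, n.

% MSD: minimize the sum of absolute deviations.
% Each residual is split into nonnegative parts, e_i = v_i - u_i.
\begin{align*}
\min_{\beta, u, v} \quad & \sum_{i=1}^{n} (u_i + v_i) \\
\text{s.t.} \quad & \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} + v_i - u_i = y_i,
  \qquad i = 1, \dots, n, \\
& u_i \ge 0, \quad v_i \ge 0, \quad \beta_j \text{ unrestricted in sign}.
\end{align*}

% MMD: minimize the maximum absolute deviation t.
\begin{align*}
\min_{\beta, t} \quad & t \\
\text{s.t.} \quad & -t \le y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \le t,
  \qquad i = 1, \dots, n, \\
& t \ge 0, \quad \beta_j \text{ unrestricted in sign}.
\end{align*}
```

    At an optimum of the MSD problem at most one of $u_i$ and $v_i$ is nonzero for each observation, so $u_i + v_i$ equals the absolute residual $|e_i|$.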


Development of Medical Cost Prediction Model Based on the Machine Learning Algorithm (머신러닝 알고리즘 기반의 의료비 예측 모델 개발)

  • Han Bi KIM;Dong Hoon HAN
    • Journal of Korea Artificial Intelligence Association
    • /
    • v.1 no.1
    • /
    • pp.11-16
    • /
    • 2023
  • Accurate hospital case modeling and prediction are crucial for efficient healthcare. In this study, we demonstrate the implementation of regression analysis methods in machine learning systems utilizing mathematical statistics and machine learning techniques. The developed machine learning model includes Bayesian linear, artificial neural network, decision tree, decision forest, and linear regression analysis models. Through the application of these algorithms, corresponding regression models were constructed and analyzed. The results suggest the potential of leveraging machine learning systems for medical research. The experiment aimed to create an Azure Machine Learning Studio tool for the speedy evaluation of multiple regression models. The tool facilitates the comparison of 5 types of regression models in a unified experiment and presents assessment results with performance metrics. Evaluation of regression machine learning models highlighted the advantages of boosted decision tree regression and decision forest regression in hospital case prediction. These findings could lay the groundwork for the deliberate development of new directions in medical data processing and decision making. Furthermore, potential avenues for future research may include exploring methods such as clustering, classification, and anomaly detection in healthcare systems.

COMPARISON OF OPERATIVE TECHNIQUES BETWEEN FEMALE AND MALE DENTISTS IN CLASS 2 AND CLASS 5 RESIN COMPOSITE RESTORATIONS (2급/ 5급 와동 복합레진 수복 술식에 대한 남녀 치과 의사의 비교)

  • Chang, Ju-Hea;Kim, Hae-Young;Son, Ho-Hyun
    • Restorative Dentistry and Endodontics
    • /
    • v.35 no.2
    • /
    • pp.116-124
    • /
    • 2010
  • This study aimed to assess whether the gender of the dental practitioner affects operative techniques in class 2 and class 5 resin composite restorations. In 2008, a nationwide survey was given to Korean dentists. A total of 12,193 e-mails were distributed, 2,632 were opened by recipients, and 840 responses were collected. Of the respondents, 78.9% were male and 21.1% were female. The gender distribution across age groups did not differ between the respondents and the total population (p > 0.05). A chi-square test was used to compare technical differences between female and male dentists. A multiple logistic regression analysis was performed to assess the association between gender and operative techniques in resin composite restoration. For class 2 resin composite restoration, female dentists were 1.87 times more likely than male dentists to do multiple incremental fillings (four layers or more) and 2.72 times more likely than males to spend 30 minutes or more on the treatment (p < 0.05). For class 5 resin composite restoration, female dentists were 2.69 times more likely than their male counterparts to use a cavity base or liner, 1.83 times more likely to do multiple incremental fillings (four layers or more), and 1.63 times more likely to spend 20 minutes or more on the procedure (p < 0.05). Gender influenced individual operative techniques in restorative treatment.
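
    The "times more likely" figures from a logistic regression are usually odds ratios, which is assumed here since the abstract does not say so explicitly. The sketch below shows the two standard relations: an odds ratio from a 2x2 table, and the odds ratio as the exponential of a logistic regression coefficient. The counts are hypothetical and are not the survey's data.

```python
import math

def odds_ratio(exposed_yes, exposed_no, unexposed_yes, unexposed_no):
    """Odds ratio of a 2x2 table: odds in the exposed group / odds in the unexposed group."""
    return (exposed_yes * unexposed_no) / (exposed_no * unexposed_yes)

def coefficient_to_odds_ratio(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)

# Hypothetical counts: 30 of 100 in one group vs 20 of 100 in the other use a technique.
example_or = odds_ratio(30, 70, 20, 80)

# A reported odds ratio of 1.87 corresponds to a coefficient of ln(1.87).
beta = math.log(1.87)
recovered = coefficient_to_odds_ratio(beta)
```

    In a multiple logistic regression the coefficient is adjusted for the other covariates in the model, so the resulting odds ratio generally differs from the crude 2x2 value.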

Evaluation of Surrogate Monitoring Parameters for SS and T-P Using Multiple Linear Regression and Random Forest (다중 선형 회귀 분석과 랜덤 포레스트를 이용한 SS, T-P 대리모니터링 기법 평가)

  • Jeung, Minhyuk;Beom, Jina;Choi, Dongho;Kim, Young-joo;Her, Younggu;Yoon, Kwangsik
    • Journal of The Korean Society of Agricultural Engineers
    • /
    • v.63 no.2
    • /
    • pp.51-60
    • /
    • 2021
  • Effective nonpoint source (NPS) pollution management requires frequent water quality monitoring, which is, however, often costly to implement in practice. Statistical techniques and machine learning methods allow us to identify and focus on fundamental environmental variables that have close relationships with NPS pollutants of interest. This study developed surrogate models to predict the concentrations of suspended sediment (SS) and total phosphorus (T-P) from turbidity and runoff discharge rates using multiple linear regression (MLR) and random forest (RF) methods. The RF models provided acceptable performance in predicting SS and T-P, especially when runoff discharge rates were high. The RF models outperformed the MLR models in all cases. This finding highlights the potential of RF techniques and models as a tool to identify fundamental environmental variables that are measured in relatively inexpensive ways or are freely available but can still provide the information required to quantify the concentrations of NPS pollutants. The analysis of relative importance rates showed that the temporal variations of SS and T-P concentrations could be explained more effectively by those of turbidity than by those of runoff discharge rate. This study demonstrated that advanced statistical techniques such as machine learning could help to improve the efficiency of NPS pollutant monitoring.