• Title/Abstract/Keywords: multiple regression techniques

258 search results, processing time 0.026 s

커터수명지수 예측을 위한 다중선형회귀분석과 트리 기반 머신러닝 기법 적용 (Application of Multiple Linear Regression Analysis and Tree-Based Machine Learning Techniques for Cutter Life Index(CLI) Prediction)

  • 홍주표;고태영
    • 터널과지하공간
    • /
    • Vol. 33, No. 6
    • /
    • pp.594-609
    • /
    • 2023
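  • The TBM method secures the stability of the excavation face and minimizes the impact on the surrounding environment, so its use is increasing in urban areas and in underwater and subsea tunnels. Among the representative models for predicting disc cutter life, the NTNU model uses the Cutter Life Index (CLI) as a key parameter, but CLI is difficult to measure because the test procedure is complex and the test equipment is scarce. In this study, CLI was predicted from rock properties using multiple linear regression analysis and tree-based machine learning techniques. A database was built from a literature survey, covering the uniaxial compressive strength, Brazilian tensile strength, equivalent quartz content, and Cerchar abrasivity index of rocks, and derived variables were calculated and added. For the multiple linear regression analysis, input variables were selected by considering statistical significance and multicollinearity; for the machine learning prediction models, input variables were selected based on variable importance. The data were split into training and validation sets at a ratio of 8:2 and the predictive performance of the models was compared, with XGBoost selected as the optimal model. The validity of the multiple linear regression model and the XGBoost model derived in this study was confirmed by comparing their predictive performance with that of previous studies.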

Fault Detection in Semiconductor Manufacturing Using Statistical Method

  • Lim, Woo-Yup;Jeon, Sung-Ik;Han, Seung-Soo;Soh, Dae-Wha;Hong, Sang-Jeen
    • 한국전기전자재료학회:학술대회논문집
    • /
    • 한국전기전자재료학회 2009년도 추계학술대회 논문집
    • /
    • pp.44-44
    • /
    • 2009
  • Fault detection is necessary for yield enhancement and cost reduction in semiconductor manufacturing. The sensor data acquired from a semiconductor processing tool are too large to analyze directly for fault detection and classification (FDC). We studied fault detection techniques based on statistical methods. Multiple regression analysis detected faults reliably, and the model is easy to build. To achieve real-time operation and short computing time, the large data set was analyzed separately for each step. We also considered interactions and critical factors among the tool parameters and the process.

QSPR Study of the Absorption Maxima of Azobenzene Dyes

  • Xu, Jie;Wang, Lei;Liu, Li;Bai, Zikui;Wang, Luoxin
    • Bulletin of the Korean Chemical Society
    • /
    • Vol. 32, No. 11
    • /
    • pp.3865-3872
    • /
    • 2011
  • A quantitative structure-property relationship (QSPR) study was performed to predict the absorption maxima of azobenzene dyes. The entire set of 191 azobenzenes was divided into a training set of 150 azobenzenes and a test set of 41 azobenzenes according to the Kennard-Stone algorithm. A seven-descriptor model, with a squared correlation coefficient ($R^2$) of 0.8755 and a standard error of estimation (s) of 14.476, was developed by applying stepwise multiple linear regression (MLR) analysis to the training set. The reliability of the proposed model was further illustrated using several evaluation techniques: a leave-many-out cross-validation procedure, randomization tests, and validation on the test set.
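
The training/test division mentioned in this abstract can be illustrated with a minimal sketch of the Kennard-Stone selection rule: start from the two most distant samples, then repeatedly add the sample whose nearest already-selected neighbour is farthest away. The function name `kennard_stone`, the Euclidean distance, and the toy data are assumptions made for illustration; the abstract does not say which descriptor space or distance the authors used.

```python
from math import dist


def kennard_stone(points, n_select):
    """Return indices of n_select points chosen by the Kennard-Stone rule.

    points   : list of equal-length numeric tuples (descriptor vectors)
    n_select : number of samples to place in the training set
    """
    n = len(points)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and len(points)")

    # Step 1: seed the selection with the two points that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist(points[pair[0]], points[pair[1]]),
    )
    selected = [first, second]
    remaining = set(range(n)) - {first, second}

    # min_dist[i] = distance from candidate i to its nearest selected point.
    min_dist = {
        i: min(dist(points[i], points[first]), dist(points[i], points[second]))
        for i in remaining
    }

    # Step 2: repeatedly add the candidate that is farthest from the selection.
    while len(selected) < n_select:
        nxt = max(sorted(remaining), key=lambda i: min_dist[i])
        selected.append(nxt)
        remaining.remove(nxt)
        for i in remaining:
            min_dist[i] = min(min_dist[i], dist(points[i], points[nxt]))
    return selected


# Toy one-dimensional example (hypothetical data, not from the paper).
samples = [(0.0,), (1.0,), (10.0,), (5.0,), (9.0,)]
train_idx = kennard_stone(samples, 3)
test_idx = [i for i in range(len(samples)) if i not in train_idx]
```

The selected indices form the training set and the rest form the test set, so the training samples span the descriptor space as evenly as the rule allows.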

부모의 지지적 양육행동과 청소년의 성가치관 (Parent's Supportive Parenting and Adolescent Sexual Values)

  • 민하영;김경화
    • 아동학회지
    • /
    • Vol. 26, No. 6
    • /
    • pp.59-71
    • /
    • 2005
  • The purpose of this study was to investigate the relationship between parents' supportive parenting and adolescent sexual values. The subjects were 137 adolescents attending high school in Keoungbok. The statistical techniques used were factor analysis, crosstabs, two-way ANOVA, the Scheffé test, and multiple regression. The results were as follows. First, adolescents who perceived more supportive parenting from a parent were more likely to consult their parents about their own sexual problems. Second, adolescent sexual values differed significantly by the level of supportive parenting and by gender. Adolescents who perceived more supportive parenting, and boys, were more likely to have positive sexual values. However, there was no significant interaction effect of supportive parenting level and gender on adolescent sexual values. Finally, the multiple regression analysis showed that gender was a stronger predictor of adolescent sexual values than parents' supportive parenting.

A Hybrid Algorithm for Identifying Multiple Outliers in Linear Regression

  • Kim, Bu-yong;Kim, Hee-young
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 9, No. 1
    • /
    • pp.291-304
    • /
    • 2002
  • This article is concerned with an effective algorithm for identifying multiple outliers in linear regression. It proposes a hybrid algorithm that employs the least median of squares estimator, instead of the least squares estimator, to construct an initial clean subset in the stepwise forward search scheme. The performance of the proposed algorithm is evaluated and compared with that of the existing competitor in an extensive Monte Carlo simulation. The algorithm appears to be superior to the competitor in most of the scenarios explored in the simulation study. In particular, it copes with the masking problem quite well. In addition, the orthogonal decomposition and its updating techniques are considered in order to improve the computational efficiency and numerical stability of the algorithm.
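
To make the role of the least median of squares (LMS) estimator concrete, here is a minimal sketch for a single regressor. It approximates the LMS line by trying every line through two data points and keeping the one with the smallest median squared residual, then takes the best-fitting observations as an initial clean subset. This is not the authors' hybrid algorithm: the function names, the two-point search, the use of the plain sample median, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import median


def lms_line(xs, ys):
    """Approximate the least-median-of-squares line y = a + b*x.

    Every line through two data points is a candidate; the candidate with
    the smallest median squared residual is returned as (a, b).
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical line cannot be written as y = a + b*x
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        crit = median((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
        if best is None or crit < best[0]:
            best = (crit, intercept, slope)
    if best is None:
        raise ValueError("need at least two points with distinct x values")
    return best[1], best[2]


def initial_clean_subset(xs, ys, size):
    """Indices of the `size` observations closest to the LMS line."""
    a, b = lms_line(xs, ys)
    by_residual = sorted(range(len(xs)), key=lambda k: (ys[k] - (a + b * xs[k])) ** 2)
    return sorted(by_residual[:size])


# Toy data: y = 1 + 2x with one gross outlier at the last point (hypothetical).
x = [0, 1, 2, 3, 4, 5, 6]
y = [1, 3, 5, 7, 9, 11, 50]
a, b = lms_line(x, y)
clean = initial_clean_subset(x, y, 5)
```

Because the median ignores the largest residuals, the outlier does not pull the fitted line toward itself, which is why a robust start of this kind helps against masking when a forward search later adds observations one at a time.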

A prediction method of ice breaking resistance using a multiple regression analysis

  • Cho, Seong-Rak;Lee, Sungsu
    • International Journal of Naval Architecture and Ocean Engineering
    • /
    • Vol. 7, No. 4
    • /
    • pp.708-719
    • /
    • 2015
  • The two most important tasks of an icebreaker are, first, to secure a sailing route by breaking thick sea ice and, second, to sail efficiently for exploration and transportation in the polar seas. The resistance of icebreakers is a priority factor at the preliminary design stage: not only must sailing efficiency requirements be met, but the design of the propulsion system is also directly affected. Therefore, the performance of icebreakers must be accurately calculated and evaluated through model tests in an ice tank before construction starts. In this paper, a new procedure based on model tests is developed to estimate a ship's icebreaking resistance during continuous icebreaking. Some of the factors associated with crushing failures are systematically considered in order to estimate the icebreaking resistance correctly. This study is intended to contribute to the improvement of ice resistance prediction techniques for icebreaking ships.

선형계획법을 이용한 회귀분석 결과의 비교 연구 (A Comparative Study of the Results of the Regression Analysis by Linear Programming)

  • 김광수;정지안;이진규
    • 품질경영학회지
    • /
    • Vol. 21, No. 1
    • /
    • pp.161-170
    • /
    • 1993
  • This study presents linear regression analysis with more than one regressor variable, since regression analysis is the most widely used statistical technique for describing, predicting, and estimating relationships in given data. The multiple linear regression model can be solved directly by two linear programming methods: minimizing the sum of the absolute deviations (MSD) and minimizing the maximum deviation (MMD). In addition, the results of each technique were compared for accuracy and tested for statistical validity.
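
For reference, the two criteria named in this abstract are commonly written as linear programs in the following textbook form. The notation ($y_i$ for the response, $x_{ij}$ for the regressors, $\beta_j$ for the coefficients, and the auxiliary variables $u_i$, $v_i$, $t$) is assumed here, and the abstract does not state the exact formulation the authors used.

```latex
% Minimizing the sum of absolute deviations (MSD):
% the residual is split into nonnegative parts u_i and v_i.
\begin{aligned}
\min_{\beta,\,u,\,v} \quad & \sum_{i=1}^{n} (u_i + v_i) \\
\text{s.t.} \quad & y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} = u_i - v_i,
  \qquad u_i \ge 0,\; v_i \ge 0, \qquad i = 1, \dots, n .
\end{aligned}

% Minimizing the maximum deviation (MMD):
% a single bound t caps every absolute residual.
\begin{aligned}
\min_{\beta,\,t} \quad & t \\
\text{s.t.} \quad & -t \le y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \le t,
  \qquad i = 1, \dots, n .
\end{aligned}
```

In both programs the coefficients $\beta_0, \dots, \beta_p$ are unrestricted in sign, and at an optimum $u_i + v_i$ equals the absolute residual of observation $i$.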

머신러닝 알고리즘 기반의 의료비 예측 모델 개발 (Development of Medical Cost Prediction Model Based on the Machine Learning Algorithm)

  • Han Bi KIM;Dong Hoon HAN
    • Journal of Korea Artificial Intelligence Association
    • /
    • Vol. 1, No. 1
    • /
    • pp.11-16
    • /
    • 2023
  • Accurate hospital case modeling and prediction are crucial for efficient healthcare. In this study, we demonstrate the implementation of regression analysis methods in machine learning systems using mathematical statistics and machine learning techniques. The developed machine learning models include Bayesian linear, artificial neural network, decision tree, decision forest, and linear regression analysis models. By applying these algorithms, the corresponding regression models were constructed and analyzed. The results suggest the potential of machine learning systems for medical research. The experiment aimed to create an Azure Machine Learning Studio tool for the speedy evaluation of multiple regression models. The tool facilitates the comparison of 5 types of regression models in a unified experiment and presents assessment results with performance metrics. Evaluation of the regression machine learning models highlighted the advantages of boosted decision tree regression and decision forest regression in hospital case prediction. These findings could lay the groundwork for the deliberate development of new directions in medical data processing and decision making. Furthermore, potential avenues for future research include methods such as clustering, classification, and anomaly detection in healthcare systems.

2급/ 5급 와동 복합레진 수복 술식에 대한 남녀 치과 의사의 비교 (COMPARISON OF OPERATIVE TECHNIQUES BETWEEN FEMALE AND MALE DENTISTS IN CLASS 2 AND CLASS 5 RESIN COMPOSITE RESTORATIONS)

  • 장주혜;김혜영;손호현
    • Restorative Dentistry and Endodontics
    • /
    • Vol. 35, No. 2
    • /
    • pp.116-124
    • /
    • 2010
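  • This study compared operative techniques by dentist gender in direct restorations of Class 2 and Class 5 cavities with resin composite. An e-mail survey was sent to 12,193 dentists registered with the Korean Dental Association in 2008. Of the 2,632 dentists whose receipt of the e-mail was confirmed, 840 responded, and the gender ratio of the respondents (male 78.9%, female 21.1%) did not differ significantly from that of all dentists (p > 0.05). The chi-square test and multiple logistic regression analysis were used to test for differences in technique between male and female dentists. In Class 2 restorations, the tendency of female dentists to place four or more increments in layered filling was 1.87 times higher than that of male dentists, and their tendency to spend 30 minutes or more per procedure was 2.72 times higher (p < 0.05). In Class 5 restorations, the tendency of female dentists to use a base was 1.83 times higher, and their tendency to spend 20 minutes or more per procedure was 1.63 times higher (p < 0.05). According to this survey, operative techniques for resin composite restoration differ by gender.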

다중 선형 회귀 분석과 랜덤 포레스트를 이용한 SS, T-P 대리모니터링 기법 평가 (Evaluation of Surrogate Monitoring Parameters for SS and T-P Using Multiple Linear Regression and Random Forest)

  • 정민혁;범진아;최동호;김영주;허용구;윤광식
    • 한국농공학회논문집
    • /
    • Vol. 63, No. 2
    • /
    • pp.51-60
    • /
    • 2021
  • Effective nonpoint source (NPS) pollution management requires frequent water quality monitoring, which is often costly to implement in practice. Statistical techniques and machine learning methods make it possible to identify and focus on fundamental environmental variables that are closely related to the NPS pollutants of interest. This study developed surrogate models to predict the concentrations of suspended sediment (SS) and total phosphorus (T-P) from turbidity and runoff discharge rates using multiple linear regression (MLR) and random forest (RF) methods. The RF models provided acceptable performance in predicting SS and T-P, especially when runoff discharge rates were high, and they outperformed the MLR models in all cases. This finding highlights the potential of RF techniques and models as a tool for identifying fundamental environmental variables that can be measured relatively inexpensively or are freely available, yet still provide the information required to quantify the concentrations of NPS pollutants. The analysis of relative importance rates showed that the temporal variations of SS and T-P concentrations were explained more effectively by the variation of turbidity than by that of the runoff discharge rate. This study demonstrated that advanced statistical techniques such as machine learning can help improve the efficiency of NPS pollutant monitoring.