• Title/Summary/Keyword: Multiple regression model

Relationship between periodontal disease and stroke history in the geriatric population - Using logistic regression model with 3-step adjustment considering effect of confounder (Confounder를 고려한 3단계의 logistic regression model을 통한 노인인구에 있어서의 치주질환과 뇌경색 경험 유무와의 상관관계에 대한 연구)

  • Lee, Hyo-Jung
    • The Journal of the Korean dental association / v.44 no.10 s.449 / pp.658-670 / 2006
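  • Since the late 1980s, there have been attempts to explore the association between periodontal disease and ischemic stroke. The purpose of this study was to investigate and statistically analyze the relationship between periodontal disease and stroke history in elderly people aged 60 and over. The data were drawn from the Third National Health and Nutrition Examination Survey (NHANES III), a national survey of the United States population. Using an unadjusted logistic model, the study found a statistically significant association between the number of missing teeth and stroke history. After adjusting for age and smoking status, a multiple logistic model showed that the fewer the remaining teeth, the higher the likelihood of stroke. However, after adjusting for all of the important risk factors selected as common to both diseases, no statistical significance was found. Gingival recession, periodontal pocket depth, calculus, and bleeding on probing each showed a slight association with stroke history in individual comparisons, but consistent results could not be obtained across all of the statistical models.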

A Study on Prediction Model of Scaffold Appearance Defect Using Machine Learning (기계 학습을 이용한 인공지지체 외형 불량 예측 모델에 관한 연구)

  • Lee, Song-Yeon;Huh, Yong Jeong
    • Journal of the Semiconductor & Display Technology / v.19 no.2 / pp.26-30 / 2020
  • In this paper, we studied the problem of the number of experiments required to identify defects in scaffolds. To predict defects when printing a disk-type scaffold with an FDM 3D printer, each of the five print factors must be varied, so the number of scaffold prints exceeds 100,000. This number of experiments is difficult to perform in the field. To solve this problem, we produced a prediction model based on multiple linear regression, a machine learning method, using print conditions and the defect data recorded for those conditions. The prediction model was verified through experiments. The verification confirmed that the error was less than 0.5%, which is within the target margin of error of 5%.
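
The multiple linear regression approach described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' model: the five factor names, the synthetic data, and the coefficients are all assumptions made up for the example, since the abstract does not list them.

```python
import numpy as np

# Hypothetical print factors (assumed names; the abstract does not list them).
FACTORS = ["nozzle_temp", "bed_temp", "print_speed", "layer_height", "flow_rate"]


def fit_multiple_linear_regression(X, y):
    """Return [intercept, b1, ..., bk] estimated by ordinary least squares."""
    design = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict(coef, X):
    """Predict the response for rows of X from fitted coefficients."""
    return coef[0] + np.asarray(X) @ coef[1:]


# Synthetic data standing in for measured scaffold defects (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, len(FACTORS)))
true_coef = np.array([2.0, 0.5, -1.0, 0.3, 1.5, -0.7])  # made-up values
y = true_coef[0] + X @ true_coef[1:] + rng.normal(0.0, 0.01, size=200)

coef = fit_multiple_linear_regression(X, y)
relative_error = np.abs(predict(coef, X) - y) / np.abs(y)
print("estimated coefficients:", np.round(coef, 3))
print("mean relative error (%):", round(100 * relative_error.mean(), 3))
```

Once fitted, such a model can score untested combinations of print conditions with `predict`, which is what makes it possible to avoid printing every combination.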

A Generation and Accuracy Evaluation of Common Metadata Prediction Model Using Public Bicycle Data and Imputation Method

  • Kim, Jong-Chan;Jung, Se-Hoon
    • Journal of Korea Multimedia Society / v.25 no.2 / pp.287-296 / 2022
  • Today, air pollution is becoming a severe issue worldwide, and various policies are being implemented to address environmental pollution. In major cities, public bicycles are installed and operated to reduce pollution and ease transportation problems, and operational information is collected in real time. However, research using public bicycle operation data has made little progress. This study uses the daily weather data of Korea Meteorological Agency and the real-time air pollution data of Korea Environment Corporation to predict the number of daily bicycle rentals. Cross-validation, principal component analysis, and multiple regression analysis were used to determine the independent variables of the predictive model. The study then selected the variables that satisfied the significance level, constructed a model, predicted the number of daily bicycle rentals, and measured the accuracy.

GA-optimized Support Vector Regression for an Improved Emotional State Estimation Model

  • Ahn, Hyunchul;Kim, Seongjin;Kim, Jae Kyeong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.6 / pp.2056-2069 / 2014
  • In order to implement interactive and personalized Web services properly, it is necessary to understand the tangible and intangible responses of the users and to recognize their emotional states. Recently, some studies have attempted to build emotional state estimation models based on facial expressions. Most of these studies have applied multiple regression analysis (MRA), artificial neural network (ANN), and support vector regression (SVR) as the prediction algorithm, but the prediction accuracies have been relatively low. In order to improve the prediction performance of the emotion prediction model, we propose a novel SVR model that is optimized using a genetic algorithm (GA). Our proposed algorithm, GASVR, is designed to optimize the kernel parameters and the feature subsets of SVRs in order to predict the levels of two aspects of the users' emotions, valence and arousal. In order to validate the usefulness of GASVR, we collected a real-world data set of facial responses and emotional states via a survey. We applied GASVR and other algorithms, including MRA, ANN, and conventional SVR, to the data set. Finally, we found that GASVR outperformed all of the comparative algorithms in the prediction of the valence and arousal levels.

Validation of Nursing Care Sensitive Outcomes related to Knowledge (지식에 관한 간호결과도구의 타당성 조사)

  • 이은주
    • Journal of Korean Academy of Nursing / v.33 no.5 / pp.625-632 / 2003
  • Purpose: The purpose of this study was to assess the importance and sensitivity to nursing interventions of four nursing-sensitive outcomes selected from the Nursing Outcomes Classification (NOC). Outcomes for this study were 'Knowledge: Diet', 'Knowledge: Disease Process', 'Knowledge: Energy Conservation', and 'Knowledge: Health Behaviors'. Method: Data were collected from 183 nurses working in 2 university hospitals. The Fehring method was used to estimate the content validity and sensitivity validity of the outcomes and their indicators. Multiple and stepwise regression were used to evaluate relationships between each outcome and its indicators. Result: Results confirmed the importance and nursing sensitivity of the outcomes and their indicators. Key indicators of each outcome were found by multiple regression. Adding new indicators was suggested for 'Knowledge: Diet' because the variance explained by its indicators was relatively low. Not all of the indicators selected for the stepwise regression model were rated highly by the Fehring method. The R² statistics of the stepwise regression models were between 18 and 63% in importance by selected indicators and between 34 and 68% in contribution by selected indicators. Conclusion: This study refined what outcomes and indicators will be useful in clinical practice. Further research will be required for the revision of the outcomes and indicators of NOC.

Machine learning-based Fine Dust Prediction Model using Meteorological data and Fine Dust data (기상 데이터와 미세먼지 데이터를 활용한 머신러닝 기반 미세먼지 예측 모형)

  • KIM, Hye-Lim;MOON, Tae-Heon
    • Journal of the Korean Association of Geographic Information Studies / v.24 no.1 / pp.92-111 / 2021
  • Because fine dust adversely affects health, industry, and the economy, the public is sensitive to it. If the occurrence of fine dust can be predicted, countermeasures can be prepared in advance, which benefits both daily life and the economy. Fine dust is affected by the weather and by the concentration of fine dust emission sources. The industrial sector produces the largest share of fine dust emissions, and factories in industrial complexes are major emission sources. This study targets regions with old industrial complexes in local cities. The purpose of this study is to explore the factors that cause fine dust and to develop a model that can predict its occurrence. Weather data and fine dust data were used, and variables that influence the generation of fine dust were extracted through multiple regression analysis. Based on the results of the multiple regression analysis, a model with high predictive power was obtained by training machine learning regression models. The performance of the model was confirmed using test data. The models with high predictive power were the linear regression model, the Gaussian process regression model, and the support vector machine. Predictive power was not proportional to the share of data used for training. In addition, the average difference between the predicted and measured values was not large, but predictive power decreased when the measured value was high. The results of this study can be developed into a more systematic and precise fine dust prediction service by combining meteorological data and urban big data through local government data hubs. Lastly, it will be an opportunity to promote the development of smart industrial complexes.

A study on estimation of lowflow indices in ungauged basin using multiple regression (다중회귀분석을 이용한 미계측 유역의 갈수지수 산정에 관한 연구)

  • Lim, Ga Kyun;Jeung, Se Jin;Kim, Byung Sik;Chae, Soo Kwon
    • Journal of Korea Water Resources Association / v.53 no.12 / pp.1193-1201 / 2020
  • This study aims to develop a regression model that estimates a low-flow index that can be applied to ungauged basins. Long-term runoff data for a total of 30 mid-sized basins in South Korea, provided by the National Integrated Water Management System (NIWMS), are used to calculate average low-flow, average minimum streamflow, and low-flow index duration and frequency. This information is used in a correlation analysis with 18 basin factors and 3 climate change factors to identify the basin area, average basin altitude, average basin slope, water system density, runoff curve number, annual evapotranspiration, and annual precipitation as the variables of the low-flow index regression model. This study evaluates the model's accuracy by using the root-mean-square error (RMSE) and the mean absolute error (MAE) for 10 ungauged, verified basins and compares the results with the previous model's low-flow calculations to determine the effectiveness of the newly developed model. Comparative analysis indicates that the new regression model yields improved average low-flow estimates, which is attributed to the consideration of varied basin and hydrologic factors during the new model's development.
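
The two accuracy measures named in this abstract, RMSE and MAE, can be computed as in the short sketch below. The observed and predicted low-flow values are hypothetical numbers chosen for illustration and are not taken from the paper.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between paired observations and predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean absolute error between paired observations and predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical low-flow values in m^3/s (made up for this example).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.0, 4.5]

print("RMSE:", rmse(observed, predicted))
print("MAE:", mae(observed, predicted))
```

RMSE penalizes large individual errors more heavily than MAE, so reporting both gives a sense of whether the error is spread evenly across basins or concentrated in a few.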

Development of Vehicular Load Model using Heavy Truck Weight Distribution (II) - Multiple Truck Effects and Model Development (중차량중량분포를 이용한 차량하중모형 개발(II) - 연행차량 효과 분석 및 모형 개발)

  • Hwang, Eui-Seung
    • KSCE Journal of Civil and Environmental Engineering Research / v.29 no.3A / pp.199-207 / 2009
  • In this paper, a new vehicular load model is developed for a reliability-based bridge design code. A rational load model and the statistical properties of loads are important for developing a reliability-based design code. In the previous paper, truck weight data collected at eight locations using WIM or BWIM systems were analyzed to calculate the maximum truck weights for a specified bridge lifetime. The probability distribution of the upper 20% of total truck weights was assumed to be Extreme Type I (Gumbel distribution), and the 100-year maximum weights were estimated by linear regression. In this study, the effects of the multiple presence of trucks are analyzed. The probability of multiple presence of trucks is estimated, and the corresponding multiple truck weights are calculated using the same probability distribution function as in the previous paper. A new vehicular live load model is proposed for span lengths from 10 m to 200 m. The new model is compared with the current Korean model and with various load models of other countries.
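
Fitting a Gumbel (Extreme Type I) distribution by linear regression, as mentioned in this abstract, is commonly done on probability paper: sorted values are regressed against the reduced variate z = -ln(-ln(p)). The sketch below shows that general technique only. The plotting-position formula, the synthetic weights, and the target probability are assumptions for illustration, and the paper's own procedure (including how it selects the upper 20% and maps a 100-year lifetime to a probability) is not reproduced here.

```python
import math
import random


def gumbel_reduced_variate(p):
    """Reduced variate z = -ln(-ln(p)) for non-exceedance probability p."""
    return -math.log(-math.log(p))


def fit_gumbel_by_regression(samples):
    """Fit x = location + scale * z by least squares on Gumbel probability paper."""
    xs = sorted(samples)
    n = len(xs)
    # Weibull plotting positions i / (n + 1): an assumed, commonly used choice.
    zs = [gumbel_reduced_variate(i / (n + 1)) for i in range(1, n + 1)]
    z_mean = sum(zs) / n
    x_mean = sum(xs) / n
    scale = (sum((z - z_mean) * (x - x_mean) for z, x in zip(zs, xs))
             / sum((z - z_mean) ** 2 for z in zs))
    location = x_mean - scale * z_mean
    return location, scale


def gumbel_quantile(location, scale, p):
    """Value with non-exceedance probability p under the fitted Gumbel law."""
    return location + scale * gumbel_reduced_variate(p)


# Synthetic truck weights (tonnes) drawn from a Gumbel law; values are made up.
random.seed(1)
true_location, true_scale = 40.0, 3.0
weights = [
    true_location
    + true_scale * gumbel_reduced_variate(random.uniform(1e-9, 1.0 - 1e-9))
    for _ in range(2000)
]

location, scale = fit_gumbel_by_regression(weights)
# Hypothetical target probability; in practice it depends on traffic volume
# and the length of the design lifetime.
p_target = 1.0 - 1e-6
print("location:", round(location, 2), "scale:", round(scale, 2))
print("extrapolated weight:", round(gumbel_quantile(location, scale, p_target), 2))
```

Because the fitted relationship is a straight line in z, extrapolating to a rare event amounts to reading the line at a larger z, which is why the quality of the fit in the upper tail matters most.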

Relationship between Shear Strength and Component Content of Fault Cores (단층핵 구성물질의 함량과 전단강도 사이의 상관성 분석)

  • Yun, Hyun-Seok;Moon, Seong-Woo;Seo, Yong-Seok
    • Economic and Environmental Geology / v.52 no.1 / pp.65-79 / 2019
  • In this study, simple regression and multiple regression analyses were performed to analyze the relationship between breccia and clay content and shear strength in fault cores. The results of the simple regression analysis, performed for each rock type (andesitic rock, granite, and sedimentary rock) and three levels of normal stress ($\sigma_n = 54$, 108, and 162 kPa), reveal that the shear strength is proportional to breccia content and inversely proportional to clay content. Furthermore, as normal stress increases, the shear strength is influenced by the change in component content, correlating more strongly with clay content than with breccia content. In the multiple regression analysis, which considers both breccia and clay content, the shear strength is found to be more sensitive to the change in breccia content than to that of clay. As a result, the most suitable regression model for each rock type is proposed by comparing the coefficients of determination ($R^2$) estimated from the simple regression analysis with those from the multiple regression analysis. The proposed models show high coefficients of determination, with $R^2$ ranging from 0.624 to 0.830.
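
The comparison of coefficients of determination between simple and multiple regression described in this abstract can be illustrated with the sketch below. The breccia, clay, and shear strength values are synthetic and the coefficients are invented, so the printed numbers say nothing about the paper's results. Drawing the two contents independently is also a simplifying assumption.

```python
import numpy as np


def r_squared(X, y):
    """Coefficient of determination for an OLS fit with an intercept."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a single predictor as one column
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ coef
    return 1.0 - (residual @ residual) / np.sum((y - y.mean()) ** 2)


# Synthetic fault-core data (illustrative values only).
rng = np.random.default_rng(42)
breccia = rng.uniform(10.0, 60.0, size=50)   # breccia content, %
clay = rng.uniform(5.0, 40.0, size=50)       # clay content, %
shear = 30.0 + 0.8 * breccia - 0.5 * clay + rng.normal(0.0, 5.0, size=50)  # kPa

r2_breccia = r_squared(breccia, shear)                        # simple regression
r2_clay = r_squared(clay, shear)                              # simple regression
r2_both = r_squared(np.column_stack([breccia, clay]), shear)  # multiple regression

print("R^2 (breccia only):", round(r2_breccia, 3))
print("R^2 (clay only):", round(r2_clay, 3))
print("R^2 (breccia + clay):", round(r2_both, 3))
```

For least-squares fits on the same data, adding a predictor can never lower $R^2$, so a fair comparison between the simple and multiple models usually also considers how much the fit improves relative to the added complexity.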

Optimal fractions in terms of a prediction-oriented measure

  • Lee, Won-Woo
    • Journal of the Korean Statistical Society / v.22 no.2 / pp.209-217 / 1993
  • The multicollinearity problem in a multiple linear regression model may have deleterious effects on predictions. Thus, it is desirable to consider the optimal fractions with respect to the unbiased estimate of the mean squared errors of the predicted values. Interestingly, the optimal fractions can also be illuminated by the Bayesian interpretation of the general James-Stein estimators.
