• Title/Summary/Keyword: logistic regression model (로지스틱 회귀모델)

Association between Oral Health Status and Rheumatoid Arthritis (구강건강상태와 류마티스 관절염의 관련성)

  • Choi, Eun Sil; Cho, Han-A
    • Journal of dental hygiene science / v.15 no.5 / pp.612-619 / 2015
  • This study examined the association between oral health status and rheumatoid arthritis (RA). It used a nationally representative sample of Koreans aged 19 years and over (n=6,113) from the 2013 Korea National Health and Nutrition Examination Survey. The dependent variable was RA, and the independent variable was oral health status (periodontal status, number of missing teeth). The chi-square test and logistic regression analysis were used to identify the association between oral health status and RA. In the logistic regression analysis, the association between periodontal status and RA was not significant, whereas the association between missing teeth and RA was statistically significant. Compared with participants with 0~8 missing teeth, the odds ratio (OR) for RA was 3.03 (95% confidence interval [CI], 1.47~6.23) for those with 19~28 missing teeth and 2.08 (95% CI, 1.06~4.08) for those with 9~18 missing teeth. After adjustment for confounders (socio-demographic factors, health behaviors), the results of the logistic regression analysis were not significant. Among adults, more missing teeth were associated with a greater risk of RA. Promoting better oral hygiene and oral health would help reduce the risks associated with systemic diseases. Future studies are needed to examine the causal relationship between oral health status and RA in detail and in both directions.
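The odds ratios and confidence intervals reported in this abstract can be illustrated with a minimal sketch of an unadjusted OR and its Wald 95% CI from a 2x2 table. The counts below are hypothetical and are not taken from the study, and the function name `odds_ratio_ci` is an assumption introduced here. The study itself analyzed survey data with logistic regression, so this sketch ignores survey weights and confounder adjustment.

```python
import math


def odds_ratio_ci(exposed_cases, exposed_noncases, ref_cases, ref_noncases, z=1.96):
    """Return (OR, lower, upper) for a 2x2 table using the Wald interval on log(OR)."""
    odds_ratio = (exposed_cases * ref_noncases) / (exposed_noncases * ref_cases)
    # Standard error of log(OR): square root of the sum of reciprocal cell counts.
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / exposed_noncases + 1 / ref_cases + 1 / ref_noncases
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (NOT from the study):
# exposed group = many missing teeth, reference group = few missing teeth.
or_value, ci_lower, ci_upper = odds_ratio_ci(
    exposed_cases=12, exposed_noncases=388, ref_cases=40, ref_noncases=3960
)
print(f"OR = {or_value:.2f}, 95% CI {ci_lower:.2f}~{ci_upper:.2f}")
```

An interval that excludes 1 corresponds to a statistically significant association at the 5% level, which is how the reported intervals 1.47~6.23 and 1.06~4.08 are read.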

Reinforcing Method for the Protective Capacities of Dispersal and Combat Facilities using Logistic Regression (로지스틱 회귀모형을 활용한 소산 및 전투시설의 방호성능 보강방안 연구)

  • Park, Young Jun; Park, Sangjin; Yu, Yeong-Jin; Kim, Taehui; Son, Kiyoung
    • Journal of the Korea Institute of Building Construction / v.16 no.1 / pp.77-85 / 2016
  • This study provides a numerical model for assessing retrofit and strengthening levels for dispersal and combat facilities. First, explosion tests verified that direct-hitting projectiles are more destructive to the structures than close-falling bombs. The protective capacity of dispersal and combat facilities, modeled with soil uncertainty and structural field data, was then analyzed using the finite element method. A logistic regression model was derived from the structural survivability and facility data. This model could be used to determine the level of retrofit and strengthening for dispersal and combat facilities in contact areas. A more reliable model would require identifying more significant factors and adopting a non-linear model. In addition, to apply this model in the field, appropriate strengthening levels should be determined by hands-on staff responsible for military facilities.

A Study on the Drug Classification Using Machine Learning Techniques (머신러닝 기법을 이용한 약물 분류 방법 연구)

  • Anmol Kumar Singh; Ayush Kumar; Adya Singh; Akashika Anshum; Pradeep Kumar Mallick
    • Advanced Industrial SCIence / v.3 no.2 / pp.8-16 / 2024
  • This paper presents a drug classification system whose goal is to predict the appropriate drug for patients based on their demographic and physiological traits. The dataset consists of attributes such as Age, Sex, BP (blood pressure), Cholesterol level, and Na_to_K (sodium-to-potassium ratio), and the objective is to determine the type of drug given. The models used are K-Nearest Neighbors (KNN), Logistic Regression, and Random Forest. GridSearchCV with 5-fold cross-validation was used to fine-tune the hyperparameters, and each model was trained and tested on the dataset. Each model was assessed with and without hyperparameter tuning using accuracy, confusion matrices, and classification reports. The accuracies of the models were 0.7, 0.875, and 0.975 without GridSearchCV and 0.75, 1.0, and 0.975 with GridSearchCV. According to the GridSearchCV results, Logistic Regression is the most suitable of the three models for drug classification, followed by K-Nearest Neighbors. Na_to_K is also an essential feature in predicting the outcome.
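The tuning step described in this abstract, a grid search scored by 5-fold cross-validation, can be sketched in plain Python. This is a hand-rolled stand-in written for illustration, not scikit-learn's GridSearchCV and not the authors' code. The data are synthetic, and the function names, labels, and feature values are all assumptions.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Predict a label by majority vote among the k nearest training points."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def cv_accuracy(X, y, k, n_folds=5):
    """Mean accuracy of k-NN over n_folds cross-validation folds."""
    indices = list(range(len(X)))
    folds = [indices[i::n_folds] for i in range(n_folds)]  # strided split
    scores = []
    for fold in folds:
        held_out = set(fold)
        train_X = [X[i] for i in indices if i not in held_out]
        train_y = [y[i] for i in indices if i not in held_out]
        correct = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / n_folds


def grid_search(X, y, k_values, n_folds=5):
    """Try every candidate k and keep the one with the best CV accuracy."""
    results = {k: cv_accuracy(X, y, k, n_folds) for k in k_values}
    best_k = max(results, key=results.get)
    return best_k, results


# Synthetic two-class data (hypothetical; not the drug dataset from the paper).
rng = random.Random(0)
X, y = [], []
for _ in range(100):
    label = rng.choice(["A", "B"])
    center = 10.0 if label == "A" else 20.0
    X.append((rng.gauss(center, 3.0), rng.gauss(50.0, 10.0)))
    y.append(label)

best_k, results = grid_search(X, y, k_values=[1, 3, 5, 7])
print("best k:", best_k, "CV accuracy:", round(results[best_k], 3))
```

Because the search selects hyperparameters on the same data it scores, a separate held-out test set is still needed for an unbiased accuracy estimate.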

A Comparative Analysis of Risk Assessment Models for Asbestos Demolition (석면 해체 작업의 위험성평가모델 비교 분석)

  • Kim, Dong-Gyu; Kim, Min-Seung; Lee, Su-Min; Kim, Yu-Jin; Han, Seung-Woo
    • Proceedings of the Korean Institute of Building Construction Conference / 2022.11a / pp.99-100 / 2022
  • As the danger of asbestos exposure has become known, the importance of removing asbestos from existing buildings has grown. An extensive body of research has evaluated the risk of asbestos demolition, but the types of variables considered were limited because categorical information was not reflected and quantitative information was difficult to collect. This study therefore aims to derive a model that predicts risk at asbestos demolition workplaces using both categorical and continuous variables. These variables were collected from asbestos demolition reports, and the risk assessment score was set as the dependent variable. The influence of each variable was identified using logistic regression, and risk prediction methodologies were compared using decision tree regression and an artificial neural network. As a result, a conditional risk prediction model was derived for evaluating the risk of asbestos demolition, and this model is expected to help ensure the safety of asbestos demolition workers.

Landslide Susceptibility Mapping Using Ensemble FR and LR models at the Inje Area, Korea (FR과 LR 앙상블 모형을 이용한 산사태 취약성 지도 제작 및 검증)

  • Kim, Jin Soo; Park, So Young
    • Journal of Korean Society for Geospatial Information Science / v.25 no.1 / pp.19-27 / 2017
  • This research aimed to analyze landslide susceptibility and compare prediction accuracy using an ensemble of frequency ratio (FR) and logistic regression (LR) models in the Inje area, Korea. Landslide locations were identified from aerial photographs taken before and after landslide occurrence and were randomly divided into training (70%) and validation (30%) sets. The twelve landslide-related factors were elevation, slope, aspect, distance to drainage, topographic wetness index, stream power index, soil texture, soil thickness, timber age, timber diameter, timber density, and timber type. The spatial relationship between landslide occurrence and the landslide-related factors was analyzed using the FR and ensemble models. The resulting landslide susceptibility index (LSI) maps were validated and compared using the relative operating characteristic (ROC) curve. The prediction accuracy of the ensemble LSI map was about 2% higher than that of the FR LSI map. The LSI map produced in this research could be used for land use planning and for mitigating damage caused by disasters.
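The frequency ratio named in this abstract is commonly defined, for each class of a factor, as the share of landslide cells falling in that class divided by the share of all cells falling in that class. A minimal sketch of that common definition follows. The grid cells, class labels, and function name are hypothetical, and the paper's exact procedure (including how FR is combined with LR in the ensemble) is not given in the abstract.

```python
from collections import Counter


def frequency_ratio(cell_classes, is_landslide):
    """Return {class: FR} where FR = (% of landslide cells in class) / (% of all cells in class)."""
    total_cells = len(cell_classes)
    total_landslides = sum(is_landslide)
    cells_per_class = Counter(cell_classes)
    landslides_per_class = Counter(
        cls for cls, slid in zip(cell_classes, is_landslide) if slid
    )
    return {
        cls: (landslides_per_class[cls] / total_landslides)
        / (cells_per_class[cls] / total_cells)
        for cls in cells_per_class
    }


# Hypothetical 100-cell grid with one factor (slope class); not data from the paper.
slope_class = ["0-15"] * 50 + ["15-30"] * 30 + ["30+"] * 20
landslide = (
    [False] * 48 + [True] * 2      # 2 landslide cells on gentle slopes
    + [False] * 26 + [True] * 4    # 4 on moderate slopes
    + [False] * 16 + [True] * 4    # 4 on steep slopes
)

fr = frequency_ratio(slope_class, landslide)
for cls, value in fr.items():
    print(cls, round(value, 3))
```

An FR above 1 means landslides are over-represented in that class relative to its area. A susceptibility index for a cell is then typically obtained by summing the FR values of the classes the cell belongs to across all factors.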

Machine Learning Methods to Predict Vehicle Fuel Consumption

  • Ko, Kwangho
    • Journal of the Korea Society of Computer and Information / v.27 no.9 / pp.13-20 / 2022
  • This study proposes and analyzes machine learning (ML) models for predicting vehicle fuel consumption (FC) in real time. A car was test-driven to measure vehicle speed, acceleration, road gradient, and FC for the training dataset. Various ML models were trained with speed, acceleration, and road gradient as features and FC as the target. Two kinds of ML models were studied: regression models (linear regression and k-nearest neighbors regression) and classification models (k-nearest neighbors classifier, logistic regression, decision tree, random forest, and gradient boosting). The prediction accuracy for real-time FC is low, in the range of 0.5 ~ 0.6, and the classification models are more accurate than the regression models. The prediction error for total FC is very low, about 0.2 ~ 2.0%, and the regression models are more accurate than the classification models. This is because the accuracy score of the regression models is the coefficient of determination (R2), and predicted values are distributed along the mean of the targets as the coefficient decreases. Therefore, regression models suit total FC prediction and classification models suit real-time FC prediction.

High-Efficiency Homomorphic Encryption Techniques for Privacy-Preserving Data Learning (프라이버시 보존 데이터 학습을 위한 고효율 동형 암호 기법)

  • Hye Yeon Shim; Yu-Ran Jeon; Il-Gu Lee
    • Proceedings of the Korea Information Processing Society Conference / 2024.05a / pp.419-422 / 2024
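  • With recent advances in artificial intelligence, services that combine machine learning and big data have increased, and the risk of personal information leakage from indiscriminate data collection and training has grown. Technologies that can perform machine learning while protecting privacy have therefore become important. Homomorphic encryption is one way to perform machine learning while keeping data subjects' personal information confidential. However, because ciphertext size and the noise in computation results grow in proportion to plaintext size, homomorphic encryption reduces the prediction accuracy of machine learning models and lengthens training time. This paper proposes a technique for training a logistic regression model on a dataset encrypted with partially homomorphic encryption. According to the experimental results, the proposed technique improved prediction accuracy by 59.4% and training time by 63.6% compared with the conventional technique.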
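To illustrate what a partially homomorphic scheme allows, the sketch below uses a toy Paillier cryptosystem, which is additively homomorphic. The abstract does not name the scheme the authors used, so Paillier is an assumption chosen for illustration, and the tiny primes make this insecure. The sketch computes only the linear score w·x of a logistic regression model on encrypted features with non-negative integer weights. The sigmoid and the training loop are outside what additive homomorphism alone provides.

```python
import math
import random


def keygen(p=1789, q=1861):
    """Toy Paillier key pair from two small primes (insecure; illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)  # modular inverse (Python 3.8+)
    return (n, g), (lam, mu)


def encrypt(pub, message, rng):
    """Encrypt an integer 0 <= message < n with fresh randomness r."""
    n, g = pub
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, message, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(pub, priv, ciphertext):
    n, _ = pub
    lam, mu = priv
    return ((pow(ciphertext, lam, n * n) - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)


def scale_encrypted(pub, c, k):
    """Raising a ciphertext to a plaintext power k multiplies the plaintext by k."""
    n, _ = pub
    return pow(c, k, n * n)


rng = random.Random(42)
pub, priv = keygen()

# Data owner encrypts integer features; the server holds plaintext integer weights.
features = [4, 5, 6]
weights = [3, 1, 2]
enc_features = [encrypt(pub, x, rng) for x in features]

# Server computes Enc(w . x) without ever seeing the features.
enc_score = encrypt(pub, 0, rng)
for w, c in zip(weights, enc_features):
    enc_score = add_encrypted(pub, enc_score, scale_encrypted(pub, c, w))

score = decrypt(pub, priv, enc_score)
print("decrypted linear score:", score)
```

Real-valued features and weights would need fixed-point encoding into integers, and the key size would need to be far larger in practice.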

Regression Models for Determining the Patent Royalty Rates using Infringement Damage Awards and Inter-Partes Review Cases (손해배상액과 무효심판 판례를 이용한 특허 로열티율 산정 회귀모형)

  • Yang, Dong Hong; Kang, Gunseog; Kim, Sung-Chul
    • The Journal of Society for e-Business Studies / v.23 no.1 / pp.47-63 / 2018
  • This study proposes quantitative models for calculating the royalty rate, an important input to the relief-from-royalty method, which combines characteristics of the income approach and the market approach commonly used in valuing intangible assets. First, a royalty rate regression model was built from patent infringement damages cases based on royalties, using the royalty rate as the dependent variable and the patent indexes of the corresponding patent right as independent variables. Then, a logistic regression model was constructed from inter-partes review cases, using the not-unpatentable result as the dependent variable and the patent indexes of the corresponding patent right as independent variables. A final royalty rate was calculated by matching the royalty rate from the royalty rate regression model with the not-unpatentable probability from the logistic regression model. The suggested royalty rate was compared with royalty rates obtained by traditional methods to check its reliability.

Determinants of Leverage for Manufacturing Firms Listed in the KOSDAQ Stock Market (한국 KOSDAQ 상장기업들의 자본구조 결정요인 분석)

  • Kim, Han-Joon
    • Journal of the Korea Academia-Industrial cooperation Society / v.13 no.5 / pp.2096-2109 / 2012
  • This study investigates empirical issues that have received little attention in previous research on the Korean capital market. It seeks to identify the financial determinants of capital structure for firms listed on the KOSDAQ (Korea Securities Dealers Automated Quotation). A further test uses a robust methodology to find factors that may discriminate, in financial terms, between firms belonging to the 'prime section' and those belonging to the 'venture section'. The null hypothesis that the movement of a firm's capital structure relative to its industry mean (or median) is random is also tested. Size (INSIZE), growth (GROWTH), market-to-book value of equity (MVBV), volatility (VOLATILITY), market value of equity (MVE), and the section dummy (SECTION) had statistically significant effects on the book-value based leverage ratios, while size (INSIZE), growth (GROWTH), market value of equity (MVE), beta (BETA), and the section dummy (SECTION) had statistically significant effects on the market-value based leverage ratios. The study also found that firms in each industry tended to revert toward the industry's mean and median leverage ratios over the five-year test period.

Comparison of Korean Classification Models' Korean Essay Score Range Prediction Performance (한국어 학습 모델별 한국어 쓰기 답안지 점수 구간 예측 성능 비교)

  • Cho, Heeryon; Im, Hyeonyeol; Yi, Yumi; Cha, Junwoo
    • KIPS Transactions on Software and Data Engineering / v.11 no.3 / pp.133-140 / 2022
  • We investigate the performance of deep learning-based Korean language models on the task of predicting the score range of Korean essays written by foreign students. We construct a data set containing a total of 304 essays, which discuss the criteria for choosing a job ('job'), conditions of a happy life ('happ'), the relationship between money and happiness ('econ'), and the definition of success ('succ'). These essays were labeled with four letter grades (A, B, C, and D), and a total of eleven essay score range prediction experiments were conducted (i.e., five for predicting the score range of 'job' essays, five for predicting the score range of 'happ' essays, and one for predicting the score range of mixed-topic essays). Three deep learning-based Korean language models, KoBERT, KcBERT, and KR-BERT, were fine-tuned using various training data. In addition, two traditional probabilistic machine learning classifiers, naive Bayes and logistic regression, were evaluated. Experimental results show that the deep learning-based Korean language models performed better than the two traditional classifiers, with KR-BERT performing best at 55.83% overall average prediction accuracy. A close second was KcBERT (55.77%), followed by KoBERT (54.91%). The naive Bayes and logistic regression classifiers achieved 52.52% and 50.28%, respectively. Because of the scarcity of training data and the imbalance in class distribution, overall prediction performance was not high for any classifier. Moreover, the classifiers' vocabulary did not explicitly capture the error features that would help in correctly grading the Korean essays. We expect score range prediction performance to improve once these two limitations are overcome.