• Title/Abstract/Keywords: decision tree

Search results: 1,615 items; processing time: 0.028 seconds

An Application of Support Vector Machines to Personal Credit Scoring: Focusing on Financial Institutions in China

  • 딩쉬엔저;이영찬
    • 산업융합연구 / Vol. 16, No. 4 / pp.33-46 / 2018
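  • Personal credit scoring is an effective tool that helps banks make profitable decisions when approving loans. Many classification algorithms and models have recently been applied to personal credit scoring. Credit scoring techniques are broadly divided into statistical and non-statistical methods. Statistical methods include linear regression, discriminant analysis, logistic regression, and decision trees. Non-statistical methods include linear programming, neural networks, genetic algorithms, and Support Vector Machines. However, it is difficult to reach a consistent conclusion about which method is best for developing a credit scoring model. This paper uses personal credit data from a Chinese financial institution to compare the performance of the most representative credit scoring techniques: logistic regression, neural networks, and Support Vector Machines. Specifically, each of the three models was built to classify customers, and the results were compared. According to the analysis, Support Vector Machines outperformed both logistic regression and neural networks.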

Decision based uncertainty model to predict rockburst in underground engineering structures using gradient boosting algorithms

  • Kidega, Richard;Ondiaka, Mary Nelima;Maina, Duncan;Jonah, Kiptanui Arap Too;Kamran, Muhammad
    • Geomechanics and Engineering / Vol. 30, No. 3 / pp.259-272 / 2022
  • Rockburst is a dynamic, multivariate, and non-linear phenomenon that occurs in underground mining and civil engineering structures. Predicting rockburst is challenging since conventional models are not standardized. Hence, machine learning techniques would improve the prediction accuracies. This study describes decision based uncertainty models to predict rockburst in underground engineering structures using gradient boosting algorithms (GBM). The model input variables were uniaxial compressive strength (UCS), uniaxial tensile strength (UTS), maximum tangential stress (MTS), excavation depth (D), stress ratio (SR), and brittleness coefficient (BC). Several models were trained using different combinations of the input variables and a 3-fold cross-validation resampling procedure. The hyperparameters comprising learning rate, number of boosting iterations, tree depth, and number of minimum observations were tuned to attain the optimum models. The performance of the models was tested using classification accuracy, Cohen's kappa coefficient (k), sensitivity and specificity. The best-performing model showed a classification accuracy, k, sensitivity and specificity values of 98%, 93%, 1.00 and 0.957 respectively by optimizing model ROC metrics. The most and least influential input variables were MTS and BC, respectively. The partial dependence plots revealed the relationship between the changes in the input variables and model predictions. The findings reveal that GBM can be used to anticipate rockburst and guide decisions about support requirements before mining development.
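The evaluation measures named in the abstract above (classification accuracy, Cohen's kappa, sensitivity, and specificity) can all be derived from a confusion matrix. The following is a minimal sketch that assumes a binary outcome. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute common classification metrics from binary confusion-matrix counts.

    tp, fp, fn, tn are hypothetical counts of true positives, false positives,
    false negatives, and true negatives.
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_yes = ((tp + fp) / n) * ((tp + fn) / n)  # chance agreement on "positive"
    p_no = ((fn + tn) / n) * ((fp + tn) / n)   # chance agreement on "negative"
    p_expected = p_yes + p_no
    kappa = (accuracy - p_expected) / (1 - p_expected)

    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


# Illustrative counts only (not data from the paper).
metrics = binary_metrics(tp=40, fp=5, fn=10, tn=45)
print(metrics)
```

Kappa is reported alongside accuracy because it discounts the agreement a classifier would reach by chance, which matters when the classes are imbalanced.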

Development of a Model for Calculating the Negligence Ratio Using Traffic Accident Information

  • 한음;박기옥;강희진;이요셉;윤일수
    • 한국ITS학회 논문지 / Vol. 21, No. 6 / pp.36-56 / 2022
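  • For traffic accidents in Korea, the negligence ratio is calculated according to the "Standards for Recognition of Negligence Ratios in Automobile Accidents" (「자동차사고 과실비율 인정기준」) prepared by the General Insurance Association of Korea, and insurer settlements or court rulings are made on that basis. However, disputes over the calculated negligence ratio are frequent. If the traffic accident information recorded by police officers when an accident occurs could be used to quickly identify the accident type defined in the Standards, a more effective response would be possible. This study therefore aims to develop a model that is trained on police-recorded traffic accident information and classifies accidents into the accident types presented in the Standards. In particular, data mining was used to extract, from the National Police Agency's traffic accident data, the keywords needed to classify accidents into those types. The keywords were then used to train decision tree and random forest models, yielding a model that derives the traffic accident type.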

IoT Enabled Intelligent System for Radiation Monitoring and Warning Approach using Machine Learning

  • Muhammad Saifullah;Imran Sarwar Bajwa;Muhammad Ibrahim;Mutyyba Asgher
    • International Journal of Computer Science & Network Security / Vol. 23, No. 5 / pp.135-147 / 2023
  • The Internet of Things (IoT) has revolutionized every field of life through the use of artificial intelligence, particularly machine learning. It is being used successfully for radiation monitoring and for the prediction of ultraviolet and electromagnetic rays. However, no dedicated system is available that can monitor and detect these waves. The present study therefore developed an IoT-enabled intelligent system based on machine learning for predicting radiation and its effects on human beings. A sensor-based system was installed to detect harmful radiation in the environment; it alerts people within the danger zone with a buzzer so that they can move to a safer place. Alongside this automatic sensor system, the authors created their own dataset of recorded sensor values. To study the effects of these rays, the researchers used Support Vector Machine, Gaussian Naïve Bayes, Decision Trees, Extra Trees, Bagging Classifier, Random Forests, Logistic Regression, and Adaptive Boosting Classifier. The results show high accuracy and indicate that the proposed system is reliable and accurate for detecting and monitoring these waves. For outcome prediction, the Adaptive Boosting Classifier achieved the best accuracy, 81.77%, compared with the other classifiers.

Determination of the stage and grade of periodontitis according to the current classification of periodontal and peri-implant diseases and conditions (2018) using machine learning algorithms

  • Kubra Ertas;Ihsan Pence;Melike Siseci Cesmeli;Zuhal Yetkin Ay
    • Journal of Periodontal and Implant Science / Vol. 53, No. 1 / pp.38-53 / 2023
  • Purpose: The current Classification of Periodontal and Peri-Implant Diseases and Conditions, published and disseminated in 2018, involves some difficulties and causes diagnostic conflicts due to its criteria, especially for inexperienced clinicians. The aim of this study was to design a decision system based on machine learning algorithms by using clinical measurements and radiographic images in order to determine and facilitate the staging and grading of periodontitis. Methods: In the first part of this study, machine learning models were created using the Python programming language based on clinical data from 144 individuals who presented to the Department of Periodontology, Faculty of Dentistry, Süleyman Demirel University. In the second part, panoramic radiographic images were processed and classification was carried out with deep learning algorithms. Results: Using clinical data, the accuracy of staging with the tree algorithm reached 97.2%, while the random forest and k-nearest neighbor algorithms reached 98.6% accuracy. The best staging accuracy for processing panoramic radiographic images was provided by a hybrid network model algorithm combining the proposed ResNet50 architecture and the support vector machine algorithm. For this, the images were preprocessed, and high success was obtained, with a classification accuracy of 88.2% for staging. However, in general, it was observed that the radiographic images provided a low level of success, in terms of accuracy, for modeling the grading of periodontitis. Conclusions: The machine learning-based decision system presented herein can facilitate periodontal diagnoses despite its current limitations. Further studies are planned to optimize the algorithm and improve the results.

Cost-Effectiveness Analysis of Home-Based Hospice-Palliative Care for Terminal Cancer Patients

  • Kim, Ye-seul;Han, Euna;Lee, Jae-woo;Kang, Hee-Taik
    • Journal of Hospice and Palliative Care / Vol. 25, No. 2 / pp.76-84 / 2022
  • Purpose: We compared cost-effectiveness parameters between inpatient and home-based hospice-palliative care services for terminal cancer patients in Korea. Methods: A decision-analytic Markov model was used to compare the cost-effectiveness of hospice-palliative care in an inpatient unit (inpatient-start group) and at home (home-start group). The model adopted a healthcare system perspective, with a 9-week horizon and a 1-week cycle length. The transition probabilities were calculated based on the reports from the Korean National Cancer Center in 2017 and Health Insurance Review & Assessment Service in 2020. Quality of life (QOL) was converted to the quality-adjusted life week (QALW). Modeling and cost-effectiveness analysis were performed with TreeAge software. The weekly medical cost was estimated to be 2,481,479 Korean won (KRW) for inpatient hospice-palliative care and 225,688 KRW for home-based hospice-palliative care. One-way sensitivity analysis was used to assess the impact of different scenarios and assumptions on the model results. Results: Compared with the inpatient-start group, the incremental cost of the home-start group was 697,657 KRW, and the incremental effectiveness based on QOL was 0.88 QALW. The incremental cost-effectiveness ratio (ICER) of the home-start group was 796,476 KRW/QALW. Based on one-way sensitivity analyses, the ICER was predicted to increase to 1,626,988 KRW/QALW if the weekly cost of home-based hospice doubled, but it was estimated to decrease to -2,898,361 KRW/QALW if death rates at home doubled. Conclusion: Home-based hospice-palliative care may be more cost-effective than inpatient hospice-palliative care. Home-based hospice appears to be affordable even if the associated medical expenditures double.
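The incremental cost-effectiveness ratio (ICER) reported in the abstract above is the incremental cost divided by the incremental effectiveness. The following is a minimal sketch of that arithmetic using the rounded figures from the abstract. The function name `icer` is hypothetical, and the explanation of the small gap from the reported 796,476 KRW/QALW (rounding of the 0.88 QALW figure) is an assumption, not something the abstract states.

```python
def icer(delta_cost, delta_effect):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    if delta_effect == 0:
        raise ValueError("incremental effectiveness must be non-zero")
    return delta_cost / delta_effect


# Rounded figures quoted in the abstract.
incremental_cost_krw = 697_657     # home-start vs. inpatient-start group
incremental_effect_qalw = 0.88     # quality-adjusted life weeks (rounded)
reported_icer = 796_476            # KRW/QALW, as reported

# Ratio computed from the rounded inputs.
print(round(icer(incremental_cost_krw, incremental_effect_qalw)))

# Effectiveness implied by the reported ICER; assumed here to be the
# unrounded value behind the quoted 0.88 QALW.
implied_effect = incremental_cost_krw / reported_icer
print(round(implied_effect, 4))
```

A negative ICER, such as the one reported for the doubled home death rate scenario, means that incremental cost and incremental effectiveness have opposite signs, so the ratio alone does not indicate which option is preferable.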

Sentiment Analysis for COVID-19 Vaccine Popularity

  • Muhammad Saeed;Naeem Ahmed;Abid Mehmood;Muhammad Aftab;Rashid Amin;Shahid Kamal
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 17, No. 5 / pp.1377-1393 / 2023
  • Social media is used for various purposes, including entertainment, communication, information search, and voicing thoughts and concerns about a service, product, or issue. Social media data can be mined for information and insights. The World Health Organization has listed COVID-19 as a global epidemic since 2020. People from every walk of life, as well as the entire health system, have been severely impacted by this pandemic. Even now, almost three years after the pandemic declaration, the fear caused by the COVID-19 virus, which has led to higher levels of depression, stress, and anxiety, has not been fully overcome. This has also triggered numerous discussions covering various aspects of the pandemic on social media platforms. One of these aspects concerns the vaccines developed by different countries, their features, and the advantages and disadvantages associated with each vaccine. Social media users often share their thoughts about vaccinations and vaccines. This data can be used to determine the popularity levels of vaccines, which can give producers some insight for future decision making about their products. In this article, we used Twitter data for vaccine popularity detection. We gathered data by scraping tweets about various vaccines from different countries. Various machine learning and deep learning models, i.e., naive Bayes, decision tree, support vector machines, k-nearest neighbor, and a deep neural network, were then used for sentiment analysis to determine the popularity of each vaccine. The experimental results show that the proposed deep neural network model outperforms the other models, achieving 97.87% accuracy.

Real-time prediction on the slurry concentration of cutter suction dredgers using an ensemble learning algorithm

  • Han, Shuai;Li, Mingchao;Li, Heng;Tian, Huijing;Qin, Liang;Li, Jinfeng
    • International Conference Proceedings / The 8th International Conference on Construction Engineering and Project Management / pp.463-481 / 2020
  • Cutter suction dredgers (CSDs) are widely used in dredging works such as channel excavation, wharf construction, and reef construction. During CSD operation, the main task is to control the swing speed of the cutter so that the slurry concentration stays within a proper range. However, the slurry concentration cannot be monitored in real time, i.e., there is a "time-lag effect" in the slurry concentration log, which makes it difficult for operators to make optimal control decisions. To address this issue, this research proposes a scheme that uses indicators monitored in real time to predict the current slurry concentration. The characteristics of the CSD monitoring data are first studied, and a set of preprocessing methods is presented. The concept of an "index class" is then put forward to select the important indices. Finally, an ensemble learning algorithm is set up to fit the relationship between the slurry concentration and the indices of the index classes. In the experiment, log data covering seven days of an actual dredging project were collected. For comparison, the Deep Neural Network (DNN), Long Short-Term Memory (LSTM), Support Vector Machine (SVM), Random Forest (RF), Gradient Boosting Decision Tree (GBDT), and Bayesian Ridge algorithms were also tried. The results show that the proposed method performs best, with an R² of 0.886 and a mean square error (MSE) of 5.538. This research provides an effective way to predict the slurry concentration of CSDs in real time and can help improve the stability and production efficiency of dredging construction.

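The two regression measures quoted in the abstract above, R² and mean square error (MSE), can be computed as in the following minimal sketch. The function names and the sample values are hypothetical and are not data from the study.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Illustrative values only (not slurry concentration data from the paper).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print(mean_squared_error(observed, predicted))
print(r_squared(observed, predicted))
```

MSE is expressed in the squared units of the target variable, while R² is unitless, so the two values are usually read together rather than compared with each other.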

A preliminary study on the determination of drought stages at the local level

  • 이종소;전다은;윤현철;감종훈;이상은
    • 한국수자원학회논문집 / Vol. 56, No. 12 / pp.929-937 / 2023
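  • This study was conducted to develop rules for determining drought stages at the local level based on drought severity, drawing on the 2022-2023 drought in the Gwangju and Jeonnam region. Of the eight drought indicators published nationwide at the city and county level, six showed a statistical correlation with the perceptions of responsible officials and experts: the agricultural water (paddy) drought stage, the domestic and industrial water drought stage, SPI-12, the agricultural reservoir storage rate, the rate of change in household water use relative to normal years, and the rate of change in non-household water use relative to normal years. These drought indicators were then applied to a decision tree algorithm to derive rules for judging drought severity. The rules gave results similar to the existing method proposed in previous studies, but showed a considerable comparative advantage in explaining the spatiotemporal patterns observed in the Gwangju and Jeonnam drought.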

Development of Needs Extraction Algorithm Fitting for Individuals in Care Management for the Elderly in Home

  • 김영숙;정국인;박소라
    • 한국사회복지학 / Vol. 60, No. 1 / pp.187-209 / 2008
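  • The authors previously developed an assessment tool covering 28 needs for the integrated, needs-centered assessment that is the core element of care management for the elderly living at home. In this follow-up study, the tool was used to collect assessment data on 676 home-dwelling elderly people from 120 institutions affiliated with the national association of senior welfare centers, and decision tree analysis, a data mining technique, was applied to develop a needs extraction algorithm for providing social welfare services suited to each need. The needs extraction algorithms for the 28 needs are summarized in <Table 3>. Taking the decision model for need no. 8, "I want help when going out," as an example, complaint no. 23 was the primary splitting variable: the need was present in 80.3% of cases when help with outdoor mobility was requested and in 11.4% when it was not. When the user complained about outdoor mobility and had a caregiver, the need rose to 87.9%, whereas for users without a caregiver it fell to 47.4%. When the elderly person requested outdoor mobility support, had a caregiver, and needed full assistance with cleaning, the need for outdoor mobility help was 94.2%. Even when the user did not request help with outdoor mobility, however, the need for help going out rose sharply from 11.4% to 80.0% if the user reported needing full assistance with bathing among the ADL items. When ADL bathing was rated as partial assistance or fully independent, the probability of being classified as needing help going out was low, at 7.7%. For this decision model, a maximum tree depth of 5 levels was used as the stopping rule, and the minimum numbers of cases in parent and child nodes were set to 50 and 25, respectively. With these settings, the model for the need "I want help when going out" is reported to achieve 182.13% effective decision-making. The algorithm presented in this study can serve as systematic, scientific baseline data for extracting the needs of the elderly living at home.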

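The decision model described in the abstract above for need no. 8 ("I want help when going out") can be read as a small set of nested rules. The following is a minimal sketch that encodes only the node percentages quoted in the abstract. The function and parameter names are hypothetical, and where the abstract does not report a rate for a branch, the sketch falls back to the nearest reported parent node, which is an assumption.

```python
from typing import Optional


def outing_help_need_rate(
    requests_help: bool,
    has_caregiver: Optional[bool] = None,
    cleaning_full_help: Optional[bool] = None,
    bathing_full_help: Optional[bool] = None,
) -> float:
    """Return the share (%) of cases with the need, per the quoted tree nodes.

    None means the attribute is unknown, so the rate of the last node
    reached is returned.
    """
    if requests_help:
        if has_caregiver is None:
            return 80.3  # requested outdoor mobility help
        if not has_caregiver:
            return 47.4  # requested help, no caregiver
        if cleaning_full_help:
            return 94.2  # requested help, caregiver, full help with cleaning
        # Leaf rate for this branch is not reported; parent node used (assumption).
        return 87.9
    if bathing_full_help is None:
        return 11.4  # did not request outdoor mobility help
    # Bathing: full assistance vs. partial assistance or fully independent.
    return 80.0 if bathing_full_help else 7.7


print(outing_help_need_rate(True, has_caregiver=True, cleaning_full_help=True))
print(outing_help_need_rate(False, bathing_full_help=True))
```

Writing the tree as explicit rules makes it easy to check each branch against the percentages in the text, but it does not reproduce the fitted model itself, which also depends on the stopping rules (maximum depth 5, minimum 50 cases per parent node and 25 per child node).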