• Title/Summary/Keyword: Decision Tree Technique

A Study on the Database Marketing using Data Mining in the Traditional Medicine (데이터마이닝을 활용한 한방분야에서의 데이터베이스 마케팅에 대한 연구)

  • Lee Sang-Young;Lee Yun-Seok
    • Journal of the Korea Society of Computer and Information
    • /
    • v.10 no.5 s.37
    • /
    • pp.271-280
    • /
    • 2005
  • This study elicits the factors that affect medical examinations in traditional medicine using the decision tree technique and characterizes patients using cluster analysis. It also draws results from an association analysis of disease types in the re-hospitalized patient group. The obtained results were analyzed for their effect on hospital profits. Thus, by applying data mining techniques to database marketing in traditional medicine, the characteristics of patient clients can be identified for the objective derivation of the factors that affect hospital profits. Practical application of database marketing as presented in this study will bring about fundamental efficiency and vitalization of hospital management.

Identification of shear transfer mechanisms in RC beams by using machine-learning technique

  • Zhang, Wei;Lee, Deuckhang;Ju, Hyunjin;Wang, Lei
    • Computers and Concrete
    • /
    • v.30 no.1
    • /
    • pp.43-74
    • /
    • 2022
  • Machine learning techniques have recently opened new opportunities to identify the complex shear transfer mechanisms of reinforced concrete (RC) beam members. This study employed 1224 shear test specimens to train decision tree-based machine learning (ML) programs, by which strong correlations between the shear capacity of RC beams and key input parameters were affirmed. In addition, the shear contributions of concrete and shear reinforcement (the so-called Vc and Vs) were identified by establishing three independent ML models trained under different strategies with various combinations of datasets. Detailed parametric studies were then conducted using the well-trained ML models. It appeared that the presence of shear reinforcement can make the predicted shear contribution from concrete in RC beams larger than the pure shear contribution of concrete, due to the intervention effect between shear reinforcement and concrete. The size effect also had a significant impact on the shear contribution of concrete (Vc), whereas the addition of shear reinforcement can effectively mitigate the size effect. It was also found that concrete tends to be the primary source of shear resistance when the shear span-depth ratio a/d<1.0, while shear reinforcement becomes the primary source of shear resistance when a/d>2.0.

Prediction Model for Gastric Cancer via Class Balancing Techniques

  • Danish, Jamil;Sellappan, Palaniappan;Sanjoy Kumar, Debnath;Muhammad, Naseem;Susama, Bagchi;Asiah, Lokman
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.1
    • /
    • pp.53-63
    • /
    • 2023
  • Many researchers are trying hard to minimize the incidence of cancers, mainly Gastric Cancer (GC). For GC, the five-year survival rate is generally 5-25%, but for Early Gastric Cancer (EGC), it is almost 90%. Predicting the onset of stomach cancer based on risk factors will allow for an early diagnosis and more effective treatment. Although there are several models for predicting stomach cancer, most of these models are based on unbalanced datasets, which favour the majority class. However, it is imperative to correctly identify cancer patients, who are in the minority class. This research aims to apply three class-balancing approaches to the NHS dataset before developing supervised learning strategies: Oversampling (Synthetic Minority Oversampling Technique or SMOTE), Undersampling (SpreadSubsample), and Hybrid System (SMOTE + SpreadSubsample). This study uses Naive Bayes, Bayesian Network, Random Forest, and Decision Tree (C4.5) methods. We measured these classifiers' efficacy using their Receiver Operating Characteristics (ROC) curves, sensitivity, and specificity. The validation data was used to test several ways of balancing the classifiers. The final prediction model was built on the one that did the best overall.
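
The abstract above evaluates classifiers by sensitivity and specificity. As a minimal sketch of how these two measures are usually computed from binary predictions, the following may help; the function name and the sample labels are hypothetical and are not taken from the paper.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = positive (minority) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    sensitivity = tp / (tp + fn)  # share of true positives that were detected
    specificity = tn / (tn + fp)  # share of true negatives that were correctly rejected
    return sensitivity, specificity


# Illustrative labels only (hypothetical, not from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

On an unbalanced dataset a classifier can reach high accuracy while sensitivity stays low, which is one reason the two measures are typically reported separately.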

Fundamental Research on the Development of a Risk Based Decision Support System for Maritime Accident Response: Focused on Oil Tanker Grounding (위험도기반 해양사고 초기대응 지원 시스템 개발 기초연구: 유조선 좌초사고를 중심으로)

  • Na, Seong;Lee, Seung-Hyun;Choi, Hyuek-Jin
    • Journal of Navigation and Port Research
    • /
    • v.40 no.6
    • /
    • pp.391-400
    • /
    • 2016
  • A number of maritime accidents, and accident response activities, including the command and control procedures that were implemented at accident scenes, are analyzed to derive useful information about responding to maritime accidents, and to understand how the chain of events developed after the initial accident. In this research, a new concept of a 'risk based accident response support system' is proposed. In order to identify the event chains and associated hazards related to the accident response activities, this study proposes a 'Brainstorming technique for scenario identification', based on the concept of the HAZID technique. A modified version of Event Tree Analysis was used for quantitative risk analysis of maritime accident response activities. PERT/CPM was used to analyze accident response activities and for calculating overall (expected) response activity completion time. Also, the risk based accident response support system proposed in this paper is explained using a simple case study of risk analysis for oil tanker grounding accident response.
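
The abstract above mentions PERT/CPM for calculating the expected completion time of response activities. A minimal sketch of that kind of calculation follows, assuming the classic three-point PERT estimate and a longest-path (critical path) computation; the activity names, durations, and dependencies are hypothetical and do not come from the paper.

```python
from functools import lru_cache

# Hypothetical activities: name -> (optimistic, most likely, pessimistic, predecessors).
# Durations are in hours and purely illustrative.
ACTIVITIES = {
    "assess_damage": (1.0, 2.0, 3.0, []),
    "deploy_boom": (2.0, 3.0, 10.0, ["assess_damage"]),
    "arrange_tug": (1.0, 4.0, 7.0, ["assess_damage"]),
    "transfer_cargo": (4.0, 6.0, 8.0, ["deploy_boom", "arrange_tug"]),
}


def expected_duration(optimistic, most_likely, pessimistic):
    """Classic PERT three-point estimate: (o + 4m + p) / 6."""
    return (optimistic + 4.0 * most_likely + pessimistic) / 6.0


def expected_completion_time(activities):
    """Length of the longest (critical) path, using expected activity durations."""

    @lru_cache(maxsize=None)
    def earliest_finish(name):
        o, m, p, predecessors = activities[name]
        # An activity starts once all of its predecessors have finished.
        start = max((earliest_finish(q) for q in predecessors), default=0.0)
        return start + expected_duration(o, m, p)

    return max(earliest_finish(name) for name in activities)


total_hours = expected_completion_time(ACTIVITIES)
```

The sketch assumes the dependency graph is acyclic; a cycle would cause unbounded recursion.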

The Factors that Affects the Employment Type of The Graduates by Data-mining Approach (데이터마이닝 기법을 활용한 대졸자 고용에 미치는 영향요인 분석)

  • Kim, Hyoung-Rae;Jeon, Do-Hong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.17 no.7
    • /
    • pp.167-174
    • /
    • 2012
  • Data mining techniques can be applied to employment information in order to discover valuable information in large datasets. As employment issues such as joblessness among college graduates and the recruitment of women and the elderly have become social problems, public employment services have made many efforts and many studies have been conducted. The factors that affect a college graduate's employment type (regular, temporary, daily) can be used to guide employment and to prepare college students for employment. When analyzing a large number of attributes and a huge number of data elements, conventional statistical methods reach their limits; therefore, data mining techniques are more suitable for a dataset of about 170 attributes and 20,000 elements. We divide the factors that may affect the employment type into personal, school, company, and experience factors; a decision tree algorithm is used to find interesting relationships between the attributes of these factors and the employment type. Personal factors such as parental income and marital status were the most influential factors for the employment type. The learned decision tree was able to classify the employment type with 87% accuracy. We also assume that the level of the school affects the employment type of the graduates.

Developing the high-risk drinking predictive model in Korea using the data mining technique (데이터마이닝 기법을 활용한 한국인의 고위험 음주 예측모형 개발 연구)

  • Park, Il-Su;Han, Jun-Tae
    • Journal of the Korean Data and Information Science Society
    • /
    • v.28 no.6
    • /
    • pp.1337-1348
    • /
    • 2017
  • In this paper, we develop a predictive model of high-risk drinking in Korea using cross-sectional data from the Korea Community Health Survey (2014). We perform logistic regression analysis, decision tree analysis, and neural network analysis as data mining techniques. The results of the logistic regression analysis showed that men in their forties had a high risk and that the risk was also high for office workers and sales workers. In particular, current smokers had a higher risk of high-risk drinking. Among the three models, neural network analysis and logistic regression performed best in terms of AUROC (area under the receiver operating characteristic curve). The high-risk drinking predictive model developed in this study and the method for selecting the high-risk intensive drinking group can be the basis for providing more effective health care services, such as hazardous drinking prevention education and improved drinking programs.
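
The abstract above compares models by AUROC. As a minimal sketch, AUROC can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name and sample data are hypothetical and not taken from the paper.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0  # positive ranked above negative
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


# Illustrative labels and predicted risk scores only (hypothetical).
area = auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

This pairwise form is quadratic in the number of cases, which is fine for illustration; a rank-based computation would be the usual choice for survey-scale data.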

A Study on Korean Local Governments' Operation of Participatory Budgeting System : Classification by Support Vector Machine Technique (한국 지방자치단체의 주민참여예산제도 운영에 관한 연구 - Support Vector Machine 기법을 이용한 유형 구분)

  • Junhyun Han;Jaemin Ryou;Jayon Bae;Chunghyeok Im
    • The Journal of the Convergence on Culture Technology
    • /
    • v.10 no.3
    • /
    • pp.461-466
    • /
    • 2024
  • Korean local governments operate the participatory budgeting system autonomously. This study classifies these entities into clusters. Among diverse machine learning methodologies (Neural Network, Rule Induction (CN2), KNN, Decision Tree, Random Forest, Gradient Boosting, SVM, Naïve Bayes), the Support Vector Machine technique emerged as the most efficacious in the analysis of 2022 data on Korean municipalities. The first cluster, C1, is characterized by minimal committee activity but a substantial allocation of participatory budgeting; another cluster, C3, comprises cities that exhibit a passive stance. The majority of cities fall into the final cluster, C2, which is noted for its proactive engagement. Overall, most Korean local governments operate the participatory budgeting system in good shape. Only a small number of cities are less active in this system. We anticipate that analyzing time-series data from the past decade in follow-up studies will further enhance the reliability of classifying local government types regarding participatory budgeting.

Estimation of a Nationwide Statistics of Hernia Operation Applying Data Mining Technique to the National Health Insurance Database (데이터마이닝 기법을 이용한 건강보험공단의 수술 통계량 근사치 추정 -허니아 수술을 중심으로-)

  • Kang, Sung-Hong;Seo, Seok-Kyung;Yang, Yeong-Ja;Lee, Ae-Kyung;Bae, Jong-Myon
    • Journal of Preventive Medicine and Public Health
    • /
    • v.39 no.5
    • /
    • pp.433-437
    • /
    • 2006
  • Objectives: The aim of this study is to develop a methodology for estimating a nationwide statistic for hernia operations using the claim database of the Korea Health Insurance Corporation (KHIC). Methods: According to the insurance claim procedures, the claim database was divided into the electronic data interchange database (EDI_DB) and the sheet database (Paper_DB). Although the EDI_DB has operation and management codes showing the facts and kinds of operations, the Paper_DB does not. Using the hernia-matched management code in the EDI_DB, the cases of hernia surgery were extracted. To draw the potential cases from the Paper_DB, which does not have the code, a predictive model was developed using the data mining methodology called SEMMA. The claim sheets of the cases that showed a predictive probability of an operation over the threshold, as decided by the ROC curve, were identified in order to obtain the positive predictive value as an index of usefulness for the predictive model. Results: Of the claim databases in 2004, 14,386 cases submitted through the EDI system had hernia-related management codes. For fitting the models with the data mining technique, logistic regression was chosen rather than the neural network method or the decision tree method. From the Paper_DB, 1,019 cases were extracted as potential cases. Direct review of the sheets of the extracted cases showed that the positive predictive value was 95.3%. Conclusions: The results suggest that applying the data mining technique to the KHIC claim database for estimating nationwide surgical statistics would be useful in terms of execution and cost-effectiveness.

Fast Intra-Mode Decision for H.264/AVC using Inverse Tree-Structure (H.264/AVC 표준에서 역트리 구조를 이용하여 고속으로 화면내 모드를 결정하는 방법)

  • Ko, Hyun-Suk;Yoo, Ki-Won;Seo, Jung-Dong;Sohn, Kwang-Hoon
    • Journal of Broadcast Engineering
    • /
    • v.13 no.3
    • /
    • pp.310-318
    • /
    • 2008
  • The H.264/AVC standard achieves higher coding efficiency than previous video coding standards with the rate-distortion optimization (RDO) technique, which selects the best coding mode and reference frame for each macroblock. As a result, the complexity of the encoder has increased significantly. In this paper, a fast intra-mode decision algorithm is proposed to reduce the computational load of the intra-mode search, based on an inverse tree-structure edge prediction algorithm. First, the dominant edge of each $4{\times}4$ block is obtained from local edge information, and the RDO process is performed only for the mode that corresponds to the dominant edge direction. Then, at the $8{\times}8$ (or $16{\times}16$) block stage, the dominant edge is calculated from the dominant edges of its four $4{\times}4$ (or $8{\times}8$) blocks without additional calculation, and the RDO process is again performed only for the mode related to the dominant edge direction. Experimental results show that the proposed scheme can significantly improve the speed of intra prediction with a negligible loss in peak signal-to-noise ratio (PSNR) and a small increase in bits.

Design of Contact Scheduling System(CSS) for Customer Retention (고객유지를 위한 접촉스케줄링시스템의 설계)

  • Lee, Jee-Sik;Cho, You-Jung
    • Journal of Intelligence and Information Systems
    • /
    • v.11 no.3
    • /
    • pp.83-101
    • /
    • 2005
  • Customer retention is one of the major issues in the life insurance industry, in which competition is increasingly fierce. Life insurers must do many things to retain their customers, and one of them is to keep in touch with all customers. When an insurance planner resigns, his or her customers must be taken care of by planner-assistants. This article outlines the design of a Contact Scheduling System (CSS) that supports planner-assistants in contacting the customers. Planner-assistants are unable to share the resigned insurance planner's experience and knowledge regarding customer relationship management. The CSS, developed by employing both the Classification And Regression Tree (CART) technique and the Sequential Pattern Mining (SPM) technique, has a two-stage process. In the first stage, it segments the customers into eight groups using the CART model. Then it generates contact scheduling information consisting of contact-purpose, contact-interval and contact-channel, according to the segment's typical contact pattern. Contact-purpose is derived in a schedule-driven, event-driven, or business-rule-driven manner. Schedule-driven contact is determined by the SPM model. When operated in a realistic situation, the CSS proved practical in supporting planner-assistants to keep in touch with the customers efficiently and effectively.
