• Title/Summary/Keyword: Decision Tree Technique

Diabetes prediction mechanism using machine learning model based on patient IQR outlier and correlation coefficient (환자 IQR 이상치와 상관계수 기반의 머신러닝 모델을 이용한 당뇨병 예측 메커니즘)

  • Jung, Juho;Lee, Naeun;Kim, Sumin;Seo, Gaeun;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.10 / pp.1296-1301 / 2021
  • With the recent worldwide increase in diabetes incidence, research has been conducted on predicting diabetes with various machine learning and deep learning technologies. In this work, we present a model for predicting diabetes using machine learning techniques with data from Frankfurt Hospital, Germany. We handle outliers using the interquartile range (IQR) technique, apply Pearson correlation, and compare the diabetes prediction performance of Decision Tree, Random Forest, KNN (k-nearest neighbor), SVM (support vector machine), Bayesian Network, and the ensemble techniques XGBoost, Voting, and Stacking. XGBoost showed the best performance across the various scenarios, with 97% accuracy. This study is therefore meaningful in that the model can be used to accurately predict and prevent diabetes, which is prevalent in modern society.
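
The two preprocessing steps named in this abstract, IQR-based outlier handling and Pearson correlation, can be illustrated with a minimal sketch. It assumes the common 1.5 × IQR fence rule and the "inclusive" quartile method, neither of which the abstract specifies; the records, the column names `glucose` and `outcome`, and the helper names are hypothetical and are not taken from the Frankfurt dataset.

```python
import math
import statistics


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outlier_rows(rows, column, k=1.5):
    """Keep only the rows whose value in `column` lies inside the IQR fences."""
    lower, upper = iqr_bounds([row[column] for row in rows], k)
    return [row for row in rows if lower <= row[column] <= upper]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical toy records, invented for illustration only.
rows = [
    {"glucose": 85, "outcome": 0},
    {"glucose": 90, "outcome": 0},
    {"glucose": 95, "outcome": 0},
    {"glucose": 100, "outcome": 0},
    {"glucose": 140, "outcome": 1},
    {"glucose": 150, "outcome": 1},
    {"glucose": 160, "outcome": 1},
    {"glucose": 900, "outcome": 0},  # implausible reading, outside the fences
]

clean = drop_outlier_rows(rows, "glucose")
corr = pearson(
    [row["glucose"] for row in clean],
    [row["outcome"] for row in clean],
)
print(len(clean), corr)
```

In a pipeline of this kind, features weakly correlated with the outcome could then be dropped before training the classifiers; whether the authors filtered features this way is not stated in the abstract.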

A Best Effort Classification Model For Sars-Cov-2 Carriers Using Random Forest

  • Mallick, Shrabani;Verma, Ashish Kumar;Kushwaha, Dharmender Singh
    • International Journal of Computer Science & Network Security / v.21 no.1 / pp.27-33 / 2021
  • The whole world is now dealing with the coronavirus, which has turned out to be one of the most widespread and long-lived pandemics of our times. Reports reveal that the infectious disease has taken a toll on almost 80% of the world's population. Amid the extensive research on predicting growth and transmission through symptomatic carriers of the virus, it cannot be ignored that pre-symptomatic and asymptomatic carriers also play a crucial role in spreading the virus. Classification algorithms have been widely used to classify different types of COVID-19 carriers, ranging from simple feature-based classification to Convolutional Neural Networks (CNNs). This paper presents a novel technique that uses a Random Forest machine learning algorithm with hyper-parameter tuning to classify different types of COVID-19 carriers so that they can be accurately characterized and dealt with promptly to contain the spread of the virus. Random Forest was selected mainly because it works on the powerful concept of "the wisdom of the crowd", which produces an ensemble prediction. The results are quite convincing: the model records an accuracy score of 99.72%. The results have been compared with K-Nearest Neighbour, logistic regression, support vector machine (SVM), and Decision Tree algorithms applied to the same dataset, which recorded accuracy scores of 78.58%, 70.11%, 70.385%, and 99%, respectively, thus establishing the soundness and suitability of our approach.

Implementation of a Library Function of Scanning RSSI and Indoor Positioning Modules (RSSI 판독 라이브러리 함수 및 옥내 측위 모듈 구현)

  • Yim, Jae-Geol;Jeong, Seung-Hwan;Shim, Kyu-Bark
    • Journal of Korea Multimedia Society / v.10 no.11 / pp.1483-1495 / 2007
  • Thanks to IEEE 802.11 technology, Internet access through a wireless LAN (Local Area Network) is possible in most places, including university campuses, shopping malls, offices, hospitals, and stations. Most APs (access points) for wireless LANs support the 2.4 GHz band 802.11b and 802.11g protocols. This paper introduces a C# library function that can be used to read RSSIs (Received Signal Strength Indicators) from APs. An LBS (Location Based Service) estimates the current location of the user and provides useful location-based services such as navigation and points of interest. An indoor LBS is therefore very desirable. However, an indoor LBS cannot be realized unless indoor positioning is possible. For indoor positioning, techniques using infrared, ultrasound, and the signal strength of UDP packets have been proposed. One disadvantage of these techniques is that they require special equipment dedicated to positioning. Wireless LAN-based indoor positioning, on the other hand, requires no special equipment and is more economical. Wireless LAN-based positioning cannot be realized without reading RSSIs from APs, so our C# library function is expected to be widely used in the field of indoor positioning. In addition to providing a C# library function for reading RSSI, this paper introduces an implementation of indoor positioning modules that make use of the library function. The methods used in the implementation are K-NN (K Nearest Neighbors), Bayesian, and trilateration. K-NN and Bayesian are fingerprinting methods. A fingerprinting method consists of an off-line phase and a real-time phase, and the processing time of the real-time phase must be short. This paper proposes a decision tree method to improve the processing time of the real-time phase. Experimental results comparing the performance of these methods are also discussed.

A case study on an optimal analysis technique of primary measurements for safety management of fill dam (필댐의 안전관리를 위한 주요 계측 데이터의 최적 분석기법에 대한 사례 연구)

  • Jeon, Hyeoncheol;Yun, Seong-Kyu;Kim, Jiseong;Im, En-Sang;Kang, Gichun
    • Journal of Korea Water Resources Association / v.54 no.spc1 / pp.1155-1166 / 2021
  • In this study, statistical analysis was performed to suggest optimal analysis techniques for the main measuring instruments of a fill dam: seepage, crest settlement, and porewater pressure gauges. In addition, correlation analysis with water level and rainfall data was performed. Based on the descriptive statistics for each instrument, the porewater pressure gauges could be classified into two or three groups through principal component analysis. In the group highly correlated with the water level instrument, the correlation between the gauges was also large. The seepage data showed an extremely asymmetric distribution, so for quantitative analysis the average seepage during non-precipitation and precipitation periods was estimated through decision tree analysis. For the crest settlement instruments, correlation analysis showed a large correlation between the gauges but no significant linear relationship with the water level instrument, so EMD analysis was performed to examine it in more detail. It is judged that principal component analysis, decision tree analysis, and data filtering methods can be applied to analyze the behavior of the porewater pressure gauges, seepage, and crest settlement instruments, the major measurement items of a fill dam.

Development and Validation of 18F-FDG PET/CT-Based Multivariable Clinical Prediction Models for the Identification of Malignancy-Associated Hemophagocytic Lymphohistiocytosis

  • Xu Yang;Xia Lu;Jun Liu;Ying Kan;Wei Wang;Shuxin Zhang;Lei Liu;Jixia Li;Jigang Yang
    • Korean Journal of Radiology / v.23 no.4 / pp.466-478 / 2022
  • Objective: 18F-fluorodeoxyglucose (FDG) PET/CT is often used for detecting malignancy in patients with newly diagnosed hemophagocytic lymphohistiocytosis (HLH), with acceptable sensitivity but relatively low specificity. The aim of this study was to improve the diagnostic ability of 18F-FDG PET/CT in identifying malignancy in patients with HLH by combining 18F-FDG PET/CT and clinical parameters. Materials and Methods: Ninety-seven patients (age ≥ 14 years) with secondary HLH were retrospectively reviewed and divided into the derivation (n = 71) and validation (n = 26) cohorts according to admission time. In the derivation cohort, 22 patients had malignancy-associated HLH (M-HLH) and 49 patients had non-malignancy-associated HLH (NM-HLH). Data on pretreatment 18F-FDG PET/CT and laboratory results were collected. The variables were analyzed using the Mann-Whitney U test or Pearson's chi-square test, and a nomogram for predicting M-HLH was constructed using multivariable binary logistic regression. The predictors were also ranked using decision-tree analysis. The nomogram and decision tree were validated in the validation cohort (10 patients with M-HLH and 16 patients with NM-HLH). Results: The ratio of the maximal standardized uptake value (SUVmax) of the lymph nodes to that of the mediastinum, the ratio of the SUVmax of bone lesions or bone marrow to that of the mediastinum, and age were selected for constructing the model. The nomogram showed good performance in predicting M-HLH in the validation cohort, with an area under the receiver operating characteristic curve of 0.875 (95% confidence interval, 0.686-0.971). At an appropriate cutoff value, the sensitivity and specificity for identifying M-HLH were 90% (9/10) and 68.8% (11/16), respectively. The decision tree integrating the same variables showed 70% (7/10) sensitivity and 93.8% (15/16) specificity for identifying M-HLH. In comparison, visual analysis of 18F-FDG PET/CT images demonstrated 100% (10/10) sensitivity and 12.5% (2/16) specificity. Conclusion: 18F-FDG PET/CT may be a practical technique for identifying M-HLH. The model constructed using 18F-FDG PET/CT features and age was able to detect malignancy with better accuracy than visual analysis of 18F-FDG PET/CT images.
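
The sensitivity and specificity figures in this abstract follow directly from the validation-cohort counts it reports (10 M-HLH and 16 NM-HLH patients). The sketch below recomputes them; the overall accuracy values it prints are derived here from those same counts and are not figures reported in the abstract, and the function and variable names are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual positives that were correctly identified."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of actual negatives that were correctly identified."""
    return true_neg / (true_neg + false_pos)


def accuracy(true_pos, false_neg, true_neg, false_pos):
    """Fraction of all cases that were classified correctly."""
    total = true_pos + false_neg + true_neg + false_pos
    return (true_pos + true_neg) / total


# Validation-cohort counts as (TP, FN, TN, FP), read off the fractions
# quoted in the abstract, e.g. 9/10 sensitivity means TP=9 and FN=1.
methods = {
    "nomogram": (9, 1, 11, 5),
    "decision tree": (7, 3, 15, 1),
    "visual analysis": (10, 0, 2, 14),
}

results = {}
for name, (tp, fn, tn, fp) in methods.items():
    results[name] = (
        sensitivity(tp, fn),
        specificity(tn, fp),
        accuracy(tp, fn, tn, fp),
    )
    sens, spec, acc = results[name]
    print(f"{name}: sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

The derived accuracies are consistent with the stated conclusion: visual analysis catches every malignancy but misclassifies most non-malignant cases, so both models classify more of the 26 validation patients correctly.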

FPGA Implementation of an FDTrS/DF Signal Detector for High-density DVD System (고밀도 DVD 시스템을 위한 FDTrS/DF 신호 검출기의 FPGA 구현)

  • 정조훈
    • The Journal of Korean Institute of Communications and Information Sciences / v.25 no.10B / pp.1732-1743 / 2000
  • In this paper, a fixed-delay trellis search with decision feedback (FDTrS/DF) for high-density DVD systems (4.7-15 GB) is proposed and implemented on an FPGA. The proposed FDTrS/DF is derived by transforming the binary tree search structure into a trellis search structure, which implies that FDTrS/DF performs better than signal detection techniques based on the tree search structure, such as FDTS/DF and SSD/DF. The advantage of FDTrS/DF is a significant reduction in hardware complexity, owing to the unique structure of FDTrS, which consists of only one trellis stage and requires no traceback procedure of the kind usually implemented in the Viterbi detector. In addition, the PDFS/DF and SSD/DF originally proposed for high-density magnetic recording systems are modified for the DVD system and compared with the proposed FDTrS/DF. To increase speed in the FPGA implementation, pipelining and the absolute branch metric (instead of the squared branch metric) are applied. The proposed FDTrS/DF is shown to provide the best performance among various signal detection techniques such as PRML, DFE, FDTS/DF, and SSD/DF, even with low hardware complexity.

Fault Pattern Analysis and Restoration Prediction Model Construction of Pole Transformer Using Data Mining Technique (데이터마이닝 기법을 이용한 주상변압기 고장유형 분석 및 복구 예측모델 구축에 관한 연구)

  • Hwang, Woo-Hyun;Kim, Ja-Hee;Jang, Wan-Sung;Hong, Jung-Sik;Han, Deuk-Su
    • The Transactions of The Korean Institute of Electrical Engineers / v.57 no.9 / pp.1507-1515 / 2008
  • To supply stable electricity, it is essential for electric power companies to have a system for quickly restoring faulted pole transformers, which account for most transformers. However, restoration takes too long when a transformer suddenly fails, because we currently rely on operators to investigate the causes of failure and decide on recovery methods. This paper presents the concept of 'Fault pattern analysis and Restoration prediction model using Data mining techniques', which is based on the accumulated fault records of pole transformers. To this end, it also suggests external and internal causes of faults that influence the fault patterns of pole transformers. By using the results, we expect to reduce not only manufacturing defects, through quality improvement, but also the time needed to predict fault patterns and to recover when faults occur.

The Multi-Agent Simulation of Archaic State Formation (다중 에이전트 기반의 고대 국가 형성 시뮬레이션)

  • S. Kim;A. Lazar;R.G. Reynolds
    • Proceedings of the Korea Society for Simulation Conference / 2003.06a / pp.91-100 / 2003
  • In this paper, we investigate the role that warfare played in the formation of the network of alliances between sites that are associated with the formation of the state in the Valley of Oaxaca, Mexico. A model of state formation proposed by Marcus and Flannery (1996) is used as the basis for an agent-based simulation model. Agents reside in sites, and their actions are constrained by knowledge extracted from the Oaxaca Surface Archaeological Survey (Kowalewski 1989). The simulation is run with two different sets of constraint rules for the agents. The first set is based on the raw data collected in the surface survey. It covers a total of 79 sites and represents a minimal level of warfare (raiding) in the Valley. The other set represents the generalization of these constraints to sites with similar locational characteristics. This set corresponds to 987 sites and represents a much more active role for warfare in the Valley. The rules were produced by a data mining technique, Decision Trees, guided by Genetic Algorithms. Simulations were run using the two rule sets and compared with each other and with the archaeological data for the Valley. The results strongly suggest that warfare was a necessary process in the aggregation of resources needed to support the emergence of the state in the Valley.

A Personalized Hand Gesture Recognition System using Soft Computing Technique (소프트 컴퓨팅 기법을 이용한 개인화된 손동작 인식 시스템)

  • Jeon, Mun-Jin;Do, Jun-Hyeong;Lee, Sang-Wan;Park, Gwang-Hyeon;Byeon, Jeung-Nam
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2007.11a / pp.127-130 / 2007
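  • Recently, vision-based hand gesture recognition technology has advanced, allowing elderly and disabled people with impaired lower limbs to easily control various home appliances. When multiple users share a single hand gesture recognition system, the recognition rate for particular users degrades because each user's hand gesture characteristics differ. Moreover, even for the same user, hand gesture characteristics can change over time. Gesture characteristics that differ from user to user can be handled effectively with model learning and selection techniques. User characteristics that change over time can be handled effectively with fuzzy concepts. In this paper, we present a method for building per-user recognition models using multivariate fuzzy decision trees. In addition, when a new user uses the system, the most suitable model is selected and used for recognition, and the recognition rate is measured.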

Environmental Consciousness Data Modeling by Association Rules

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Journal of the Korean Data and Information Science Society / v.16 no.3 / pp.529-538 / 2005
  • Data mining is a method for finding useful information in large amounts of data in a database. It is used to discover hidden knowledge, unexpected patterns, relations, and new rules in massive data. Data mining methods include association rules, decision trees, clustering, neural networks, and so on. Association rule mining searches for interesting relationships among items in a given large data set. Association rules are frequently used by retail stores to assist in marketing, advertising, floor placement, and inventory control. There are three primary quality measures for association rules: support, confidence, and lift. We analyze the Gyeongnam social indicator survey data using the association rule technique for environmental information discovery. The association rule outputs can be used for environmental preservation and environmental improvement.
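
The three quality measures named in this abstract have standard definitions, which the following minimal sketch computes for a single rule. The transactions are hypothetical survey-style responses invented for illustration; they are not taken from the Gyeongnam social indicator survey, and the item and function names are likewise assumptions.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) = supp(A and C) / supp(A)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """Confidence divided by the consequent's support; above 1 means positive association."""
    return confidence(transactions, antecedent, consequent) / support(
        transactions, consequent
    )


# Hypothetical respondents, each a set of reported behaviours.
transactions = [
    {"recycles", "saves_energy"},
    {"recycles", "saves_energy", "uses_transit"},
    {"recycles"},
    {"saves_energy"},
    {"uses_transit"},
]

rule = ({"recycles"}, {"saves_energy"})
rule_support = support(transactions, rule[0] | rule[1])
rule_confidence = confidence(transactions, *rule)
rule_lift = lift(transactions, *rule)
print(rule_support, rule_confidence, rule_lift)
```

A lift only slightly above 1, as in this toy data, indicates a weak positive association; rule mining tools typically keep only rules that clear minimum support and confidence thresholds chosen by the analyst.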
