• Title/Summary/Keyword: Decision Tree analysis

Analysis for Changes of Mode Choice Behavior from Providing Real-time Schedule for Public Transportation by Smartphone Application (스마트폰 애플리케이션을 이용한 대중교통 운행정보 제공에 따른 통행자 수단선택 행태변화 분석)

  • Choi, Sung-Taek;Rho, Jeong-Hyun
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.11 no.6 / pp.60-69 / 2012
  • Public transport information services delivered through smartphone apps have received attention as a way to mitigate transport problems. Smartphones can offer real-time information through LBS (Location-Based Service). This study seeks to identify which factors affect the mode share of public transport, with a focus on smartphone apps. In a paired comparison, rising oil prices, traffic congestion, public transport information service via smartphone apps, and BIS (Bus Information System) scored 0.39, 0.27, 0.18, and 0.16, respectively. Younger respondents and students preferred the smartphone public transport information service. The decision tree shows that the most important decision factor is the smartphone information service.

Data-Driven Analysis for Future Construction Prediction : Case Study on Seoul (서울시 데이터 기반 필지별 건축행위 발생 예측)

  • Yun, Sung-Bum;Kim, Tae Hyun
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2019.11a / pp.7-8 / 2019
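  • The continued aging of buildings and the shortage of developable land drive the reconstruction of existing buildings and new construction on available sites. In Seoul, about 25,000 new buildings were constructed over the past five years, and various support systems, including new policies, are being put in place in response. This study aims to build a model that predicts which parcels will see new construction, using parcel-level construction activity data from 2011 to 2015 together with 43 additional variables. The prediction model was built with CART (Classification And Regression Tree), a type of decision tree, which is a machine learning method that can also derive explanatory factors. It achieved an accuracy of 86.28% and identified four major factors associated with new construction.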
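
As a rough illustration of how CART chooses a split, the sketch below scans one numeric feature for the threshold that minimizes weighted Gini impurity. It is a minimal sketch, not the authors' model: the feature (building age), the toy values, and the function names `gini` and `best_split` are all hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best binary split on one feature.

    If no split improves on the unsplit node, the threshold is None.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # identical values cannot be separated by a threshold
        threshold = (lo + hi) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical toy data: building age in years vs. new construction (1) or not (0).
building_age = [5, 12, 18, 31, 36, 42]
new_construction = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(building_age, new_construction)
print(threshold, impurity)
```

A full CART implementation repeats this search over every feature and recurses into each child node until a stopping rule is met.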

A Study on the Turbidity Estimation Model Using Data Mining Techniques in the Water Supply System (데이터마이닝 기법을 이용한 상수도 시스템 내의 탁도 예측모형 개발에 관한 연구)

  • Park, No-Suk;Kim, Soonho;Lee, Young Joo;Yoon, Sukmin
    • Journal of Korean Society of Environmental Engineers / v.38 no.2 / pp.87-95 / 2016
  • Turbidity is a key indicator to users of the 'discolored water' phenomenon, which is known to be caused by corrosion of pipelines in the water supply system. 'Discolored water' is defined as water whose turbidity is high enough for users to recognize it visually. This study therefore used data mining techniques to estimate turbidity changes in the water supply system. Among these techniques, decision tree analysis was applied to develop the estimation models, with pH and residual chlorine data as input variables. Models using both variables (pH and residual chlorine) produced more reasonable estimates than models using either variable alone. However, the model underestimated the observed peak values. To overcome this, a high-pass filter was introduced as a preprocessing step. The modified model with the high-pass filter predicted the observed peaks more accurately and showed better overall prediction performance than the conventional model.

Analysis on Geographical Variations of the Prevalence of Hypertension Using Multi-year Data (다년도 자료를 이용한 고혈압 유병률의 지역간 변이 분석)

  • Kim, Yoomi;Cho, Daegon;Hong, Sungok;Kim, Eunju;Kang, Sunghong
    • Journal of the Korean Geographical Society / v.49 no.6 / pp.935-948 / 2014
  • As chronic diseases have become more prevalent and problematic, effective care for major chronic diseases has become a focus of healthcare policy. In this regard, this study examines how region-specific characteristics affect the prevalence of hypertension in South Korea. For the analysis, we combined a unique multi-year data set that includes key indicators of health conditions and health behaviors for 237 small administrative districts. The data were collected from the Annual Community Health Survey between 2009 and 2011 by the Korea Centers for Disease Control and Prevention and from other government organizations. To investigate regional variations, we estimated Geographically Weighted Regression (GWR) and decision tree models. Our findings first suggest that multi-year data are more appropriate than single-year data for the geographical analysis of chronic diseases, because significant annual differences are observed in most variables. We also find that the prevalence of hypertension tends to be positively associated with the prevalence of diabetes and obesity but negatively associated with population density. More importantly, the GWR results show noticeable geographical variations in these factors. In line with this result, additional findings from the decision tree model suggest that the primary factors influencing hypertension prevalence are indeed heterogeneous across regional groups. Taken as a whole, accounting for geographical variations in health conditions, health behaviors, and other socioeconomic factors is very important when implementing regionally customized healthcare policy to reduce hypertension prevalence. In short, our study sheds light on possible ways for local government policy makers to manage chronic diseases.

Development of Predictive Model for Length of Stay(LOS) in Acute Stroke Patients using Artificial Intelligence (인공지능을 이용한 급성 뇌졸중 환자의 재원일수 예측모형 개발)

  • Choi, Byung Kwan;Ham, Seung Woo;Kim, Chok Hwan;Seo, Jung Sook;Park, Myung Hwa;Kang, Sung-Hong
    • Journal of Digital Convergence / v.16 no.1 / pp.231-242 / 2018
  • Efficient management of length of stay (LOS) is important in hospitals: it reduces medical costs for patients and increases profitability for hospitals. To manage LOS efficiently, it is necessary to develop an artificial intelligence-based prediction model that supports hospitals in benchmarking and in finding ways to reduce LOS. To develop a predictive model of LOS for acute stroke patients, acute stroke cases were extracted from 2013 and 2014 discharge injury patient data. The data were split into 60% for training and 40% for evaluation. For model development, we used a traditional regression technique (multiple regression analysis), artificial intelligence techniques (interactive decision tree and neural network), and an ensemble technique that integrates all of them. Models were evaluated with the Root ASE (Absolute error) index, which was 23.7 for multiple regression, 23.7 for the interactive decision tree, 22.7 for the neural network, and 22.7 for the ensemble technique. Based on this evaluation, the neural network, an artificial intelligence technique, was found to be superior. This demonstrates the utility of artificial intelligence in developing LOS prediction models. Further research is needed on how to use artificial intelligence techniques more effectively in developing LOS prediction models.
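
The 60/40 split and the evaluation index described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract glosses ASE as "Absolute error", whereas this sketch assumes the more common reading, average squared error, so `root_ase` is its square root. The function names and the toy LOS values are hypothetical and not taken from the paper.

```python
import math
import random


def split_60_40(rows, seed=0):
    """Shuffle rows reproducibly and return (training, evaluation) as a 60/40 split."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * 0.6))
    return shuffled[:cut], shuffled[cut:]


def root_ase(actual, predicted):
    """Square root of the average squared error (assumed meaning of Root ASE)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical patient identifiers, split into training and evaluation sets.
training, evaluation = split_60_40(range(10))

# Hypothetical observed vs. predicted LOS in days for four evaluation cases.
observed_los = [10, 20, 30, 40]
predicted_los = [12, 18, 33, 36]
score = root_ase(observed_los, predicted_los)
print(len(training), len(evaluation), round(score, 3))
```

Because the index is expressed in the same unit as LOS (days), lower values mean the predictions sit closer to the observed stays.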

Study on Predicting the Designation of Administrative Issue in the KOSDAQ Market Based on Machine Learning Based on Financial Data (머신러닝 기반 KOSDAQ 시장의 관리종목 지정 예측 연구: 재무적 데이터를 중심으로)

  • Yoon, Yanghyun;Kim, Taekyung;Kim, Suyeong
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.17 no.1 / pp.229-249 / 2022
  • This paper investigates machine learning models for predicting the designation of administrative issues in the KOSDAQ market using various techniques. When a company in the Korean stock market is designated as an administrative issue, the market treats the event itself as negative information, causing losses to the company and its investors. The purpose of this study is to evaluate alternative methods for developing an artificial intelligence service that uses companies' financial ratios to detect the possibility of designation early and helps investors manage portfolio risk. The independent variables are 21 financial ratios representing profitability, stability, activity, and growth. Financial data of administrative-issue and non-administrative-issue companies were sampled from 2011 to 2020, the period in which K-IFRS was applied. Logistic regression, decision tree, support vector machine, random forest, and LightGBM were used to predict designation. LightGBM was the best prediction model, with 82.73% classification accuracy, and the decision tree had the lowest classification accuracy, at 71.94%. Among the top three variables by importance in the decision tree-based learning models, the financial variables common to every model were ROE (net profit) and the capital stock turnover ratio, which are relatively important variables in the designation of administrative issues. Overall, the ensemble learning models showed higher predictive performance than the single learning models.

Severe Accident Management Using PSA Event Tree Technology

  • Choi, Young;Jeong, Kwang Sub;Park, SooYong
    • International Journal of Safety / v.2 no.1 / pp.50-56 / 2003
  • Severe accident phenomena and scenarios in nuclear power plants (NPPs) involve many uncertainties, and reducing these uncertainties is one of the major issues in severe accident management. A severe accident management aid system using Probabilistic Safety Assessment (PSA) technology was developed for management staff in order to reduce the uncertainties. The system includes a graphical display of plant and equipment status, previous research results accessed through a knowledge-base technique, and the expected plant behavior derived from PSA. The plant model used in this paper is designed to identify plant responses and vulnerabilities by analyzing the quantified results, and to set up a framework for an accident management program based on those results. The developed system may therefore play a central role as an information source for decision-making in severe accident management, and will be used as a training tool for severe accident management.

A Study of cost data modeling for Megaproject (메가프로젝트 원가 자료 분석에 관한 연구)

  • Ji, Seong-Min;Cho, Jae-Kyung;Hyun, Chang-Taek
    • Proceedings of the Korean Institute of Building Construction Conference / 2009.11a / pp.253-256 / 2009
  • For a megaproject that includes various complex facilities to succeed, a database system needs to be established. Developments in data collection, storage, and extraction technology have enabled iPMIS to manage varied and complex information about cost and time. In particular, for the go/no-go decision at the feasibility stage, cost is an important and clear criterion in a megaproject. Cost data modeling is therefore the basis of the system and a necessary process. This research focuses on the structure and definition of CBS data collected from sites. We used four tools, Function Analysis from VE, the Causal Loop Diagram from System Dynamics, the Decision Tree from data mining, and Normalization in SQL, to identify cause-and-effect relationships in the CBS data. Cost data modeling provides iPMIS with a helpful guideline.

A Study on Reliability Centered Maintenance (통합신뢰성 경영에서 보전에 중점을 둔 신뢰성에 관한 연구)

  • Kim, Hwan-Joong
    • Journal of Applied Reliability / v.3 no.1 / pp.73-82 / 2003
  • Reliability Centered Maintenance (RCM) was initially developed for the commercial aviation industry in the late 1960s and is now equally applicable to a variety of equipment other than aircraft. RCM is a method for establishing a preventive maintenance program that efficiently and effectively achieves the required safety and availability levels of equipment and structures. RCM uses a decision logic tree to identify applicable and effective preventive maintenance requirements for equipment and structures according to the safety, operational, and economic consequences of identifiable failures and the degradation mechanisms responsible for those failures. The end result of working through the decision logic is a judgement as to the necessity of performing a maintenance task. In this paper, we provide guiding principles, based on IEC 60300-3-11, for RCM analysis methods and for the operation of structures and equipment.

Performance analysis and comparison of various machine learning algorithms for early stroke prediction

  • Vinay Padimi;Venkata Sravan Telu;Devarani Devi Ningombam
    • ETRI Journal / v.45 no.6 / pp.1007-1021 / 2023
  • Stroke is the leading cause of permanent disability in adults, and it can cause permanent brain damage. According to the World Health Organization, 795 000 Americans experience a new or recurrent stroke each year. Early detection of medical disorders such as stroke can minimize their disabling effects. In this paper, we therefore consider various risk factors that contribute to the occurrence of stroke and apply machine learning algorithms, for example, the decision tree, random forest, and naive Bayes algorithms, to patient characteristics survey data to achieve high prediction accuracy. We also consider the semisupervised self-training technique to predict the risk of stroke. We then consider the near-miss undersampling technique, which keeps only those instances of the larger class that lie close to instances of the smaller class. Experimental results demonstrate that the proposed method obtains an accuracy of approximately 98.83% at low cost, which is significantly higher and more reliable than that of the techniques compared.
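
The near-miss undersampling idea mentioned above can be sketched in a few lines. The abstract does not say which variant the authors used, so this example assumes the NearMiss-1 rule (keep the majority-class points with the smallest mean distance to their k nearest minority-class points). The function name `near_miss_1` and the 2-D toy points are hypothetical.

```python
import math


def near_miss_1(majority, minority, n_keep, k=3):
    """Keep the n_keep majority points closest, on average, to their k nearest minority points."""
    if not minority:
        raise ValueError("minority class must contain at least one point")

    def mean_k_nearest(point):
        # Distances from this majority point to every minority point, smallest first.
        distances = sorted(math.dist(point, m) for m in minority)
        nearest = distances[:k]
        return sum(nearest) / len(nearest)

    ranked = sorted(majority, key=mean_k_nearest)
    return ranked[:n_keep]


# Hypothetical 2-D feature vectors: two minority (stroke) and five majority (no stroke) cases.
minority = [(0.0, 0.0), (1.0, 0.0)]
majority = [(0.5, 0.5), (5.0, 5.0), (0.0, 1.0), (9.0, 9.0), (2.0, 0.0)]

# Undersample the majority class down to the size of the minority class.
kept = near_miss_1(majority, minority, n_keep=len(minority), k=2)
print(kept)
```

Undersampling this way balances the classes while retaining the majority cases nearest the class boundary, at the cost of discarding the rest of the majority data.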