• Title/Abstract/Keyword: Decision Tree analysis

Predicting stock price direction by using data mining methods : Emphasis on comparing single classifiers and ensemble classifiers

  • Eo, Kyun Sun;Lee, Kun Chang
    • Journal of the Korea Society of Computer and Information / v.22 no.11 / pp.111-116 / 2017
  • This paper proposes a data mining approach to predicting stock price direction. The stock market fluctuates due to many factors, so predicting stock price direction has become an important issue in stock market analysis. However, few studies in the literature apply data mining approaches to predicting stock price direction. To contribute to the literature, this paper compares single classifiers and ensemble classifiers. The single classifiers are logistic regression, decision tree, neural network, and support vector machine. The ensemble classifiers considered are AdaBoost, random forest, bagging, stacking, and vote. For the experiments, we gathered a dataset from the Korea Stock Exchange (KRX) covering 2008 to 2015. Data mining experiments using WEKA revealed that random forest, one of the ensemble classifiers, shows the best results in terms of metrics such as AUC (area under the ROC curve) and accuracy.
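
The two metrics named in this abstract, accuracy and AUC, can be illustrated with a minimal sketch. The labels and scores below are hypothetical toy values, not data from the paper, and the function names are chosen only for illustration; the paper itself reports using WEKA.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the 0/1 label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    Equals the probability that a randomly chosen positive example is
    scored above a randomly chosen negative one (ties count as half).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = price went up, 0 = price went down.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.2, 0.6]

toy_accuracy = accuracy(labels, scores)  # 4 of 6 correct at threshold 0.5
toy_auc = auc(labels, scores)            # 7 of 9 positive/negative pairs ordered correctly
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two metrics are often reported together.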

A Study on RCM Approach to Catenary System of Electric Railway (전기철도 가공전차선로의 신뢰성 기반 유지보수(RCM)에 관한 연구)

  • Youn, Eung-Kyu;Choi, Kyu-Hyoung
    • The Transactions of The Korean Institute of Electrical Engineers / v.65 no.8 / pp.1457-1465 / 2016
  • An RCM approach to maintenance of the catenary system of electric railways is proposed. The proposed approach provides a maintenance-oriented FMECA procedure that derives critical failure modes by analyzing failure effects, and an RCM decision logic tree that suggests optimal maintenance work for the derived failure modes. Applying the proposed RCM procedures to the catenary system of a high-speed railway suggests that strand breaks of the dropper and voltage equalizing wire, and trolley wire wear-out, are the critical failure modes for which maintenance based on condition monitoring should be applied instead of conventional time-based preventive maintenance. Reliability analysis also suggests that the replacement time of the dropper can be reduced from 18 years to 10 years. These results show that the proposed RCM approach can optimize the maintenance procedures of the catenary system.

POSTTS : Corpus Based Korean TTS based on Natural Language Analysis (POSTTS : 자연어 분석을 통한 코퍼스 기반 한국어 TTS)

  • Ha Ju-Hong;Zheng Yu;Kim Byeongchang;Lee Geunbae
    • Proceedings of the KSPS conference / 2003.05a / pp.87-90 / 2003
  • To produce high-quality synthesized speech, it is very important to obtain accurate grapheme-to-phoneme conversion and an accurate prosody model from text using natural language processing. Robust preprocessing for non-Korean characters is also required. In this paper, we analyzed Korean texts using a morphological analyzer, a part-of-speech tagger, and a syntactic chunker. We present a new grapheme-to-phoneme conversion method, a hybrid of dictionary-based and rule-based methods, for unlimited-vocabulary Korean TTS. We constructed a prosody model using a probabilistic method and a decision tree-based method.

Automatic analysis of Heart Rate Variability of a tangible game user on NUI space (NUI 공간에서 체감형 게임을 통한 사용자의 심박변이도 자동분석)

  • Lee, Hyun-Ju;Shin, Dong-Il;Shin, Dong-Kyoo
    • Proceedings of the Korea Information Processing Society Conference / 2013.11a / pp.1689-1692 / 2013
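  • NUI (Natural User Interface) is a technology that lets users interact with an interface using parts of their body. In this study, a tangible game was run in an NUI space. The game is a taekwondo game in which the user spars with the computer, and the user's electrocardiogram (ECG) signal was measured during play. The user's ECG data is transmitted to a user profile during the game. The user's heart rate variability was classified from the received ECG signal, a classifier experiment was carried out, and the accuracy was measured. The experiment was divided into the states before and after playing the tangible game, and a Decision Tree was used as the classifier. In the results, the accuracy for heart rate variability was 4.16% higher after the game.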

Hybrid Model Based Intruder Detection System to Prevent Users from Cyber Attacks

  • Singh, Devendra Kumar;Shrivastava, Manish
    • International Journal of Computer Science & Network Security / v.21 no.4 / pp.272-276 / 2021
  • At present, online and offline users face cyber attacks every day. These attacks affect users' performance, resources, and various daily activities. Given this critical situation, attention must be given to protecting such users from cyber attacks. The objective of this paper is to improve intrusion detection systems (IDS) by using a machine learning approach to develop a hybrid model that controls cyber attacks. The hybrid model uses the available KDD 1999 intrusion detection dataset. In the first step, the hybrid model performs feature optimization by removing unimportant features of the dataset through decision tree, support vector machine, genetic algorithm, particle swarm optimization, and principal component analysis techniques. In the second step, the hybrid model finds the minimum number of features needed for accurate detection of cyber attacks. The hybrid model was developed using machine learning algorithms such as PSO, GA, and ELM, which trained the system on the available data to perform predictions. The hybrid model achieved an accuracy of 99.94%, which suggests that it may be highly useful for protecting users from cyber attacks.

A Study on Methods to Prevent Pima Indians Diabetes using SVM

  • YOU, Sanghyuck;KANG, Minsoo
    • Korean Journal of Artificial Intelligence / v.8 no.2 / pp.7-10 / 2020
  • In this paper, a study was conducted to find the main factors of Pima Indians diabetes based on machine learning. Diabetes is a metabolic disease involving insufficient secretion of insulin or its inability to function normally, and is characterized by a high blood glucose concentration. According to a situation report from the WHO (World Health Organization), diabetes is a chronic metabolic disease characterized by elevated levels of blood glucose (or blood sugar), which leads over time to serious damage to the heart, blood vessels, eyes, kidneys, and nerves. About 422 million people worldwide have diabetes, the majority living in low- and middle-income countries, and 1.6 million deaths are directly attributed to diabetes each year. Both the number of cases and the prevalence of diabetes have been steadily increasing over the past few decades. Therefore, in this study, we used Support Vector Machine (SVM), Decision Tree, and correlation analysis to discover three important factors that predict Pima Indians diabetes with 70% accuracy. By applying the results suggested in this paper, doctors can quickly diagnose potential Pima Indians diabetics and prevent Pima Indians diabetes.

Mobile health service user characteristics analysis and churn prediction model development (모바일 헬스 서비스 사용자 특성 분석 및 이탈 예측 모델 개발)

  • Han, Jeong Hyeon;Lee, Joo Yeoun
    • Journal of the Korean Society of Systems Engineering / v.17 no.2 / pp.98-105 / 2021
  • As average life expectancy rises, the population is aging and chronic diseases are increasing. This has raised the importance of healthy living and health management, and interest in mobile health services is growing thanks to the development of ICT (information and communication technologies) and the spread of smartphone use. To meet this interest, many mobile services related to daily health are being launched in the market. Therefore, in this study, the characteristics of users who actually use mobile health services were analyzed, and a predictive model based on machine learning was developed. As a result, we developed a prediction model that applies decision tree and ensemble methods. We also found that continued use of a mobile health service can be encouraged by providing features that require frequent visits, suggesting achievable activity missions, and guiding users through connecting a sensor to measure their activity.

Electrical equipment pattern analysis using Class Activation Map (Class Activation Map을 활용한 전력 설비 패턴의 주요원인 분석)

  • Jang, Young-Jun;Kim, Ji-Ho;Choi, Young-Jin;Lee, Hong-Chul
    • Proceedings of the Korean Society of Computer Information Conference / 2021.07a / pp.75-77 / 2021
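  • To raise the efficiency of power generation and support continuous process management, it is important to analyze patterns in power equipment data and find the key variables that cause them. Therefore, in this study, we clustered the data to analyze patterns in power equipment data and classified the patterns using Decision Tree, Random Forest, and ResNet. Using Class Activation Map, we identified the key variables responsible for the patterns in the equipment data. Through this study, we aim to present an integrated solution that enables both classification and cause analysis of power equipment data.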

Decision tree based obesity and metabolic syndrome data classification and feature importance analysis (의사결정나무 기반 비만과 대사증후군 데이터 분류와 특징 중요도 분석)

  • Lee, Jongwook;Kim, Youngho;Baek, Byunghyun;Hwang, Doosung
    • Proceedings of the Korea Information Processing Society Conference / 2021.11a / pp.880-883 / 2021
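  • Obesity is a risk factor that causes various complications and threatens the health of modern people. Factors that affect obesity include genetic influence, eating habits, and physical activity. As the obese population has grown, the incidence of metabolic syndrome has risen. Metabolic syndrome is accompanied by several adult diseases such as obesity, hyperlipidemia, and hypertension. Detecting the factors that distinguish obesity and metabolic syndrome requires analysis of individuals' physical and lifestyle information. In this paper, we classify obesity and metabolic syndrome using a decision tree and analyze the importance of the features used in classification. In the obesity analysis, weight and height showed high feature importance, while for metabolic syndrome, HDL, waist circumference, blood pressure, and age showed high feature importance.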
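
As a rough illustration of how a decision tree can rank features, the sketch below scores each feature by the largest Gini impurity decrease a single threshold split can achieve. This is a simplification that looks only at the root split, not the paper's procedure; the records, feature names, and function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_gini_decrease(values, labels):
    """Largest impurity decrease achievable by one threshold on one feature."""
    n = len(labels)
    parent = gini(labels)
    best = 0.0
    # Try every observed value except the largest as a "<= threshold" split.
    for threshold in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - weighted)
    return best


# Hypothetical toy records (not from the paper): 1 = obese, 0 = not obese.
features = {
    "weight_kg": [55, 62, 70, 88, 95, 102],
    "age_years": [25, 60, 33, 58, 29, 41],
}
labels = [0, 0, 0, 1, 1, 1]

importance = {name: best_gini_decrease(col, labels) for name, col in features.items()}
ranking = sorted(importance, key=importance.get, reverse=True)
```

In this toy data a single weight threshold separates the classes perfectly, so `weight_kg` ranks first. A full tree would accumulate such decreases over every node where a feature is used.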

A Study on Pattern Analysis of Sustainability Management Using Fuzzy ID3 (퍼지 ID3를 이용한 지속가능경영의 패턴분석에 관한 연구)

  • Kim, Hong-Jin;Hwang, Seung-Gook
    • Journal of the Korean Institute of Intelligent Systems / v.18 no.5 / pp.700-705 / 2008
  • In this paper, a model for evaluating the sustainability management of small and medium-sized enterprises is suggested. The if-then rules and the decision tree for pattern analysis, obtained by applying fuzzy ID3 to sustainability management data, are also shown. The suggested model can be used as an evaluation tool for increasing the competitiveness of enterprises. If enterprises recognize that the evaluation rules derived from sustainability management pattern analysis using fuzzy ID3 can be put to use, they are expected to be able to use the rules effectively for self-evaluation.
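
For readers unfamiliar with ID3, the sketch below shows the attribute-selection step of classic (crisp) ID3, which picks the attribute with the highest information gain. The fuzzy extension used in the paper is not reproduced here, and the attribute names and records are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction from partitioning rows by one categorical attribute."""
    labels = [row[target] for row in rows]
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy records (not from the paper).
rows = [
    {"environment": "high", "governance": "high", "rating": "good"},
    {"environment": "high", "governance": "low", "rating": "good"},
    {"environment": "low", "governance": "high", "rating": "poor"},
    {"environment": "low", "governance": "low", "rating": "poor"},
]

gains = {a: information_gain(rows, a, "rating") for a in ("environment", "governance")}
root_attribute = max(gains, key=gains.get)  # attribute ID3 would split on first
```

Here `environment` fully determines the rating, so the resulting one-level tree reads as if-then rules such as "IF environment = high THEN rating = good".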