• Title/Summary/Keyword: 의사결정트리 (decision tree)

Generating baseball articles using decision tree (의사결정 트리를 이용한 야구기사 작성 기법)

  • Kim, Ju-bong;Go, Hyun-yung;Yong, Sang-Hyuk;Han, Youn-Hee
    • Proceedings of the Korea Information Processing Society Conference / 2016.10a / pp.628-631 / 2016
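  • Starting from the question "Can articles about baseball game results be written automatically?", this paper applies a decision tree technique to baseball game data to automatically extract the context of a game result and the elements needed to write an article. The results show that an objective baseball article can be produced from the data of a given game.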

Recommender System using Context Information and Spatial Data Mining (상황정보와 공간 데이터 마이닝 기법을 이용한 추천 시스템)

  • Lee Bae-Hee;Jo Geun-Sik
    • Proceedings of the Korean Information Science Society Conference / 2005.11b / pp.667-669 / 2005
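  • As society moves toward the ubiquitous computing era, recommender systems have become an indispensable element. Most recommender systems consider demographic factors such as the user's gender, age, and occupation, but such systems have limitations. They do not reflect the surrounding context, such as the user's mood, the weather, and the temperature, and the reliability of the training data is also a problem. To address these problems, this paper proposes an improved recommender system that uses context information and spatial data mining. For more accurate recommendations, the proposed system first takes into account context information such as the weather, the temperature, and the user's mood. Second, it improves the reliability of the training data by measuring user similarity. Third, it raises recommendation accuracy by using a decision tree. Experiments show that the proposed system outperforms both an existing system that considers only demographic factors and a system that uses only a decision tree.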

Multiple Pedestrian Tracking based on Decision Trees (의사결정 트리 기반의 다중 보행자 추적)

  • Yu, Hye-Yeon;Kim, Young-Nam;Kim, Moon-Hyun
    • Proceedings of the Korea Information Processing Society Conference / 2015.10a / pp.1302-1304 / 2015
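  • In computer vision, generating the trajectories of multiple pedestrians remains a difficult problem. Pedestrian silhouettes extracted from the foreground are unclear because of shadows and brightness, and pedestrians move in different directions and interact with one another. This makes it difficult to identify pedestrians and generate their trajectories. We use a decision tree to resolve merge and split situations of pedestrian regions into individually separated pedestrians. For the detected individual pedestrians, a point correspondence algorithm generates each pedestrian's trajectory. We introduce a new heuristic point correspondence algorithm based on a modified $A^*$ search algorithm. Our experiments were implemented and run on the PETS2010 data set.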

P2P Traffic Classification using Advanced Heuristic Rules and Analysis of Decision Tree Algorithms (개선된 휴리스틱 규칙 및 의사 결정 트리 분석을 이용한 P2P 트래픽 분류 기법)

  • Ye, Wujian;Cho, Kyungsan
    • Journal of the Korea Society of Computer and Information / v.19 no.3 / pp.45-54 / 2014
  • In this paper, an improved two-step P2P traffic classification scheme is proposed to overcome the limitations of existing methods. The first step is a signature-based classifier at the packet level. The second step consists of pattern heuristic rules and a statistics-based classifier at the flow level. The pattern heuristic rules improve accuracy and reduce the amount of traffic that the statistics-based classifier must handle. Based on an analysis of different decision tree algorithms, the statistics-based classifier is implemented with REPTree. In addition, an ensemble algorithm is used to improve the performance of the statistics-based classifier. Verification with real datasets shows that our hybrid scheme provides higher accuracy and lower overhead than other existing schemes.
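
The two-step cascade described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the flow dictionary layout, the `mixed_tcp_udp` heuristic flag, and the `toy_stat_classifier` function (standing in for the REPTree-based classifier) are assumptions made for the example. The only real-world detail is the BitTorrent handshake prefix used as a sample signature.

```python
# Sample packet-level signature: a BitTorrent handshake starts with
# byte 19 followed by the string "BitTorrent protocol".
BT_HANDSHAKE = b"\x13BitTorrent protocol"


def classify_flow(flow, stat_classifier):
    """Return (label, stage) for one flow.

    `flow` is a hypothetical dict with keys "payloads" (list of bytes),
    "mixed_tcp_udp" (bool) and "stats" (list of numeric features).
    """
    # Step 1: signature-based classifier at the packet level.
    if any(p.startswith(BT_HANDSHAKE) for p in flow["payloads"]):
        return "p2p", "signature"
    # Step 2a: flow-level pattern heuristic (hypothetical rule).
    if flow["mixed_tcp_udp"]:
        return "p2p", "heuristic"
    # Step 2b: only the remaining flows reach the statistics-based classifier.
    return stat_classifier(flow["stats"]), "statistical"


def toy_stat_classifier(stats):
    # Stand-in for a trained decision tree; the threshold is made up.
    return "p2p" if stats[0] > 100 else "non-p2p"
```

Ordering the cheap checks first means the statistical model only sees flows that neither the signature nor the heuristic could decide, which is the overhead reduction the abstract refers to.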

Materialized View Selection Scheme for enhancing RDF Query Performance (RDF 질의 처리 성능 향상을 위한 실체 뷰 선택 기법)

  • Park, Jaeyeol;Yoon, Sangwon;Choi, Kitae;Lim, Jongtae;Lee, Byoungyup;Shin, Jaeryong;Bok, Kyoungsoo;Yoo, Jaesoo
    • The Journal of the Korea Contents Association / v.15 no.12 / pp.24-34 / 2015
  • With the development of the semantic web, a large amount of the data produced today is in RDF format. RDF data is represented as triples. An RDF database consisting of triples incurs a high cost for join query processing. A materialized view is a known scheme for reducing query processing cost: it physically stores the results or intermediate results of query processing in a storage area, so that queries can be answered from the view without accessing the database. In this paper, we propose a materialized view selection scheme that uses a decision tree to address this problem. The decision tree considers the size and maintenance costs of a materialized view as well as the gain in query response time. Performance evaluation shows that the proposed scheme increases the number of materialized views that fit in limited storage space and decreases the update rate of the materialized views.

Generation of Efficient Fuzzy Classification Rules for Intrusion Detection (침입 탐지를 위한 효율적인 퍼지 분류 규칙 생성)

  • Kim, Sung-Eun;Khil, A-Ra;Kim, Myung-Won
    • Journal of KIISE:Software and Applications / v.34 no.6 / pp.519-529 / 2007
  • In this paper, we investigate the use of fuzzy rules for efficient intrusion detection. We use an evolutionary algorithm to optimize the set of fuzzy rules for intrusion detection by constructing fuzzy decision trees. For efficient execution of the evolutionary algorithm, we use supervised clustering to generate an initial set of membership functions for the fuzzy rules. In our method, both the performance and the complexity of the fuzzy rules (or fuzzy decision trees) are taken into account in fitness evaluation. We also use evaluation with data partitioning, membership degree caching, and zero-pruning to reduce the time needed to construct and evaluate fuzzy decision trees. For performance evaluation, we tested our method on the KDD'99 Cup intrusion detection data and confirmed that it outperformed existing methods. Compared with the KDD'99 Cup winner, accuracy increased by 1.54% while cost was reduced by 20.8%.

A Study on the Node Split in Decision Tree with Multivariate Target Variables (다변량 목표변수를 갖는 의사결정나무의 노드분리에 관한 연구)

  • Kim, Seong-Jun
    • Journal of the Korean Institute of Intelligent Systems / v.13 no.4 / pp.386-390 / 2003
  • Data mining is the process of discovering patterns useful for decision making in large amounts of data. It has recently received much attention in a wide range of business and engineering fields. Classifying a group into subgroups is one of the most important subjects in data mining. Tree-based methods, known as decision trees, provide an efficient way of finding a classification model. The primary concern in tree learning is to minimize node impurity, which is evaluated using a target variable in the data set. However, there are situations where multiple target variables should be taken into account, such as manufacturing process monitoring, marketing science, and clinical and health analysis. The purpose of this article is to present some methods for measuring node impurity that are applicable to data sets with multivariate target variables. For illustration, a numerical example is given with discussion.
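
To make the idea of node impurity over several target variables concrete, here is a small sketch that averages the Gini impurity of each target and scores a candidate split by the resulting impurity decrease. Averaging per-target Gini values is only one simple possibility chosen for illustration; the abstract does not say which measures the article proposes, and the function names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a single categorical target."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def multivariate_gini(rows):
    """Average Gini impurity over all target variables.

    `rows` is a list of tuples, one label per target variable.
    """
    targets = list(zip(*rows))  # regroup the labels by target variable
    return sum(gini(t) for t in targets) / len(targets)


def split_gain(parent, left, right):
    """Impurity decrease achieved by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted = (len(left) / n) * multivariate_gini(left) \
        + (len(right) / n) * multivariate_gini(right)
    return multivariate_gini(parent) - weighted
```

For a node whose rows are `("a", "x")` twice and `("b", "y")` twice, each target has Gini 0.5, and a split that separates the two kinds of rows yields a gain of 0.5.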

EEG Classification for depression patients using decision tree and possibilistic support vector machines (뇌파의 의사 결정 트리 분석과 가능성 기반 서포트 벡터 머신 분석을 통한 우울증 환자의 분류)

  • Sim, Woo-Hyeon;Lee, Gi-Yeong;Chae, Jeong-Ho;Jeong, Jae-Seung;Lee, Do-Heon
    • Bioinformatics and Biosystems / v.1 no.2 / pp.134-138 / 2006
  • Depression is the most common and widespread mood disorder. About 20% of the population might suffer a major, incapacitating episode of depression during their lifetime. This disorder can be classified into two types: major depressive disorder and bipolar disorder. Since pharmaceutical treatments differ according to the type of depressive disorder, correct and fast classification is critical for depression patients. Yet classical statistical methods, such as the Minnesota Multiphasic Personality Inventory (MMPI), are difficult to apply to depression patients because the patients have difficulty concentrating. We used an electroencephalogram (EEG) analysis method for classification of depression. We extracted the nonlinearity of information flows between channels and estimated the approximate entropy (ApEn) of the EEG at each channel. Using these attributes, we applied two types of data mining classification methods: a decision tree and possibilistic support vector machines (PSVM). We found that the decision tree showed 85.19% accuracy and PSVM exhibited 77.78% accuracy in classifying 30 patients with major depressive disorder and 24 patients with bipolar disorder.
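
Approximate entropy, one of the EEG attributes mentioned above, can be computed as in the sketch below, which follows the standard ApEn(m, r) definition with self-matches counted. The default values `m=2` and `r=0.2` and the use of `r` as an absolute tolerance are assumptions for the example; the abstract does not state the parameters the authors used, and in practice `r` is often scaled by the standard deviation of the signal.

```python
import math


def approximate_entropy(series, m=2, r=0.2):
    """Approximate entropy ApEn(m, r) of a 1-D numeric sequence."""
    n = len(series)

    def phi(k):
        # All overlapping template vectors of length k.
        templates = [series[i:i + k] for i in range(n - k + 1)]
        total = 0.0
        for a in templates:
            # Count templates within tolerance r of `a` under the
            # Chebyshev (max) distance; the self-match is included.
            matches = sum(
                1 for b in templates
                if max(abs(x - y) for x, y in zip(a, b)) <= r
            )
            total += math.log(matches / len(templates))
        return total / len(templates)

    return phi(m) - phi(m + 1)
```

A constant signal gives an ApEn of 0, a strictly periodic signal gives a value near 0, and an irregular signal gives a clearly larger value, which is why the measure is used as an indicator of signal regularity.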

A Study on Management of Student Retention Rate Using Association Rule Mining (연관관계 규칙을 이용한 학생 유지율 관리 방안 연구)

  • Kim, Jong-Man;Lee, Dong-Cheol
    • Journal of Korea Society of Industrial Information Systems / v.23 no.6 / pp.67-77 / 2018
  • Currently, the decline in the school-age population is causing many problems. Moreover, Korea has the largest number of universities relative to its population, and its university enrollment rate is the highest in the world. As a result, the minimum student retention rate required for each university's survival is becoming increasingly important. This study examines the effects of the reduction in the number of graduates and of a social climate that prioritizes employment. It also seeks a basic direction for managing the student retention rate from admission to graduation and determines the optimal input variables. Based on these input variables, we perform association analysis using the Apriori algorithm to collect the training data best suited to retention rate management and to build the base data for developing the most efficient Deep Learning module. The Deep Learning module achieved an accuracy of 75%, and graduation was also evaluated using decision trees. In the decision tree, the factors that determined graduation were graduation from a general high school, being female, and residence in an urban area; students with these attributes had a high probability of graduating. As a result, the developed Deep Learning module, rather than the decision tree, was identified as the model that evaluates student graduation more efficiently.
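
The association analysis step mentioned in this abstract relies on the Apriori algorithm. A compact sketch of Apriori's level-wise search for frequent itemsets, plus rule confidence, is given below. The function names are hypothetical, and the sketch works on generic item labels; the abstract does not describe the study's actual variables or thresholds.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that meet the support threshold.
        level = {c: s for c in current if (s := support(c)) >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]
```

With the transactions `{A, B, C}`, `{A, B, C}`, `{A, D}`, `{B, C}` and a minimum support of 0.5, item `D` is dropped at the first level, and the rule `B -> C` has confidence 1.0 because every transaction containing `B` also contains `C`.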

Development of a model to analyze the relationship between smart pig-farm environmental data and daily weight increase based on decision tree (의사결정트리를 이용한 돈사 환경데이터와 일당증체 간의 연관성 분석 모델 개발)

  • Han, KangHwi;Lee, Woongsup;Sung, Kil-Young
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.12 / pp.2348-2354 / 2016
  • Recently, IoT (Internet of Things) technology has been widely used in agriculture, enabling environmental and biometric data to be collected into databases. The availability of big data on agriculture has led to an increase in machine-learning-based analysis. Such analysis makes it possible to forecast agricultural production and livestock diseases, supporting efficient decision making in smart farm management. Herein, we use the environmental and biometric data of a smart pig farm to derive an accurate model of the relationship between environmental information and the daily weight increase of swine, and we verify the accuracy of the derived model. To this end, we applied the M5P tree algorithm, which reveals that wind speed is the major factor affecting the daily weight increase of swine.
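
M5P belongs to the M5 family of model trees, which choose splits on a numeric target by maximizing the standard deviation reduction (SDR). The sketch below shows only that split criterion for a single numeric feature. It is not a full M5P implementation (there are no leaf regression models, pruning, or smoothing), and the function names are hypothetical.

```python
import statistics


def sdr(parent, left, right):
    """Standard deviation reduction of splitting `parent` into `left` and `right`."""
    n = len(parent)
    return (
        statistics.pstdev(parent)
        - (len(left) / n) * statistics.pstdev(left)
        - (len(right) / n) * statistics.pstdev(right)
    )


def best_threshold(xs, ys):
    """Return (threshold, sdr) of the best binary split of feature xs for target ys."""
    pairs = sorted(zip(xs, ys))
    best = (None, float("-inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        gain = sdr(ys, left, right)
        if gain > best[1]:
            best = (threshold, gain)
    return best
```

Running `best_threshold` once per feature and comparing the returned SDR values indicates which feature gives the strongest first split, which is one way a tree can point to a dominant factor such as the wind speed reported in the abstract.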