• Title/Abstract/Keyword: Tree algorithm


Text Document Categorization using FP-Tree

  • Park, Yong-Ki;Kim, Hwang-Soo
    • Journal of KIISE: Software and Applications / v.34 no.11 / pp.984-990 / 2007
  • As the number of electronic documents grows explosively, automatic text categorization methods are needed to identify those of interest. Most methods use machine learning techniques based on a word set. This paper introduces a new method called FPTC (FP-Tree based Text Classifier). The FP-Tree is a data structure used in data mining. We present a method of storing text sentence patterns in the FP-Tree structure and classifying text using those patterns. In the experiments, we combine our algorithm with a Mutual Information and Entropy approach to improve performance. We also present an analysis of the algorithm via an ordinary differential categorization method.
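
As a rough illustration of how sentence patterns might be stored in an FP-Tree, the Python sketch below inserts each sentence's words, ordered by descending corpus frequency, into a prefix tree with counts. The names `FPNode` and `build_fp_tree`, the whitespace tokenization, and the `min_support` parameter are assumptions made for illustration; the abstract does not describe how FPTC itself builds or queries the tree.

```python
from collections import Counter


class FPNode:
    """One node of the prefix tree: a word, a count, and child nodes."""

    def __init__(self, word=None):
        self.word = word
        self.count = 0
        self.children = {}


def build_fp_tree(sentences, min_support=1):
    """Insert each sentence as a frequency-ordered word path (illustrative only)."""
    # Count in how many sentences each word occurs.
    freq = Counter(w for s in sentences for w in set(s.split()))
    root = FPNode()
    for s in sentences:
        words = [w for w in set(s.split()) if freq[w] >= min_support]
        # Most frequent words first; ties broken alphabetically for determinism.
        words.sort(key=lambda w: (-freq[w], w))
        node = root
        for w in words:
            node = node.children.setdefault(w, FPNode(w))
            node.count += 1
    return root


tree = build_fp_tree(["tree search algorithm", "tree algorithm", "graph search"])
print(sorted(tree.children))  # words that start a stored path
```

Sentences that share frequent words share a path prefix, which is what keeps the structure compact.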

A Pseudopolynomial-time Algorithm for Solving a Capacitated Subtree of a Tree Problem in a Telecommunication System

  • Cho, Geon
    • Journal of Korean Institute of Industrial Engineers / v.22 no.3 / pp.485-498 / 1996
  • For a tree $T$ rooted at a concentrator location in a telecommunication system, we assume that the capacity $H$ of the concentrator is given and that a profit $c_v$ and a demand $d_v$ are given on each node $v$ of $T$. The capacitated subtree of a tree problem (CSTP) is then to find a subtree of $T$ rooted at the concentrator location that maximizes the total profit, that is, the sum of profits over the subtree, subject to the constraint that the sum of demands over the subtree does not exceed $H$. In this paper, we develop a pseudopolynomial-time algorithm for CSTP, the depth-first dynamic programming algorithm. We show that a CSTP can be solved by our algorithm in $\Theta(nH)$ time, where $n$ is the number of nodes in $T$. Our algorithm has its own advantages and shows outstanding computational performance compared with other approaches such as CPLEX, a general integer programming solver, when it is incorporated into the solution of a Local Access Telecommunication Network design problem. We report computational results for the depth-first dynamic programming algorithm and compare them with those for CPLEX. The comparison shows that our algorithm is competitive with CPLEX in most cases.
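
The CSTP statement above can be illustrated with a small Python sketch of a dynamic program over a depth-first (preorder) node ordering, which runs in time proportional to $nH$. This is a generic tree-knapsack formulation written for illustration, assuming non-negative integer demands. It is not claimed to be the authors' exact algorithm, and the function name and data layout are hypothetical.

```python
def cstp_max_profit(children, profit, demand, root, capacity):
    """Max total profit of a subtree containing `root` with total demand <= capacity.

    Returns None when even the root alone does not fit.
    """
    # Iterative DFS: record the preorder and, for each node, the preorder
    # index just past the end of its subtree.
    order, end = [], {}
    stack = [(root, False)]
    while stack:
        v, done = stack.pop()
        if done:
            end[v] = len(order)
            continue
        order.append(v)
        stack.append((v, True))
        for c in reversed(children.get(v, [])):
            stack.append((c, False))

    n = len(order)
    neg_inf = float("-inf")
    # best[i][h]: best profit obtainable from preorder positions i..n-1 with
    # h units of capacity left, given that the parent of order[i] is selected.
    best = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        v = order[i]
        for h in range(capacity + 1):
            skip = best[end[v]][h]  # drop v together with its whole subtree
            take = profit[v] + best[i + 1][h - demand[v]] if demand[v] <= h else neg_inf
            best[i][h] = max(skip, take)

    if demand[root] > capacity:
        return None
    # The root (concentrator) must always be part of the subtree.
    return profit[root] + best[1][capacity - demand[root]]


children = {0: [1, 2], 1: [3]}
profit = {0: 1, 1: 2, 2: 5, 3: 10}
demand = {0: 1, 1: 2, 2: 3, 3: 2}
print(cstp_max_profit(children, profit, demand, root=0, capacity=5))  # 13
```

Skipping a node jumps past its entire subtree in the preorder, so every selection the table considers is a connected subtree containing the root.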


Classification Accuracy Improvement for Decision Tree

  • Rezene, Mehari Marta;Park, Sanghyun
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.787-790 / 2017
  • Data quality is a central issue in classification problems; in general, noisy instances in the training dataset prevent robust classification performance. Such instances may cause the generated decision tree to over-fit, and its accuracy may decrease. Decision trees are useful, efficient, and commonly used for solving various real-world classification problems in data mining. In this paper, we introduce a preprocessing technique to improve the classification accuracy of the C4.5 decision tree algorithm. In the proposed preprocessing method, we apply the naive Bayes classifier to remove noisy instances from the training dataset. We applied the proposed method to a real e-commerce sales dataset and compared its performance against the existing C4.5 decision tree classifier. In the experiments, the proposed method improved the classification accuracy by 8.5% and 14.32% when evaluated on the training dataset and with 10-fold cross-validation, respectively.
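
The preprocessing idea can be sketched in a few lines of Python. The sketch assumes that a "noisy" instance is one whose label disagrees with the prediction of a naive Bayes model trained on the same data; the abstract does not state the exact removal criterion. The categorical model with Laplace smoothing and all function names are illustrative choices.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Collect the counts needed by a categorical naive Bayes classifier."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    values = defaultdict(set)            # feature index -> distinct values seen
    for row, y in zip(rows, labels):
        for j, v in enumerate(row):
            value_counts[(j, y)][v] += 1
            values[j].add(v)
    return class_counts, value_counts, values


def predict(model, row):
    """Return the label with the highest (log) posterior for one row."""
    class_counts, value_counts, values = model
    total = sum(class_counts.values())

    def log_posterior(y):
        score = math.log(class_counts[y] / total)
        for j, v in enumerate(row):
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log((value_counts[(j, y)][v] + 1)
                              / (class_counts[y] + len(values[j])))
        return score

    return max(sorted(class_counts), key=log_posterior)


def remove_noisy_instances(rows, labels):
    """Keep only instances whose label matches the naive Bayes prediction."""
    model = train_naive_bayes(rows, labels)
    kept = [(r, y) for r, y in zip(rows, labels) if predict(model, r) == y]
    return [r for r, _ in kept], [y for _, y in kept]


rows = [("a", "x")] * 4 + [("b", "y")] * 4 + [("a", "x")]
labels = ["pos"] * 4 + ["neg"] * 4 + ["neg"]  # the last instance is mislabeled
clean_rows, clean_labels = remove_noisy_instances(rows, labels)
print(len(clean_rows))  # 8
```

The filtered rows and labels would then be passed to a decision tree learner such as C4.5, which is not reproduced here.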

Multi-Interval Discretization of Continuous-Valued Attributes for Constructing Incremental Decision Tree

  • Baek, Jun-Geol;Kim, Chang-Ouk;Kim, Sung-Shick
    • Journal of Korean Institute of Industrial Engineers / v.27 no.4 / pp.394-405 / 2001
  • Since most real-world application data involve continuous-valued attributes, properly handling discretization when constructing a decision tree is an important problem. A continuous-valued attribute is typically discretized during decision tree generation by recursively partitioning its range into two intervals. In this paper, by removing the restriction to binary discretization, we present a hybrid multi-interval discretization algorithm that partitions the range of a continuous-valued attribute into multiple intervals. An experiment using a semiconductor etching machine verified that our discretization algorithm constructs a more efficient incremental decision tree than previously proposed discretization algorithms.
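
For reference, the conventional recursive binary discretization mentioned above can be sketched as follows. The entropy-based choice of cut point, the depth limit, and the function names are assumptions for illustration; the hybrid multi-interval algorithm proposed in the paper is not described in the abstract and is not reproduced here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def best_binary_cut(values, labels):
    """Midpoint cut minimizing the weighted class entropy of the two sides."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_score, best_cut = float("inf"), None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between equal attribute values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = len(left) / n * entropy(left) + len(right) / n * entropy(right)
        if score < best_score:
            best_score, best_cut = score, (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_cut


def recursive_cuts(values, labels, depth=2):
    """Apply the binary split recursively until sides are pure or depth runs out."""
    if depth == 0 or len(set(labels)) <= 1:
        return []
    cut = best_binary_cut(values, labels)
    if cut is None:
        return []
    cuts = [cut]
    for side in ([(v, y) for v, y in zip(values, labels) if v <= cut],
                 [(v, y) for v, y in zip(values, labels) if v > cut]):
        vs, ys = zip(*side)
        cuts += recursive_cuts(vs, ys, depth - 1)
    return sorted(cuts)


print(recursive_cuts([1, 2, 3, 4, 5, 6], list("aabbaa")))  # [2.5, 4.5]
```

In this toy input the class changes twice along the attribute, so two binary splits are needed to isolate the middle interval; a multi-interval method aims to find such intervals without the binary restriction.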


A Spanning Tree-based Representation and Its Application to the MAX CUT Problem

  • Hyun, Soohwan;Kim, Yong-Hyuk;Seo, Kisung
    • Journal of Institute of Control, Robotics and Systems / v.18 no.12 / pp.1096-1100 / 2012
  • Most previous genetic algorithms for solving graph problems have used a vertex-based encoding. We propose a new genetic algorithm with an edge-based encoding that uses a spanning tree. In contrast to general edge-based encodings, a spanning tree-based encoding represents only feasible partitions. As a target problem, we adopted the MAX CUT problem, a well-known representative NP-hard problem, and examined the performance of the proposed genetic algorithm. Experiments on benchmark graphs were carried out and compared with the vertex-based encoding. The spanning tree-based encoding showed performance improvements on sparse graphs.
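
One plausible reading of a spanning tree-based encoding is sketched below in Python: each chromosome bit belongs to one spanning tree edge and states whether that edge's endpoints lie on opposite sides of the cut. Under this assumption every bit string decodes to a valid two-way partition. The decoding rule, the function names, and the choice of vertex 0 as the start vertex are illustrative and are not taken from the paper.

```python
from collections import deque


def decode_partition(n, tree_edges, bits):
    """Turn one bit per spanning tree edge into a side (0 or 1) for each vertex.

    bits[i] == 1 means the endpoints of tree_edges[i] are on opposite sides.
    Vertices are assumed to be numbered 0..n-1.
    """
    adj = {v: [] for v in range(n)}
    for (u, v), b in zip(tree_edges, bits):
        adj[u].append((v, b))
        adj[v].append((u, b))
    side = {0: 0}  # fix vertex 0 on side 0 and propagate along the tree
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v, b in adj[u]:
            if v not in side:
                side[v] = side[u] ^ b
                queue.append(v)
    return [side[v] for v in range(n)]


def cut_size(graph_edges, side):
    """Number of graph edges whose endpoints are on different sides."""
    return sum(1 for u, v in graph_edges if side[u] != side[v])


graph = [(0, 1), (1, 2), (2, 3), (3, 0)]  # a 4-cycle
tree = [(0, 1), (1, 2), (2, 3)]           # one of its spanning trees
side = decode_partition(4, tree, [1, 1, 1])
print(side, cut_size(graph, side))        # [0, 1, 0, 1] 4
```

A genetic algorithm would use `cut_size` as the fitness of a bit string and apply crossover and mutation to the bits.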

Restoration of Distribution System with Distributed Energy Resources using Level-based Candidate Search

  • Kim, Dong-Eok;Cho, Namhun
    • Journal of Electrical Engineering and Technology / v.13 no.2 / pp.637-647 / 2018
  • In this paper, we propose a method that uses a level-based tree search algorithm to find candidate network reconfigurations for restoring a distribution system with distributed energy resources. First, we introduce a method of expressing a distribution network with distributed energy resources for fault restoration and of representing the network as a simplified graph. Second, we explain the tree search algorithm and introduce a method of performing the tree search on the basis of search levels, which we call a level-based tree search. Then, we propose a candidate search method for fault restoration and explain it using an example. Finally, we verify the proposed method using computer simulations.

An Expert System for Fault Restoration using Tree Search Strategies in Distribution System

  • 김세호;최병윤;문영현
    • The Transactions of the Korean Institute of Electrical Engineers / v.43 no.3 / pp.363-371 / 1994
  • This thesis investigates an expert system (ES) that proposes fault restoration plans by utilizing tree search strategies. To cope with the extensive amount of data and the frequent breaker switching operations in distribution systems, the database of the system configuration is constructed using binary trees. This remarkably enhances the efficiency of the search algorithm and makes the proposed ES easily adaptable to system changes caused by switching operations. The rule base is established to fully utilize the merits of the tree-structured database. The inference strategy is based mainly on the best-first search algorithm to increase computational efficiency. The proposed ES has been implemented to deal efficiently with large distribution systems, remarkably reducing the computational burden compared with conventional ESs.


TFP tree-based Incremental Emerging Patterns Mining for Analysis of Safe and Non-safe Power Load Lines

  • Lee, Jong-Bum;Piao, Ming Hao;Ryu, Keun-Ho
    • Spatial Information Research / v.19 no.2 / pp.71-76 / 2011
  • In this paper, to use emerging patterns to define and analyze the significant differences between safe and non-safe power load lines and to identify which lines are potentially non-safe, we propose an incremental TFP-tree algorithm for mining emerging patterns that searches efficiently within limited memory. In particular, pruning pre-infrequent patterns and using two different minimum supports enable the algorithm to mine most emerging patterns and to handle mining from large, incrementally growing data sets such as power consumption data.
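
To make the notion of an emerging pattern concrete, the Python sketch below computes the growth rate of a pattern's support from one class of records to another, which is the usual way emerging patterns are defined. The toy records, the item names, and the function names are hypothetical; the TFP-tree, the incremental update, and the two minimum supports from the paper are not reproduced.

```python
def support(pattern, transactions):
    """Fraction of transactions that contain every item of the pattern."""
    pattern = set(pattern)
    return sum(1 for t in transactions if pattern <= set(t)) / len(transactions)


def growth_rate(pattern, background, target):
    """Ratio of the pattern's support in `target` to its support in `background`."""
    s_bg = support(pattern, background)
    s_tg = support(pattern, target)
    if s_bg == 0:
        return float("inf") if s_tg > 0 else 0.0
    return s_tg / s_bg


def is_emerging(pattern, background, target, min_growth):
    """A pattern is emerging when its growth rate reaches the threshold."""
    return growth_rate(pattern, background, target) >= min_growth


# Hypothetical discretized load records for the two classes of lines.
safe = [{"low", "day"}, {"low", "night"}, {"high", "day"}, {"low", "day"}]
non_safe = [{"high", "night"}, {"high", "day"}, {"high", "night"}, {"low", "night"}]
print(growth_rate({"high"}, safe, non_safe))  # 3.0
```

Patterns with a high growth rate toward the non-safe class are the ones that would flag a line as potentially non-safe.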

Hybrid Tag Anti-Collision Algorithms in RFID System

  • Shin, Jae-Dong;Yeo, Sang-Soo;Cho, Jung-Sik
    • The Journal of Korean Institute of Communications and Information Sciences / v.32 no.4A / pp.358-364 / 2007
  • RFID (Radio Frequency Identification) is a contactless automatic identification technology that uses radio frequency. For RFID technology to spread widely, the problem of multiple tag identification, in which a reader identifies multiple tags in a very short time, has to be solved. Many anti-collision algorithms have been developed to solve this problem, and they can be broadly divided into ALOHA-based and tree-based algorithms. In this paper, two new anti-collision algorithms combining the characteristics of these two categories are presented. The performance of the two algorithms is evaluated in comparison with that of typical anti-collision algorithms: 18000-6 Type A, Type B, Type C, and the query tree algorithm.

Enhanced Anti-Collision Protocol for Identification Systems: Binary Slotted Query Tree Algorithm

  • Le, Nam-Tuan;Choi, Sun-Woong;Jang, Yeong-Min
    • The Journal of Korean Institute of Communications and Information Sciences / v.36 no.9B / pp.1092-1097 / 2011
  • An anti-collision protocol that minimizes the collision probability and the identification time is the most important factor in all identification technologies. This paper focuses on methods to improve the efficiency of tag processing in identification systems. Our scheme, the Binary Slotted Query Tree (BSQT) algorithm, is a memoryless protocol that identifies an object's ID more efficiently by removing the unnecessary prefixes of the traditional Query Tree (QT) algorithm. With the enhanced QT algorithm, the reader broadcasts 1 bit and waits for the responses from the tags; the difference in this scheme is that the reader listens in 2 slots (slot 1 for tags whose bit is 0 and slot 2 for tags whose bit is 1). Based on the responses, the reader decides the next bit to broadcast. This helps the reader avoid broadcasting unnecessary bits to which no tag would respond. Numerical and simulation results show that the proposed scheme decreases the tag identification time by reducing the overall number of requests.
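
The difference between the basic query tree and a two-slot variant can be illustrated with a small Python simulation. It assumes distinct binary tag IDs of equal length and an ideal channel, and it models the two-slot idea as tags answering in the slot that matches their next ID bit after the queried prefix. This is a simplified interpretation of the abstract, not the BSQT protocol as specified in the paper; the function names are hypothetical.

```python
from collections import deque


def query_tree(tags):
    """Basic QT: one query per prefix; a collision extends the prefix by 0 and 1."""
    queue, identified, queries = deque([""]), [], 0
    while queue:
        prefix = queue.popleft()
        queries += 1
        responders = [t for t in tags if t.startswith(prefix)]
        if len(responders) == 1:
            identified.append(responders[0])
        elif len(responders) > 1:
            queue.extend([prefix + "0", prefix + "1"])
    return sorted(identified), queries


def two_slot_query_tree(tags):
    """Two-slot variant: each query is followed by one slot per next bit value."""
    queue, identified, queries = deque([""]), [], 0
    while queue:
        prefix = queue.popleft()
        queries += 1
        for bit in "01":
            slot = [t for t in tags if t.startswith(prefix + bit)]
            if len(slot) == 1:
                identified.append(slot[0])
            elif len(slot) > 1:
                queue.append(prefix + bit)  # only colliding branches are queried again
    return sorted(identified), queries


tags = ["000", "001", "110"]
print(query_tree(tags)[1], two_slot_query_tree(tags)[1])  # 7 3
```

In the two-slot variant an empty slot costs no further query, which is where the reduction in requests comes from in this simplified model.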