• Title/Summary/Keyword: Prefix Partition


A Partition Mining Method of Sequential Patterns using Suffix Checking (서픽스 검사를 이용한 단계적 순차패턴 분할 탐사 방법)

  • 허용도;조동영;박두순
    • Journal of Korea Multimedia Society
    • /
    • v.5 no.5
    • /
    • pp.590-598
    • /
    • 2002
  • For efficient sequential pattern mining, we need to reduce both the cost of generating candidate patterns and the search space for the generated ones. Although Apriori-like methods such as GSP[8] are simple, they generate many candidate patterns and repeatedly scan a large database. PrefixSpan[2], proposed as an alternative to GSP, constructs prefix-projected databases that are stepwise partitioned during the mining process. This reduces the search space for estimating the support of candidate patterns, but the cost of constructing the projected databases remains high. To address these problems, we propose SuffixSpan (Suffix-checked Sequential Pattern mining), a new sequential pattern mining method. It generates small candidate pattern sets at low cost using the partition and suffix properties, and uses only 1-prefix projected databases as the search space to reduce the cost of estimating candidate support.
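
The abstract above only sketches prefix projection at a high level; the minimal Python example below illustrates the basic 1-prefix projection step that PrefixSpan-style mining (and, by extension, SuffixSpan's use of 1-prefix projected databases) relies on. It handles sequences of single items only, omits the suffix-checking optimization, and all function names are illustrative rather than taken from the paper.

```python
from collections import defaultdict

def project(db, item):
    """1-prefix projected database for `item`: for each sequence containing
    `item`, keep the suffix after its first occurrence."""
    projected = []
    for seq in db:
        if item in seq:
            pos = seq.index(item)
            projected.append(seq[pos + 1:])
    return projected

def mine(db, min_support, prefix=()):
    """Minimal PrefixSpan-style mining over sequences of single items."""
    # Count per-sequence support of each item in the (projected) database.
    support = defaultdict(int)
    for seq in db:
        for item in set(seq):
            support[item] += 1
    patterns = []
    for item, count in support.items():
        if count >= min_support:
            pattern = prefix + (item,)
            patterns.append((pattern, count))
            # Recurse on the projected database of the extended prefix.
            patterns.extend(mine(project(db, item), min_support, pattern))
    return patterns

if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "c", "b"], ["a", "b", "d"], ["b", "c"]]
    for pattern, count in sorted(mine(db, min_support=2)):
        print(pattern, count)
```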


Prefix Cuttings for Packet Classification with Fast Updates

  • Han, Weitao;Yi, Peng;Tian, Le
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.8 no.4
    • /
    • pp.1442-1462
    • /
    • 2014
  • Packet classification is a key Internet technology that lets routers classify arriving packets into different flows according to predefined rulesets. Previous packet classification algorithms have mainly focused on search speed and memory usage while overlooking update performance. In this paper, we propose PreCuts, which drastically improves update speed. Based on the characteristics of the IP fields, we implement three heuristics to build a 3-layer decision tree. In the first layer, we group the rules that share the same highest byte of the source and destination IP addresses. In the second layer, we cluster the rules that share the same IP prefix length. Finally, we use an information-entropy-based bit partition heuristic to choose specific bits of the IP prefix and split the ruleset into subsets. The heuristics of PreCuts do not introduce rule duplication, and incremental updates do not degrade its time and space performance. Experiments with ClassBench show that, compared with BRPS and EffiCuts, the proposed algorithm not only improves time and space performance but also greatly increases update speed.
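
As a rough illustration of the first two layers described above (grouping by the highest byte of the source/destination IP, then clustering by prefix length), here is a hedged Python sketch; the rule representation and helper names are assumptions, and the third-layer entropy-based bit partition is omitted.

```python
from collections import defaultdict

# Illustrative rule format only: each rule carries source/destination IPv4
# prefixes as ("a.b.c.d", prefix_length) pairs.
RULES = [
    (("10.0.0.0", 8),    ("192.168.0.0", 16)),
    (("10.1.0.0", 16),   ("192.168.1.0", 24)),
    (("172.16.0.0", 12), ("192.168.0.0", 16)),
]

def highest_byte(prefix):
    """Most significant byte of a dotted-quad prefix."""
    return int(prefix[0].split(".")[0])

def layer1(rules):
    """Layer 1: group rules by the highest byte of source and destination IP."""
    groups = defaultdict(list)
    for src, dst in rules:
        groups[(highest_byte(src), highest_byte(dst))].append((src, dst))
    return groups

def layer2(group):
    """Layer 2: within a group, cluster rules sharing the same prefix lengths."""
    clusters = defaultdict(list)
    for src, dst in group:
        clusters[(src[1], dst[1])].append((src, dst))
    return clusters

if __name__ == "__main__":
    for key, group in layer1(RULES).items():
        print("layer-1 group", key)
        for lengths, cluster in layer2(group).items():
            print("  prefix lengths", lengths, "->", cluster)
```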

A Space Efficient Indexing Technique for DNA Sequences (공간 효율적인 DNA 시퀀스 인덱싱 방안)

  • Song, Hye-Ju;Park, Young-Ho;Loh, Woong-Kee
    • Journal of KIISE:Databases
    • /
    • v.36 no.6
    • /
    • pp.455-465
    • /
    • 2009
  • Suffix trees are widely used for similarity matching of DNA sequences, but because DNA sequences are very large and do not fit in main memory, building them is time-consuming, uses large amounts of disk and memory, and suffers from data skew. In this paper, we present a space-efficient indexing method called SENoM that builds the partitioned sub-trees without a merging phase. The proposed method runs in two phases. In the first phase, we partition the suffixes of the input string by a common variable-length prefix until the number of suffixes in each partition falls below a threshold. In the second phase, we construct a disk-based sub-tree from each suffix set and write it to disk. SENoM thus eliminates the complex merging phase. Experiments show that SENoM reduces disk usage to less than 35% and memory usage to less than 20% of those of the TRELLIS algorithm, and that it supports efficient queries via the prefix tree even when the query sequence is long.
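
A minimal sketch of the two-phase idea described above, assuming a simple in-memory representation: suffixes are partitioned by a common variable-length prefix until each partition is small enough, and each partition is then processed independently (here simply sorted, where SENoM would build and flush a disk-based sub-tree). Names and data layout are illustrative, not the SENoM implementation.

```python
def partition_suffixes(text, threshold):
    """Phase 1: partition the suffixes of `text` (represented by start
    positions) by a common variable-length prefix until each partition
    holds at most `threshold` suffixes."""
    alphabet = sorted(set(text))
    partitions = {}

    def split(prefix, starts):
        if not starts:
            return
        if len(starts) <= threshold:
            partitions[prefix] = starts
            return
        # A suffix equal to the current prefix cannot be extended further;
        # keep it in the partition keyed by the prefix itself.
        exhausted = [s for s in starts if s + len(prefix) >= len(text)]
        if exhausted:
            partitions[prefix] = exhausted
        for ch in alphabet:
            split(prefix + ch,
                  [s for s in starts
                   if text[s + len(prefix):s + len(prefix) + 1] == ch])

    split("", list(range(len(text))))
    return partitions

def build_subtree(text, starts):
    """Phase 2 (stand-in): process one partition independently.
    Here we just sort the suffixes; SENoM builds a disk-based sub-tree."""
    return sorted(starts, key=lambda s: text[s:])

if __name__ == "__main__":
    text = "GATTACA"
    for prefix, starts in sorted(partition_suffixes(text, threshold=2).items()):
        print(repr(prefix), build_subtree(text, starts))
```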

A Parallel IP Address Lookup Scheme for High-Speed Routers (고속의 라우터를 위한 병렬 IP 주소 검색 기법)

  • Park, Jae-hyung;Chung, Min-Young;Kim, Jin-soo;Won, Yong-gwan
    • The KIPS Transactions:PartA
    • /
    • v.11A no.5
    • /
    • pp.333-340
    • /
    • 2004
  • To forward a packet to its destination, routers perform IP address lookup, which determines the next hop from the packet's destination address; lookup is therefore a key issue in designing high-speed routers. This paper proposes a parallel IP lookup scheme that consists of several IP lookup engines and requires no modification of already fabricated indirect IP lookup chipsets. We also propose a simple rule for partitioning the IP prefix entries of an overall forwarding table among the lookup engines, and we evaluate the proposed scheme in terms of the memory required to store lookup information and the number of memory accesses needed to construct the forwarding table. With additional hardware logic, the proposed scheme reduces the required memory size by about 30% and the number of memory accesses by about 80%.
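
The abstract does not state the exact partitioning rule, so the sketch below only illustrates the general idea of splitting a forwarding table among several lookup engines by a few high-order bits of each prefix; the engine count, bit width, and function names are assumptions.

```python
import ipaddress

def engine_index(prefix, num_engines, bits=2):
    """Assign a prefix to one of `num_engines` lookup engines using its
    top `bits` address bits (an illustrative rule, not the paper's)."""
    network = ipaddress.ip_network(prefix)
    top = int(network.network_address) >> (32 - bits)
    return top % num_engines

def partition_fib(prefixes, num_engines):
    """Split an overall forwarding table into per-engine tables."""
    tables = [[] for _ in range(num_engines)]
    for prefix, next_hop in prefixes:
        tables[engine_index(prefix, num_engines)].append((prefix, next_hop))
    return tables

if __name__ == "__main__":
    fib = [("10.0.0.0/8", "A"), ("172.16.0.0/12", "B"),
           ("192.168.0.0/16", "C"), ("224.0.0.0/4", "D")]
    for i, table in enumerate(partition_fib(fib, num_engines=4)):
        print("engine", i, table)
```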

A Partitioned Compressed-Trie for Speeding up IP Address Lookups (IP 주소 검색의 속도 향상을 위한 분할된 압축 트라이 구조)

  • Park, Jae-Hyung;Jang, Ik-Hyeon;Chung, Min-Young;Won, Yong-Gwan
    • The KIPS Transactions:PartC
    • /
    • v.10C no.5
    • /
    • pp.641-646
    • /
    • 2003
  • IP packet transfer rates in the Internet depend on the packet processing speed of routers as well as on the transmission speed of physical links. A router forwards a packet after determining the next hop toward the packet's destination, so IP address lookup is a central design issue for high-performance routers. In this paper, we propose a partitioned compressed trie that speeds up trie-based IP address lookup algorithms by exploiting path compression. In the proposed scheme, the IP prefixes are divided into several compressed tries, and a lookup is performed on only one of the partitioned tries. The compression technique reduces memory access time for IP address lookup, and the memory required to maintain the partitions does not increase.
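
A rough Python sketch of the partitioning idea, under assumptions not stated in the abstract: prefixes are divided into per-partition tables keyed by a few leading address bits, and a lookup searches only the single table selected by the destination address. Path compression itself is omitted, and the longest-prefix match inside a partition is done naively for brevity.

```python
import ipaddress

PARTITION_BITS = 3  # leading address bits used to select a partition

def partition_key(value_32bit):
    """Leading PARTITION_BITS bits of a 32-bit IPv4 value."""
    return value_32bit >> (32 - PARTITION_BITS)

def build_partitions(prefixes):
    """Divide prefixes into per-partition tables; a prefix shorter than
    PARTITION_BITS covers several partitions and is replicated into each."""
    tables = {}
    for prefix, next_hop in prefixes:
        net = ipaddress.ip_network(prefix)
        base = int(net.network_address)
        if net.prefixlen >= PARTITION_BITS:
            keys = [partition_key(base)]
        else:
            span = 1 << (PARTITION_BITS - net.prefixlen)
            start = partition_key(base)
            keys = range(start, start + span)
        for key in keys:
            tables.setdefault(key, []).append((net, next_hop))
    return tables

def lookup(tables, address):
    """Longest-prefix match restricted to the single selected partition."""
    addr = ipaddress.ip_address(address)
    table = tables.get(partition_key(int(addr)), [])
    best = None
    for net, next_hop in table:
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None

if __name__ == "__main__":
    fib = [("0.0.0.0/0", "default"), ("10.0.0.0/8", "A"),
           ("10.1.0.0/16", "B"), ("192.168.0.0/16", "C")]
    tables = build_partitions(fib)
    print(lookup(tables, "10.1.2.3"))     # -> B
    print(lookup(tables, "192.168.5.9"))  # -> C
    print(lookup(tables, "8.8.8.8"))      # -> default
```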