• Title/Summary/Keyword: Trie Index


A Study on Small-sized Index Structure and Fast Retrieval Method Using the RCB Trie (RCB트라이를 이용한 빠른 검색과 소용량 색인 구조에 관한 연구)

  • Jung, Kyu-Cheol
    • Journal of the Korea Society of Computer and Information
    • /
    • v.12 no.4
    • /
    • pp.11-19
    • /
    • 2007
  • This paper proposes the RCB (Reduced Compact Binary) trie to correct the faults of both the CB (Compact Binary) trie and the HCB (Hierarchical Compact Binary) trie. The CB trie was the first attempt at a compact structure, but as the amount of data increased, the stored structure grew with it, and insertion became difficult because of the dummy nodes used to balance the tree. The HCB trie, by contrast, is built hierarchically: to prevent the map from growing to the right, it is given a fixed depth, and when that depth is reached a new tree is created and linked to the existing one. This made insertion and search faster, but it increased storage space because the HCB trie still uses dummy nodes, as the CB trie does, and adds many tree links. The RCB trie proposed in this paper eliminates dummy nodes entirely, reducing the tree map by about 35% and the overall size by half compared with the HCB trie.


Bit-Map Based Hybrid Fast IP Lookup Technique (비트-맵 기반의 혼합형 고속 IP 검색 기법)

  • Oh Seung-Hyun
    • Journal of Korea Multimedia Society
    • /
    • v.9 no.2
    • /
    • pp.244-254
    • /
    • 2006
  • This paper presents an efficient hybrid technique that compacts the trie indexing a large forwarding table until it is small enough to be stored in cache, speeding up IP lookup. It combines two techniques, an encoding scheme called bit-map and a controlled-prefix expansion scheme, to replace slow memory searches with a few fast-memory accesses and computations. For compaction, the bit-map represents each index and each child pointer with one bit. For example, when one node denotes n bits, the bit-map gives a high compression rate by consuming $2^{n-1}$ bits for the $2^n$ index and child link pointers branching out of the node. The controlled-prefix expansion scheme determines the number of address bits represented by the root nodes at each level of the trie. Here, the controlled-prefix scheme uses dynamic programming to obtain the smallest trie memory size for a given number of trie levels. This paper proposes a criterion for choosing a suitable trie structure according to the memory size of the system and the required IP lookup speed, presenting the optimal memory size and lookup speed as a function of the number of trie levels.

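The one-bit-per-pointer idea in the abstract above can be illustrated with a small sketch. This is not the paper's encoding: it shows the general bitmap technique, in which a node covering `stride` address bits records which of its $2^{stride}$ slots are occupied in a single integer and stores only the occupied children in a dense list. The class name `BitmapNode` and its methods are hypothetical.

```python
class BitmapNode:
    """Trie node for one stride of address bits; children are stored densely."""

    def __init__(self, stride):
        self.stride = stride   # number of address bits this node consumes
        self.bitmap = 0        # bit i set => a child exists for chunk value i
        self.children = []     # only the existing children, ordered by chunk value

    def _rank(self, chunk):
        # Count the set bits below position `chunk`; that count is the
        # position of the child inside the dense `children` list.
        return bin(self.bitmap & ((1 << chunk) - 1)).count("1")

    def get(self, chunk):
        if not (self.bitmap >> chunk) & 1:
            return None
        return self.children[self._rank(chunk)]

    def set(self, chunk, child):
        idx = self._rank(chunk)
        if (self.bitmap >> chunk) & 1:
            self.children[idx] = child        # overwrite an existing slot
        else:
            self.bitmap |= 1 << chunk         # mark the slot as occupied
            self.children.insert(idx, child)  # keep the list ordered


node = BitmapNode(stride=4)   # 16 possible slots, only 2 stored
node.set(9, "next-hop A")
node.set(2, "next-hop B")
```

The saving comes from replacing an array of full-width pointers with one bit per slot plus a list sized to the occupied slots; the lookup cost is one bit test and one population count.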

Video Index Generation and Search using Trie Structure (Trie 구조를 이용한 비디오 인덱스 생성 및 검색)

  • 현기호;김정엽;박상현
    • Journal of KIISE:Software and Applications
    • /
    • v.30 no.7_8
    • /
    • pp.610-617
    • /
    • 2003
  • Similarity matching in video databases is of growing importance in many new applications such as video clustering and digital video libraries. To provide efficient access to relevant data in large databases, there have been many research efforts in video indexing with diverse spatial and temporal features. However, most previous work relied on sequential matching methods or memory-based inverted file techniques, making it unsuitable for large video databases. To resolve this problem, this paper proposes an effective and scalable indexing technique that uses a trie, originally proposed for string matching, as the index structure. To build the index, we convert each frame into a symbol sequence using a window order heuristic and build a disk-resident trie from the set of symbol sequences. For query processing, we perform a depth-first search on the trie and execute a temporal segmentation. To verify the superiority of our approach, we perform several experiments with real and synthetic data sets. The results reveal that our approach consistently outperforms the sequential scan method, and that the performance gain is maintained even with a large volume of video data.

File Content Retrieval Program Using HashMap-based Trie (HashMap 기반의 트라이를 이용한 파일 내용 검색 프로그램)

  • Kim, Sung Wan;Lee, Woosoon
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2014.01a
    • /
    • pp.467-468
    • /
    • 2014
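  • In this paper, we design and implement a file content-based search program. It is designed around an inverted index structure and implemented without any separate information retrieval library. For the index file, we designed and implemented the trie data structure ourselves, building it from nested Java HashMap structures. To test the usefulness of the developed system, we used a set of randomly generated text files built from about 3,300 words listed in a GRE vocabulary book.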

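A trie built from nested hash maps, used as an inverted index, can be sketched as follows. The abstract describes a Java implementation with nested HashMap objects; this sketch substitutes Python dictionaries, and the function names and the `END` marker key are hypothetical rather than taken from the paper.

```python
END = "$docs"  # hypothetical marker key holding the set of files for a word


def build_index(files):
    """Build a nested-dict trie from a mapping of file name -> text."""
    root = {}
    for name, text in files.items():
        for word in text.lower().split():
            node = root
            for ch in word:
                # Each character maps to the dictionary of the next level.
                node = node.setdefault(ch, {})
            node.setdefault(END, set()).add(name)
    return root


def search(root, word):
    """Return the set of file names containing `word` as a whole word."""
    node = root
    for ch in word.lower():
        node = node.get(ch)
        if node is None:
            return set()
    return node.get(END, set())


index = build_index({
    "a.txt": "abate aberrant",
    "b.txt": "abate zeal",
})
```

Because every key in a word path is a single character, the multi-character `END` key cannot collide with one.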

Searching for Variants Using Trie-Index (트라이 인덱스를 이용한 이형태 검색)

  • Park, In-Cheol
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.10 no.8
    • /
    • pp.1986-1992
    • /
    • 2009
  • A user often searches for data by entering a variant such as an abbreviation or substring of a word, or a misspelled word. The simple approach to searching for variants is to build a variants dictionary. However, this entails enormous cost and time and cannot handle variants caused by misspelling. Approximate searching, that is, searching by approximate string matching, is a good approach, but it cannot handle variants formed by abbreviation. This paper proposes a method for searching for various variants, including abbreviations and misspelled words, by using trie indexing. First, it presents a variant matching method based on computing a path-weighted metric. In addition, it provides a variant searching algorithm that reduces the search time.
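As a rough illustration of approximate matching over a trie, the sketch below walks the trie while maintaining one row of the standard Levenshtein edit-distance table per node and prunes any branch that can no longer fall within the distance limit. It stands in for, and is not, the paper's path-weighted metric, and it covers misspellings only, not abbreviations. All names are hypothetical.

```python
def insert(root, word):
    """Add `word` to a nested-dict trie."""
    node = root
    for ch in word:
        node = node.setdefault(ch, {})
    node["$end"] = word  # hypothetical end-of-word marker


def fuzzy_search(root, query, max_dist):
    """Return {word: distance} for stored words within max_dist edits of query."""
    results = {}
    first_row = list(range(len(query) + 1))

    def walk(node, ch, prev_row):
        # Extend the edit-distance table by one row for trie character `ch`.
        row = [prev_row[0] + 1]
        for i in range(1, len(query) + 1):
            cost = 0 if query[i - 1] == ch else 1
            row.append(min(row[i - 1] + 1,            # extra character in query
                           prev_row[i] + 1,           # extra character in stored word
                           prev_row[i - 1] + cost))   # match or substitution
        if "$end" in node and row[-1] <= max_dist:
            results[node["$end"]] = row[-1]
        if min(row) <= max_dist:  # otherwise no descendant can match
            for nxt, child in node.items():
                if nxt != "$end":
                    walk(child, nxt, row)

    for ch, child in root.items():
        if ch != "$end":
            walk(child, ch, first_row)
    return results


trie = {}
for w in ["search", "seat", "trie", "tree"]:
    insert(trie, w)
```

Sharing prefixes means the table rows for `s`, `se`, and `sea` are computed once for both `search` and `seat`, which is where the trie saves work over comparing the query with each word separately.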

Enhancement of HCB Tree for Improving Retrieval Performance and Dynamic Environments (검색 성능 향상과 동적 환경을 위한 HCB 트리의 개선)

  • Kim, Sung Wan
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.19 no.2
    • /
    • pp.365-371
    • /
    • 2015
  • The CB tree represents a binary trie as a compact binary sequence. However, retrieval time grows quickly, because the more keys are stored in the trie, the longer the binary sequence becomes. It is also inefficient for frequent key insertion and deletion. The HCB tree is a hierarchical CB tree consisting of small binary tries. However, it cannot avoid shift operations and has to scan an additional table to reach a child or parent trie. To improve retrieval performance and avoid shift operations when keys are inserted or deleted, this paper represents each separate trie as a full binary trie and assigns it a unique identifier. Theoretical evaluation shows that both the proposed approach and the HCB tree perform better than the CB tree for key retrieval. The proposed approach shows the highest performance for key insertion and deletion and, moreover, requires only 71%~89% of the storage of the CB tree.

IMT: A Memory-Efficient and Fast Updatable IP Lookup Architecture Using an Indexed Multibit Trie

  • Kim, Junghwan;Ko, Myeong-Cheol;Shin, Moon Sun;Kim, Jinsoo
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.4
    • /
    • pp.1922-1940
    • /
    • 2019
  • IP address lookup determines the next hop for a given destination IP address. It plays an important role in modern routers because of its computation time and increasing Internet traffic. TCAM-based IP lookup approaches can exploit parallel searching but are limited in size because of latency, power consumption, updatability, and cost. Multibit trie-based approaches, on the other hand, use SRAM, which has relatively low power consumption and cost. They reduce the number of memory accesses required for each lookup but still need several accesses. Moreover, the memory efficiency and updatability are proportional to the number of memory accesses. In this paper, we propose a novel architecture using an Indexed Multibit Trie (IMT), which is based on a combination of TCAM and SRAM. In the proposed architecture, each lookup takes at most two memory accesses. We present how the IMT is constructed so as to be memory-efficient and quickly updatable. Experimental results with real-world forwarding tables show that our scheme achieves good memory efficiency as well as fast updatability.

Efficient Indexing for Large DNA Sequence Databases (대용량 DNA 시퀀스 데이타베이스를 위한 효율적인 인덱싱)

  • Won Jung-Im;Yoon Jee-Hee;Park Sang-Hyun;Kim Sang-Wook
    • Journal of KIISE:Databases
    • /
    • v.31 no.6
    • /
    • pp.650-663
    • /
    • 2004
  • In molecular biology, DNA sequence searching is one of the most crucial operations. Since DNA databases contain a huge volume of sequences, a fast indexing mechanism is essential for efficient processing of DNA sequence searches. In this paper, we first identify the problems of the suffix tree in terms of storage overhead, search performance, and integration with DBMSs. Then, we propose a new index structure that solves those problems. The proposed index consists of two parts: the primary part represents the trie as bit strings without any pointers, and the secondary part enables fast access to the leaf nodes of the trie that need to be accessed for post-processing. We also suggest an efficient algorithm based on that index for DNA sequence searching. To verify the superiority of the proposed approach, we conducted a performance evaluation via a series of experiments. The results revealed that the proposed approach, which requires less storage space, achieves a 13- to 29-fold performance improvement over the suffix tree.

A Bit-Map Trie for the High-Speed Longest Prefix Search of IP Addresses (고속의 최장 IP 주소 프리픽스 검색을 위한 비트-맵 트라이)

  • 오승현;안종석
    • Journal of KIISE:Information Networking
    • /
    • v.30 no.2
    • /
    • pp.282-292
    • /
    • 2003
  • This paper proposes an efficient data structure for forwarding IPv4 and IPv6 packets at gigabit speeds in backbone routers. The LPM (Longest Prefix Matching) search becomes a bottleneck for router performance because LPM complexity grows in proportion to the forwarding table size and the address length. To speed up the forwarding process, this paper introduces a data structure named BMT (Bit-Map Trie) that minimizes frequent main memory accesses. All the necessary search computations in BMT are done over a small index table stored in cache. To build the small index table from the trie representation of the forwarding table, BMT represents a link pointer to a child node and a node pointer to the corresponding forwarding table entry with one bit each. To improve the poor performance of conventional tries when their height increases with the address length, BMT adopts a binary search algorithm to determine the appropriate trie level at which to start. Simulation experiments show that BMT compacts the forwarding table of IPv4 backbone routers into less than 512 kbytes and achieves an average speed of 250 ns/packet on Pentium II processors, which is almost the same performance as the fastest conventional lookup algorithms.
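For context, longest prefix matching on a conventional binary trie, the baseline whose height grows with the address length, can be sketched as below. This is not BMT: it has no bitmap compression and no binary search over levels. The `Node`, `add_route`, and `lookup` names are hypothetical, and the sketch assumes a single address family per trie.

```python
import ipaddress


class Node:
    __slots__ = ("children", "next_hop")

    def __init__(self):
        self.children = [None, None]  # child for bit 0 and for bit 1
        self.next_hop = None          # set only where a prefix ends


def add_route(root, prefix, next_hop):
    """Insert a CIDR prefix such as '10.0.0.0/8' into the binary trie."""
    net = ipaddress.ip_network(prefix)
    bits = int(net.network_address)
    width = net.max_prefixlen
    node = root
    for i in range(net.prefixlen):
        bit = (bits >> (width - 1 - i)) & 1   # walk from the most significant bit
        if node.children[bit] is None:
            node.children[bit] = Node()
        node = node.children[bit]
    node.next_hop = next_hop


def lookup(root, address):
    """Return the next hop of the longest prefix matching `address`."""
    addr = ipaddress.ip_address(address)
    bits = int(addr)
    width = addr.max_prefixlen
    node, best = root, root.next_hop
    for i in range(width):
        node = node.children[(bits >> (width - 1 - i)) & 1]
        if node is None:
            break
        if node.next_hop is not None:
            best = node.next_hop              # remember the deepest match so far
    return best


table = Node()
add_route(table, "0.0.0.0/0", "default")
add_route(table, "10.0.0.0/8", "A")
add_route(table, "10.1.0.0/16", "B")
```

Each lookup follows one pointer per address bit in the worst case, which is the per-packet memory-access cost that cache-resident index tables aim to reduce.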

Region Query Reconstruction Method Using Trie-Structured Quad Tree in USN Middleware (USN 미들웨어에서 트라이 구조 쿼드 트리를 이용한 영역 질의 재구성 기법)

  • Cho, Sook-Kyoung;Jeong, Mi-Young;Jung, Hyun-Meen;Kim, Jong-Hoon
    • Journal of Korea Spatial Information System Society
    • /
    • v.10 no.1
    • /
    • pp.15-28
    • /
    • 2008
  • In a ubiquitous sensor network (USN) environment, processing region queries is essential for user-demand services. The R-tree is a preferred technique for processing region queries in an in-network query environment. In a USN environment, the USN middleware must accurately select the sensors to which a region query is forwarded, because the lifetime of the sensors determines the lifetime of the whole sensor network. When an R-tree is used, however, the region query is passed on blindly, even to areas that contain no sensors, whenever an MBR (Minimum Boundary Rectangle) of the R-tree intersects the query region. To solve this problem, we propose a region query reconstruction method that uses a trie-structured quad tree at the base station to accurately select the sensors inside the query region. We observed that the proposed method has a longer response time than the R-tree but is useful for reducing communication cost and energy consumption.
