• Title/Summary/Keyword: 트리(tree)구조

Search Result 784, Processing Time 0.024 seconds

Quantitative approach to analyze searching efficiencies varying degrees of imbalance in a binary search tree (수량적 접근 방법에 의한 이진 검색 트리 불균형도에 따른 검색 성능 비교 분석)

  • 김숙영
    • Journal of the Korea Computer Industry Society
    • /
    • v.3 no.2
    • /
    • pp.235-242
    • /
    • 2002
  • To minimize restructuring cost of a tree, experiments were conducted to collect quantitative information of searching efficiencies varying degrees of imbalance in a binary search tree. Degrees of tree imbalance were measured by a balance factor, an absolute value of height difference of left subtree and right subtree in a binary search tree. The average number of comparisons increased (p<0.01), and searching efficiency of O(n) was more appropriate rather than O(logn), as degrees of imbalance in a binary search tree deteriorated. However, there were no significant differences of searching efficiencies in height balanced trees and trees with subtrees to have height 3 less than the other (p>0.05). Therefore, the findings would be applicable to maintain searching efficiency of a software with a binary search tree.

  • PDF

Making Cache-Conscious CCMR-trees for Main Memory Indexing (주기억 데이타베이스 인덱싱을 위한 CCMR-트리)

  • 윤석우;김경창
    • Journal of KIISE:Databases
    • /
    • v.30 no.6
    • /
    • pp.651-665
    • /
    • 2003
  • To reduce cache misses emerges as the most important issue in today's situation of main memory databases, in which CPU speeds have been increasing at 60% per year, and memory speeds at 10% per year. Recent researches have demonstrated that cache-conscious index structure such as the CR-tree outperforms the R-tree variants. Its search performance can be poor than the original R-tree, however, since it uses a lossy compression scheme. In this paper, we propose alternatively a cache-conscious version of the R-tree, which we call MR-tree. The MR-tree propagates node splits upward only if one of the internal nodes on the insertion path has empty room. Thus, the internal nodes of the MR-tree are almost 100% full. In case there is no empty room on the insertion path, a newly-created leaf simply becomes a child of the split leaf. The height of the MR-tree increases according to the sequence of inserting objects. Thus, the HeightBalance algorithm is executed when unbalanced heights of child nodes are detected. Additionally, we also propose the CCMR-tree in order to build a more cache-conscious MR-tree. Our experimental and analytical study shows that the two-dimensional MR-tree performs search up to 2.4times faster than the ordinary R-tree while maintaining slightly better update performance and using similar memory space.

A Distributed High Dimensional Indexing Structure for Content-based Retrieval of Large Scale Data (대용량 데이터의 내용 기반 검색을 위한 분산 고차원 색인 구조)

  • Cho, Hyun-Hwa;Lee, Mi-Young;Kim, Young-Chang;Chang, Jae-Woo;Lee, Kyu-Chul
    • Journal of KIISE:Databases
    • /
    • v.37 no.5
    • /
    • pp.228-237
    • /
    • 2010
  • Although conventional index structures provide various nearest-neighbor search algorithms for high-dimensional data, there are additional requirements to increase search performances as well as to support index scalability for large scale data. To support these requirements, we propose a distributed high-dimensional indexing structure based on cluster systems, called a Distributed Vector Approximation-tree (DVA-tree), which is a two-level structure consisting of a hybrid spill-tree and VA-files. We also describe the algorithms used for constructing the DVA-tree over multiple machines and performing distributed k-nearest neighbors (NN) searches. To evaluate the performance of the DVA-tree, we conduct an experimental study using both real and synthetic datasets. The results show that our proposed method contributes to significant performance advantages over existing index structures on difference kinds of datasets.

Embedding Complete Binary Trees into Crossed Cubes (완전이진트리의 교차큐브에 대한 임베딩)

  • Kim, Sook-Yeon
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.36 no.3
    • /
    • pp.149-157
    • /
    • 2009
  • The crossed cube, a variation of the hypercube, possesses a better topological property than the hypercube in its diameter that is about half of that of the hypercube. It has been known that an N-node complete binary tree is a subgraph of an (N+1)-node crossed cube [P. Kulasinghe and S. Bettayeb, 1995]. However, efficient embedding methods have not been known for the case that the number of nodes of the complete binary tree is greater than that of the crossed cube. In this paper, we show that an N-node complete binary tree can be embedded into an M-node crossed cube with dilation 1 and load factor [N/M], N>M$\geq$2. The dilation and load factor is optimal. Our embedding has a property that the tree nodes on the same level are evenly distributed over the crossed cube nodes. The property is especially useful when tree-structured algorithms are processed on a crossed cube in a level-by-level way.

Iceberg Query Evaluation Technical Using a Cuboid Prefix Tree (큐보이드 전위트리를 이용한 빙산질의 처리)

  • Han, Sang-Gil;Yang, Woo-Sock;Lee, Won-Suk
    • Journal of KIISE:Databases
    • /
    • v.36 no.3
    • /
    • pp.226-234
    • /
    • 2009
  • A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. Due to the characteristics of a data stream, it is impossible to save all the data elements of a data stream. Therefore it is necessary to define a new synopsis structure to store the summary information of a data stream. For this purpose, this paper proposes a cuboid prefix tree that can be effectively employed in evaluating an iceberg query over data streams. A cuboid prefix tree only stores those itemsets that consist of grouping attributes used in GROUP BY query. In addition, a cuboid prefix tree can compute multiple iceberg queries simultaneously by sharing their common sub-expressions. A cuboid prefix tree evaluates an iceberg query over an infinitely generated data stream while efficiently reducing memory usage and processing time, which is verified by a series of experiments.

Dynamic Extension of Genetic Tree Maps (유전 목 지도의 동적 확장)

  • Ha, seong-Wook;Kwon, Kee-Hang;Kang, Dae-Seong
    • Journal of KIISE:Software and Applications
    • /
    • v.29 no.6
    • /
    • pp.386-395
    • /
    • 2002
  • In this paper, we suggest dynamic genetic tree-maps(DGTM) using optimal features on recognizing data. The DGTM uses the genetic algorithm about the importance of features rarely considerable on conventional neural networks and introduces GTM(genetic tree-maps) using tree structure according of the priority of features. Hence, we propose the extended formula, DGTM(dynamic GTM) has dynamic functions to separate and merge the neuron of neural network along the similarity of features.

Incremental Generation of A Decision Tree Using Global Discretization For Large Data (대용량 데이터를 위한 전역적 범주화를 이용한 결정 트리의 순차적 생성)

  • Han, Kyong-Sik;Lee, Soo-Won
    • The KIPS Transactions:PartB
    • /
    • v.12B no.4 s.100
    • /
    • pp.487-498
    • /
    • 2005
  • Recently, It has focused on decision tree algorithm that can handle large dataset. However, because most of these algorithms for large datasets process data in a batch mode, if new data is added, they have to rebuild the tree from scratch. h more efficient approach to reducing the cost problem of rebuilding is an approach that builds a tree incrementally. Representative algorithms for incremental tree construction methods are BOAT and ITI and most of these algorithms use a local discretization method to handle the numeric data type. However, because a discretization requires sorted numeric data in situation of processing large data sets, a global discretization method that sorts all data only once is more suitable than a local discretization method that sorts in every node. This paper proposes an incremental tree construction method that efficiently rebuilds a tree using a global discretization method to handle the numeric data type. When new data is added, new categories influenced by the data should be recreated, and then the tree structure should be changed in accordance with category changes. This paper proposes a method that extracts sample points and performs discretiration from these sample points to recreate categories efficiently and uses confidence intervals and a tree restructuring method to adjust tree structure to category changes. In this study, an experiment using people database was made to compare the proposed method with the existing one that uses a local discretization.

A Hash based R-Tree for Fast Search of Mass Spatial Data (대용량 공간 데이터의 빠른 검색을 위한 해시 기반 R-Tree)

  • Kang, Hong-Koo;Kim, Joung-Joon;Shin, In-Su;Han, Ki-Joon
    • Proceedings of the Korean Association of Geographic Inforamtion Studies Conference
    • /
    • 2008.10a
    • /
    • pp.82-89
    • /
    • 2008
  • 최근, GIS 분야에서 RFID와 GPS 센서 같은 위치 및 공간 데이타를 포함하는 다양한 GeoSensor의 활용으로 수집되는 공간 데이타가 크게 증가하면서, 대용량 공간 데이타의 빠른 처리를 위한 공간 인덱스의 중요성이 높아지고 있다. 특히, 대표적인 공간 인덱스인 R-Tree를 기반으로 검색 성능을 높이기 위한 연구가 활발히 진행되고 있다. 그러나, 기존 연구는 R-Tree에서 노드의 MBR 간의 겹침이나 트리 높이를 어느 정도 줄임으로써 다소 검색 성능을 향상시켰지만, 트리 검색에서 발생하는 불필요한 노드 접근 비용 문제를 효율적으로 해결하지 못하고 있다. 본 논문에서는 이러한 문제를 해결하고 R-Tree에서 대용량 공간 데이타의 빠른 검색을 제공하는 인덱스인 HR-Tree(Hash based R-Tree)를 제시한다. HR-Tree는 트리 검색 없이 R-Tree 리프 노드를 직접 접근할 수 있는 해시 테이블을 이용함으로써 R-Tree의 검색 성능을 높인다. 해시 테이블은 데이타 영역을 차원에 따라 반복적으로 분할한 Partition과 대응되는 R-Tree 리프 노드의 MBR과 포인터들로 구성된다. 각 Partition은 생성 과정에서 고유의 식별 코드를 갖기 때문에 Partition 코드가 주어지면 해시 테이블에서 해당 레코드를 쉽게 접근할 수 있다. 또한, HR-Tree는 R-Tree구조의 변경없이 다양한 R-Tree 변형 구조에 쉽게 적용할 수 있는 장점이 있다. 마지막으로 실험을 통하여 HR-Tree의 우수성을 입증하였다.

  • PDF

A Two-Dimensional Binary Prefix Tree for Packet Classification (패킷 분류를 위한 이차원 이진 프리픽스 트리)

  • Jung, Yeo-Jin;Kim, Hye-Ran;Lim, Hye-Sook
    • Journal of KIISE:Information Networking
    • /
    • v.32 no.4
    • /
    • pp.543-550
    • /
    • 2005
  • Demand for better services in the Internet has been increasing due to the rapid growth of the Internet, and hence next generation routers are required to perform intelligent packet classification. For a given classifier defining packet attributes or contents, packet classification is the process of identifying the highest priority rule to which a packet conforms. A notable characteristic of real classifiers is that a packet matches only a small number of distinct source-destination prefix pairs. Therefore, a lot of schemes have been proposed to filter rules based on source and destination prefix pairs. However, most of the schemes are based on sequential one-dimensional searches using trio which requires huge memory. In this paper, we proposea memory-efficient two-dimensional search scheme using source and destination prefix pairs. By constructing binary prefix tree, source prefix search and destination prefix search are simultaneously performed in a binary tree. Moreover, the proposed two-dimensional binary prefix tree does not include any empty internal nodes, and hence memory waste of previous trio-based structures is completely eliminated.

Trajectory Index Structure based on Signatures for Moving Objects on a Spatial Network (공간 네트워크 상의 이동객체를 위한 시그니처 기반의 궤적 색인구조)

  • Kim, Young-Jin;Kim, Young-Chang;Chang, Jae-Woo;Sim, Chun-Bo
    • Journal of Korea Spatial Information System Society
    • /
    • v.10 no.3
    • /
    • pp.1-18
    • /
    • 2008
  • Because we can usually get many information through analyzing trajectories of moving objects on spatial networks, efficient trajectory index structures are required to achieve good retrieval performance on their trajectories. However, there has been little research on trajectory index structures for spatial networks such as FNR-tree and MON-tree. Also, because FNR-tree and MON-tree store the segment unit of moving objects, they can't support the trajectory of whole moving objects. In this paper, we propose an efficient trajectory index structures based on signatures on a spatial network, named SigMO-Tree. For this, we divide moving object data into spatial and temporal attributes, and design an index structure which supports not only range query but trajectory query by preserving the whole trajectory of moving objects. In addition, we divide user queries into trajectory query based on spatio-temporal area and similar-tralectory query, and propose query processing algorithms to support them. The algorithm uses a signature file in order to retrieve candidate trajectories efficiently Finally, we show from our performance analysis that our trajectory index structure outperforms the existing index structures like FNR-Tree and MON-Tree.

  • PDF