• Title/Summary/Keyword: Data Tree

Efficient Spatial Index for Mobile Software (모바일 소프트웨어를 위한 효율적인 공간 인덱스)

  • Oh, Byoung-Woo
    • Spatial Information Research / v.16 no.1 / pp.113-127 / 2008
  • This paper proposes an efficient spatial index for mobile software, named the $AR^*$-tree (Area $R^*$-tree), which is a variant of the $R^*$-tree. The MBR (Minimum Bounding Rectangle) structure of the $AR^*$-tree stores min and max values along an area axis in addition to the x and y axes. The value on the area axis is used to determine the significance of a spatial object: if the area of an object is large, it is significant when drawing a map. To reduce the complexity of a map on the small screen of a mobile device, the $AR^*$-tree can retrieve only the significant spatial data. A series of tests indicates that the $AR^*$-tree provides a way to control the readability of a map while maintaining efficient performance.

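
The idea of an MBR with an extra area axis can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's implementation: the names `AreaMBR`, `leaf_mbr`, and `significant_in_window` are hypothetical, the area of an object is approximated by the area of its bounding box, and a linear scan over entries stands in for a real $R^*$-tree traversal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AreaMBR:
    """Bounding box over x, y, and a third 'area' axis (hypothetical layout)."""
    min_x: float
    max_x: float
    min_y: float
    max_y: float
    min_area: float
    max_area: float

    def intersects(self, other: "AreaMBR") -> bool:
        # Two boxes overlap only if their ranges overlap on all three axes.
        return (self.min_x <= other.max_x and other.min_x <= self.max_x
                and self.min_y <= other.max_y and other.min_y <= self.max_y
                and self.min_area <= other.max_area
                and other.min_area <= self.max_area)


def leaf_mbr(min_x, min_y, max_x, max_y):
    # Assumption: the bounding-box area stands in for the object's true area.
    area = (max_x - min_x) * (max_y - min_y)
    return AreaMBR(min_x, max_x, min_y, max_y, area, area)


def significant_in_window(entries, window, min_area):
    """Return names of entries inside `window` whose area is >= min_area.

    `window` is (min_x, min_y, max_x, max_y). The area threshold becomes a
    range [min_area, +inf) on the third axis of the query box.
    """
    query = AreaMBR(min_x=window[0], max_x=window[2],
                    min_y=window[1], max_y=window[3],
                    min_area=min_area, max_area=float("inf"))
    return [name for name, mbr in entries if mbr.intersects(query)]


entries = [
    ("lake", leaf_mbr(0, 0, 10, 10)),   # area 100
    ("hut", leaf_mbr(2, 2, 3, 3)),      # area 1
    ("park", leaf_mbr(20, 20, 30, 30)), # area 100, outside the window below
]
print(significant_in_window(entries, (0, 0, 15, 15), 50.0))  # ['lake']
```

Treating the area threshold as one more range condition means the same overlap test that prunes by position can also prune small objects, which is why a single index can serve both purposes.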

Performance Comparisons on MongoDB with B-Tree Indexes and Fractal Tree Indexes (MongoDB에서 B-트리 인덱스와 Fractal 트리 인덱스를 이용한 성능 비교)

  • Jang, Seongho;Kim, Suhee
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.05a / pp.622-625 / 2014
  • As big data began to generate value in many forms, databases capable of handling huge amounts of varied data became necessary. NoSQL databases were introduced to overcome the complexity and capacity limitations of existing RDBMSs. Among the different types of NoSQL databases, MongoDB is the most commonly used and is available as open source. The B-Tree index used in MongoDB suffers a significant decrease in performance as the amount of data increases. The fractal tree index substantially improves on B-Tree performance by improving the B-Tree's insertion algorithm. This paper compares the performance of MongoDB when using a B-Tree index and when using a fractal tree index.


Incremental Generation of A Decision Tree Using Global Discretization For Large Data (대용량 데이터를 위한 전역적 범주화를 이용한 결정 트리의 순차적 생성)

  • Han, Kyong-Sik;Lee, Soo-Won
    • The KIPS Transactions:PartB / v.12B no.4 s.100 / pp.487-498 / 2005
  • Recently, attention has focused on decision tree algorithms that can handle large datasets. However, because most of these algorithms process data in batch mode, they have to rebuild the tree from scratch whenever new data is added. A more efficient approach that reduces the cost of rebuilding is to build the tree incrementally. Representative algorithms for incremental tree construction are BOAT and ITI, and most of these algorithms use a local discretization method to handle numeric data. However, because discretization requires sorted numeric data, a global discretization method that sorts all data only once is more suitable for large data sets than a local discretization method that sorts at every node. This paper proposes an incremental tree construction method that efficiently rebuilds a tree using a global discretization method to handle numeric data. When new data is added, the categories affected by the data must be recreated, and the tree structure must then be changed in accordance with the category changes. The proposed method extracts sample points and performs discretization on these sample points to recreate categories efficiently, and uses confidence intervals and a tree restructuring method to adjust the tree structure to category changes. An experiment using a people database compares the proposed method with an existing one that uses local discretization.
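
The contrast between global and local discretization can be sketched in a few lines. This is only an illustration under stated assumptions: equal-frequency cut points are used as a stand-in discretizer, the function names `global_cut_points` and `to_category` are hypothetical, and the paper's sample-point extraction, confidence intervals, and tree restructuring are not reproduced.

```python
from bisect import bisect_right


def global_cut_points(values, n_bins):
    """Sort the attribute once and pick equal-frequency cut points.

    Assumption: equal-frequency binning stands in for whatever criterion
    a real global discretizer would use.
    """
    ordered = sorted(values)  # the single global sort
    if not ordered:
        return []
    n = len(ordered)
    cuts = []
    for k in range(1, n_bins):
        cut = ordered[k * n // n_bins]
        # Skip duplicate cut points so categories stay distinct.
        if not cuts or cut > cuts[-1]:
            cuts.append(cut)
    return cuts


def to_category(value, cuts):
    """Map a numeric value to a category index using the shared cut points."""
    return bisect_right(cuts, value)


ages = [5, 1, 9, 3, 7, 2, 8, 4]
cuts = global_cut_points(ages, 4)
print(cuts)                                # [3, 5, 8]
print([to_category(a, cuts) for a in ages])  # [2, 0, 3, 1, 2, 0, 3, 1]
```

Because every tree node reuses the same `cuts`, no node has to sort its own subset of the data, which is the cost a local method pays repeatedly.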

H*-tree/H*-cubing: Improved Data Cube Structure and Cubing Method for OLAP on Data Stream (H*-tree/H*-cubing: 데이터 스트림의 OLAP를 위한 향상된 데이터 큐브 구조 및 큐빙 기법)

  • Chen, Xiangrui;Li, Yan;Lee, Dong-Wook;Kim, Gyoung-Bae;Bae, Hae-Young
    • The KIPS Transactions:PartD / v.16D no.4 / pp.475-486 / 2009
  • Data cubes play an important role in multi-dimensional, multi-level data analysis. To meet the on-line analysis requirements of data streams, several cube structures have been proposed for OLAP on data streams, such as stream cube, flowcube, and S-cube. Since it is costly to construct a data cube and execute ad-hoc OLAP queries, more research is needed on efficient data structures, query methods, and algorithms. Stream cube uses H-cubing to compute selected cuboids and stores the computed cells in an H-tree, which forms the cuboids along the popular path. However, the H-tree layout is disorderly and the H-cubing method relies too heavily on the popular path. In this paper, we first propose the $H^*$-tree, an improved data structure that makes retrieval in the tree structure more efficient. Second, we propose an improved cubing method, $H^*$-cubing, for computing the cuboids that cannot be retrieved along the popular path when an ad-hoc OLAP query is executed. $H^*$-tree construction and $H^*$-cubing algorithms are given. A performance study shows that during the construction step the $H^*$-tree outperforms the H-tree with a more desirable trade-off between time and memory usage, and that $H^*$-cubing is better adapted to ad-hoc OLAP queries with respect to factors such as time and memory space.

An Efficient Parallel Construction Scheme of An R-Tree using Hadoop (Hadoop을 이용한 R-트리의 효율적인 병렬 구축 기법)

  • Cong, Viet-Ngu Huynh;Kim, Jongmin;Kwon, Oh-Heum;Song, Ha-Joo
    • Journal of Korea Multimedia Society / v.22 no.2 / pp.231-241 / 2019
  • Bulk-loading is a good way to build an efficient R-tree. However, bulk-loading an R-tree for a huge amount of data takes a long time. In this paper, we propose a parallel R-tree construction scheme based on the Hadoop framework. The proposed scheme divides the data set into a number of partitions, for which local R-trees are built in parallel via Map-Reduce operations. The local R-trees are then merged into a global R-tree that covers the whole data set. While generating the partitions, the scheme takes the spatial distribution of the data into account so that each partition holds a nearly equal amount of data. As a result, the proposed scheme yields an efficient index structure while reducing the construction time. Experimental tests show that the proposed scheme builds an R-tree more efficiently than the existing approaches.
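
The partition, build, and merge steps can be outlined sequentially. The sketch below is an assumption-laden stand-in rather than the paper's scheme: it does not use Hadoop, it balances partitions simply by sorting on x and slicing into equal-sized chunks, and it represents each local R-tree only by its bounding rectangle. The names `balanced_partitions`, `mbr_of`, and `build_global_root` are hypothetical.

```python
def mbr_of(points):
    """Minimum bounding rectangle (min_x, min_y, max_x, max_y) of 2-D points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def balanced_partitions(points, n_parts):
    """Split points into up to n_parts chunks of near-equal size.

    Assumption: sorting by x and slicing is a simple way to respect the
    data distribution; dense regions get narrower slices.
    """
    ordered = sorted(points)
    n = len(ordered)
    bounds = [i * n // n_parts for i in range(n_parts + 1)]
    return [ordered[bounds[i]:bounds[i + 1]]
            for i in range(n_parts) if bounds[i] < bounds[i + 1]]


def build_global_root(points, n_parts):
    """'Map' step: one local MBR per partition. 'Reduce' step: merge them."""
    parts = balanced_partitions(points, n_parts)
    # Each local MBR stands in for the root of a local R-tree.
    local_mbrs = [mbr_of(part) for part in parts]
    root = (min(m[0] for m in local_mbrs), min(m[1] for m in local_mbrs),
            max(m[2] for m in local_mbrs), max(m[3] for m in local_mbrs))
    return root, local_mbrs


points = [(10, 10), (0, 0), (12, 8), (1, 5), (11, 3), (2, 1)]
root, local_mbrs = build_global_root(points, 2)
print(local_mbrs)  # [(0, 0, 2, 5), (10, 3, 12, 10)]
print(root)        # (0, 0, 12, 10)
```

In a real deployment each partition would be processed by a separate worker; the per-partition work here is independent, which is what makes the construction parallelizable.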

HD-Tree: High performance Lock-Free Nearest Neighbor Search KD-Tree (HD-Tree: 고성능 Lock-Free NNS KD-Tree)

  • Lee, Sang-gi;Jung, NaiHoon
    • Journal of Korea Game Society / v.20 no.5 / pp.53-64 / 2020
  • Supporting nearest neighbor search (NNS) in a KD-Tree is essential in multidimensional data applications. In this paper, we propose HD-Tree, a high-performance lock-free KD-Tree that supports NNS in situations where reads and writes occur concurrently. HD-Tree reduces the number of synchronization nodes used in NNS and requires fewer atomic operations during lock-free method execution. Compared with existing algorithms on a multi-core system with 8 cores and 16 threads, HD-Tree improves performance by up to 95% on NNS and 15% on modification in oversubscription situations.
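
For readers unfamiliar with NNS on a KD-Tree, a minimal single-threaded sketch follows. It shows only the textbook search with branch pruning; it is not lock-free and does not reflect HD-Tree's synchronization design. The names `Node`, `build`, `sq_dist`, and `nearest` are chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    point: Point
    axis: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Build a KD-Tree by splitting on the median, cycling through axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(ordered[mid], axis,
                build(ordered[:mid], depth + 1),
                build(ordered[mid + 1:], depth + 1))


def sq_dist(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node: Optional[Node], target: Point,
            best: Optional[Point] = None) -> Optional[Point]:
    """Return the stored point closest to target (sequential, no locking)."""
    if node is None:
        return best
    if best is None or sq_dist(node.point, target) < sq_dist(best, target):
        best = node.point
    diff = target[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, target, best)
    # Visit the far side only if the splitting plane is closer than the
    # best match so far; otherwise nothing there can be nearer.
    if diff ** 2 < sq_dist(best, target):
        best = nearest(far, target, best)
    return best


tree = build([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
print(nearest(tree, (9, 2)))  # (8, 1)
```

The pruning test is where a concurrent variant has to be careful: a writer changing a subtree while a reader is deciding whether to skip it is exactly the kind of race a lock-free design must handle.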

Design and Implementation of System for Estimating Diameter at Breast Height and Tree Height using LiDAR point cloud data

  • Jong-Su, Yim;Dong-Hyeon, Kim;Chi-Ung, Ko;Dong-Geun, Kim;Hyung-Ju, Cho
    • Journal of the Korea Society of Computer and Information / v.28 no.2 / pp.99-110 / 2023
  • In this paper, we propose a system termed ForestLi that can accurately estimate the diameter at breast height (DBH) and tree height using LiDAR point cloud data. The ForestLi system processes LiDAR point cloud data through the following steps: downsampling, outlier removal, ground segmentation, ground height normalization, stem extraction, individual tree segmentation, and DBH and tree height measurement. A commercial system for processing LiDAR point cloud data, such as LiDAR360, requires the user to manually correct errors in lower vegetation and individual tree segmentation. In contrast, the ForestLi system automatically removes LiDAR point cloud data that correspond to lower vegetation in order to improve the accuracy of estimating DBH and tree height. This enables the ForestLi system to reduce the total processing time and to improve the accuracy of measuring DBH and tree height compared to the LiDAR360 system. We performed an empirical study to confirm that the ForestLi system outperforms the LiDAR360 system in terms of the total processing time and the accuracy of measuring DBH and tree height.

Performance Analysis of Tree-based Indexing Scheme for Trajectories Processing of Moving Objects (이동객체의 궤적처리를 위한 트리기반 색인기법의 성능분석)

  • Shim, Choon-Bo;Shin, Yong-Won
    • Journal of the Korean Association of Geographic Information Studies / v.7 no.4 / pp.1-14 / 2004
  • In this study, we propose the LTB-Tree (Linktable-based extended TB-Tree), which improves the performance of the existing TB (Trajectory-Bundle)-tree proposed for indexing the trajectories of moving objects in GIS applications. To evaluate the proposed indexing scheme, we proceed as follows. First, we select the existing R*-tree, the TB-tree, and the LTB-tree as the subjects of performance evaluation. Second, we use both a random data set and a real data set as experimental data. Third, we evaluate performance while varying the size of the memory buffer, reflecting the limited memory available on a given system. Fourth, we test the schemes on experimental data sets with varying data distributions. Finally, we use the insertion and retrieval performance of trajectory queries and range queries as experimental measures. The experimental results show that the proposed indexing scheme, the LTB-tree, outperforms the other traditional schemes with respect to insertion and the retrieval of trajectory queries.


Fault Diagnosis of Induction Motors using Decision Trees (결정목을 이용한 유도전동기 결함진단)

  • Tran Van Tung;Yang Bo-Suk;Oh Myung-Suck
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2006.11a / pp.407-410 / 2006
  • The decision tree is one of the most effective and widely used methods for building classification models. Researchers from various disciplines such as statistics, machine learning, pattern recognition, and data mining have considered the decision tree method an effective solution to problems in their fields. In this paper, an application of the decision tree method to classify the faults of induction motors is proposed. The original experimental data are processed by feature calculation to extract useful information as attributes. These data are then assigned classes, based on our experience, before being used as inputs to the decision tree. A total of 9 classes are defined. An implementation of the decision tree written in Matlab is applied to four data sets with good performance results.


Dynamic Decision Tree for Data Mining (데이터마이닝을 위한 동적 결정나무)

  • Choi, Byong-Su;Cha, Woon-Ock
    • Communications for Statistical Applications and Methods / v.16 no.6 / pp.959-969 / 2009
  • The decision tree is a typical tool for data classification. This tool is implemented in DAVIS (Huh and Song, 2002). All the visualization tools and statistical clustering tools implemented in DAVIS can communicate with the decision tree. This paper presents methods for applying data visualization techniques to the decision tree, using a real data set.