• Title/Summary/Keyword: Data Tree

A Comparative Study of Medical Data Classification Methods Based on Decision Tree and System Reconstruction Analysis

  • Tang, Tzung-I;Zheng, Gang;Huang, Yalou;Shu, Guangfu;Wang, Pengtao
    • Industrial Engineering and Management Systems / v.4 no.1 / pp.102-108 / 2005
  • This paper studies medical data classification methods, comparing decision trees and system reconstruction analysis as applied to heart disease data mining. The data were collected from patients with coronary heart disease and comprise 1,723 records of 71 attributes each. We use the system reconstruction method to weight the data. We then apply several decision tree algorithms: induction of decision trees (ID3), C4.5, classification and regression tree (CART), chi-square automatic interaction detector (CHAID), and exhaustive CHAID. We use the results to compare the correct classification rate, number of leaves, and tree depth of the different algorithms. The experiments show that weighting the data improves the correct classification rate on the coronary heart disease data but has little effect on tree depth or the number of leaves.
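
ID3, one of the algorithms named in this abstract, chooses each split by information gain. The following is a minimal sketch of that criterion only, assuming categorical attributes. The attribute names and the four toy records are invented for illustration and are not taken from the paper's dataset, and the paper's weighting step is not modeled.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the rows on one categorical attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy records (not from the paper).
rows = [
    {"smoker": "yes", "sex": "m"},
    {"smoker": "yes", "sex": "f"},
    {"smoker": "no", "sex": "m"},
    {"smoker": "no", "sex": "f"},
]
labels = ["sick", "sick", "healthy", "healthy"]

# ID3 would split on the attribute with the largest gain.
best_attribute = max(["smoker", "sex"], key=lambda a: information_gain(rows, labels, a))
```

In this toy data the `smoker` attribute separates the classes perfectly, so it is the one selected.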

Optimization for Large-Scale n-ary Family Tree Visualization

  • Kyoungju, Min;Jeongyun, Cho;Manho, Jung;Hyangbae, Lee
    • Journal of information and communication convergence engineering / v.21 no.1 / pp.54-61 / 2023
  • The family tree is one of the key elements of humanities classics research and is very important for accurately understanding people and families. In this paper, we introduce a method for automatically generating a family tree from information on interpersonal relationships (IIPR) in the Korean Classics Database (KCDB), and we visualize interpersonal searches within a family tree using Data-Driven Documents (D3.js). To date, researchers of humanities classics have spent considerable time manually drawing family trees to understand relationships of influence between people. An automatic family tree builder analyzes the database and visually expresses the desired family tree. Because a family tree contains a large amount of data, we analyze performance and bottlenecks as a function of the amount of data to visualize and propose an optimal way to construct a family tree. To this end, we create an n-ary tree with synthetic data, visualize it, and analyze its performance using simulation results.
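
The synthetic n-ary tree mentioned in this abstract can be sketched as follows. This is an illustrative generator, not the authors' code: the function names and the `person-…` naming scheme are assumptions. The nested `name`/`children` layout is chosen because it is the shape D3's `d3.hierarchy` reads by default.

```python
import json


def build_nary_tree(branching, depth, name="person-0"):
    """Build a complete n-ary tree as nested dicts with 'name' and 'children' keys."""
    node = {"name": name, "children": []}
    if depth > 0:
        for i in range(branching):
            child = build_nary_tree(branching, depth - 1, f"{name}.{i}")
            node["children"].append(child)
    return node


def count_nodes(node):
    """Count every node in the tree, including the root."""
    return 1 + sum(count_nodes(child) for child in node["children"])


# A 3-ary tree with two levels below the root: 1 + 3 + 9 nodes.
tree = build_nary_tree(branching=3, depth=2)
payload = json.dumps(tree)  # JSON text that a D3.js page could load
```

Raising `branching` or `depth` grows the node count geometrically, which is one way to probe how rendering cost scales with data volume.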

A Tombstone Filtered LSM-Tree for Stable Performance of KVS (키밸류 저장소 성능 제어를 위한 삭제 키 분리 LSM-Tree)

  • Lee, Eunji
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.4 / pp.17-22 / 2022
  • With the spread of web services, data types are becoming more diverse. Beyond the stored form of the data, such as images, videos, and text, the number and form of the properties and metadata describing each item also differ. Key-value stores are widely used in state-of-the-art applications to process such unstructured data efficiently. The LSM-Tree (Log-Structured Merge Tree) is the core data structure of various commercial key-value stores. It is optimized for high performance on small writes by recording all write and delete operations in a log-structured manner. However, the latency and throughput of user requests degrade when batches of deletion operations for expired data are inserted into the LSM-Tree as special key-value entries. This paper presents a Filtered LSM-Tree (FLSM-Tree) that solves this problem by separating deleted keys from the main tree structure while retaining all the advantages of the existing LSM-Tree. The proposed method is implemented in LevelDB, a widely used key-value store, and the performance evaluation shows that read performance improves by up to 47%.
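
The "special key-value entries" that record deletions are commonly called tombstones. The sketch below shows how they accumulate in a plain LSM-style store until compaction removes them. It is a toy in-memory model under stated assumptions (dict-based runs, a hypothetical `TinyLSM` class); it is not LevelDB's API and does not implement the paper's FLSM-Tree.

```python
TOMBSTONE = object()  # marker value standing in for a delete record


class TinyLSM:
    """Toy LSM-style store: a memtable plus a list of flushed runs, newest first."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.runs = []
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key):
        # A delete is just another write: it inserts a tombstone.
        self.put(key, TOMBSTONE)

    def flush(self):
        if self.memtable:
            self.runs.insert(0, dict(sorted(self.memtable.items())))
            self.memtable = {}

    def get(self, key):
        # Search newest data first; a tombstone hides older values.
        for table in [self.memtable, *self.runs]:
            if key in table:
                value = table[key]
                return None if value is TOMBSTONE else value
        return None

    def compact(self):
        # Merge all runs (oldest first so newer entries win), then drop tombstones.
        merged = {}
        for run in reversed(self.runs):
            merged.update(run)
        live = {k: v for k, v in sorted(merged.items()) if v is not TOMBSTONE}
        self.runs = [live] if live else []


store = TinyLSM(memtable_limit=2)
store.put("a", 1)
store.put("b", 2)   # triggers a flush
store.delete("a")
store.put("c", 3)   # triggers a second flush; the tombstone for "a" is now on "disk"
entries_before = sum(len(run) for run in store.runs)
store.compact()
entries_after = sum(len(run) for run in store.runs)
```

Until `compact()` runs, every lookup has to step over the tombstone and the stale value it shadows, which is the overhead the abstract attributes to batches of deletions.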

MR-Tree: A Mapping-based R-Tree for Efficient Spatial Searching (Mr-Tree: 효율적인 공간 검색을 위한 매핑 기반 R-Tree)

  • Kang, Hong-Koo;Shin, In-Su;Kim, Joung-Joon;Han, Ki-Joon
    • Spatial Information Research / v.18 no.4 / pp.109-120 / 2010
  • With the rapid increase in spatial data collected from various geosensors in u-GIS environments, spatial indexes for efficiently searching large spatial datasets are becoming increasingly important. In particular, R-Tree-based research on improving spatial search performance has been actively pursued. Previous studies focus on reducing the overlap between nodes or the height of the R-Tree. However, they cannot efficiently solve the problem of unnecessary node accesses that occur during tree traversal. In this paper, we propose the MR-Tree (Mapping-based R-Tree) to solve this problem and to support efficient search of large spatial datasets. The MR-Tree improves search performance by using a mapping tree that gives direct access to the leaf nodes of the R-Tree without tree traversal. The mapping tree is composed of the MBRs of, and pointers to, the R-Tree leaf nodes associated with each partition, where the partitions are made by repeatedly splitting the data area along its dimensions. Notably, the MR-Tree can easily be adopted in many R-Tree variants without modifying the R-Tree structure. In addition, because the mapping tree is constructed in main memory, search time can be greatly reduced. Finally, we demonstrate the superiority of the MR-Tree's performance through experiments.
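
The idea of mapping partitions of the data area to leaf-node MBRs and pointers can be sketched roughly as below. This is a simplified illustration, not the paper's structure: it assumes a uniform grid in place of repeated splitting along dimensions, uses leaf identifiers in place of pointers, and the `GridMapping` class and its method names are hypothetical.

```python
from collections import defaultdict


def intersects(a, b):
    """True if two axis-aligned rectangles (xmin, ymin, xmax, ymax) overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class GridMapping:
    """In-memory map from grid cells to the (leaf id, MBR) pairs that overlap them."""

    def __init__(self, extent, cells_per_side):
        self.extent = extent
        self.n = cells_per_side
        self.cells = defaultdict(list)

    def _cell_span(self, rect):
        xmin, ymin, xmax, ymax = self.extent
        width = (xmax - xmin) / self.n
        height = (ymax - ymin) / self.n

        def clamp(i):
            return max(0, min(self.n - 1, i))

        return (
            clamp(int((rect[0] - xmin) // width)),
            clamp(int((rect[2] - xmin) // width)),
            clamp(int((rect[1] - ymin) // height)),
            clamp(int((rect[3] - ymin) // height)),
        )

    def register_leaf(self, leaf_id, mbr):
        cx0, cx1, cy0, cy1 = self._cell_span(mbr)
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                self.cells[(cx, cy)].append((leaf_id, mbr))

    def candidate_leaves(self, query):
        """Return leaf ids whose MBR intersects the query, without any tree descent."""
        found = set()
        cx0, cx1, cy0, cy1 = self._cell_span(query)
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                for leaf_id, mbr in self.cells.get((cx, cy), ()):
                    if intersects(mbr, query):
                        found.add(leaf_id)
        return sorted(found)


mapping = GridMapping(extent=(0, 0, 100, 100), cells_per_side=4)
mapping.register_leaf("A", (0, 0, 20, 20))
mapping.register_leaf("B", (60, 60, 90, 90))
mapping.register_leaf("C", (20, 20, 30, 30))
```

A query rectangle touches only the cells it overlaps, so the lookup goes straight to candidate leaves instead of descending through internal R-Tree nodes.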

Efficient Spatial Index Structure for GIS and VLSI Design (GIS와 VLSI Design을 위한 효율적인 공간 색인구조)

  • Bang Kapsan
    • Proceedings of the Korea Information Processing Society Conference / 2004.11a / pp.129-132 / 2004
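  • A spatial index structure is a tool for managing spatial data efficiently and is an important factor in determining the performance of spatial databases such as GIS. In most applications, a spatial database must process a vast amount of spatial data stored in secondary storage, so reducing the number of disk accesses is important for improving overall database performance. This paper examines the applicability of a spatial index structure called the SMR-tree to several application areas by comparing it with existing index structures. The SMR-tree belongs to the R-tree family and has the same node format as existing R-tree-family structures, but it distributes data objects across multiple data spaces. This eliminates the redundant space present in the leaf nodes of the $R^{+}$-tree while disallowing overlap between index nodes, which is a weakness of the R-tree. The performance of the SMR-tree is compared with that of the R-tree, $R^{+}$-tree, and $R^{\ast}$-tree using several kinds of test data (VLSI layout data and TIGER/Line file data). Because the SMR-tree shows high space utilization and faster query performance than the other index structures, it is expected to serve as an efficient index structure for spatial databases such as GIS.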

SQMR-tree: An Efficient Hybrid Index Structure for Large Spatial Data (SQMR-tree: 대용량 공간 데이타를 위한 효율적인 하이브리드 인덱스 구조)

  • Shin, In-Su;Kim, Joung-Joon;Kang, Hong-Koo;Han, Ki-Joon
    • Spatial Information Research / v.19 no.4 / pp.45-54 / 2011
  • In this paper, we propose a hybrid index structure called the SQMR-tree (Spatial Quad MR-tree), which processes spatial data efficiently by combining the advantages of the MR-tree and the SQR-tree. The MR-tree is an extended R-tree that uses a mapping tree to access the leaf nodes of the R-tree directly. The SQR-tree combines the SQ-tree (Spatial Quad-tree), an extended Quad-tree that handles spatial objects with non-zero area, with R-trees that actually store the spatial objects and are associated with each leaf node of the SQ-tree. The SQMR-tree consists of the SQR-tree as the base structure and a mapping tree associated with each R-tree of the SQR-tree. As in the SQR-tree, spatial objects are distributed across several R-trees and only the R-trees that intersect the query area are accessed to process a spatial query, so the query processing cost of the SQMR-tree is reduced. Moreover, as in the MR-tree, search performance is improved by using the mapping trees to access the leaf nodes of each R-tree directly, without tree traversal. Finally, we demonstrate the superiority of the SQMR-tree through experiments.

Design and Performance Analysis of Signature-Based Hybrid Spill-Tree for Indexing High Dimensional Vector Data (고차원 벡터 데이터 색인을 위한 시그니쳐-기반 Hybrid Spill-Tree의 설계 및 성능평가)

  • Lee, Hyun-Jo;Hong, Seung-Tae;Na, So-Ra;Jang, You-Jin;Chang, Jae-Woo;Shim, Choon-Bo
    • Journal of Internet Computing and Services / v.10 no.6 / pp.173-189 / 2009
  • Video data has recently attracted considerable interest, so efficient indexing schemes are required to support content-based retrieval of video data. However, most indexing schemes, with the exception of the Hybrid Spill-Tree, are not suitable for indexing high-dimensional data. In this paper, we propose an efficient high-dimensional indexing scheme to support content-based retrieval of video data. To this end, we extend the Hybrid Spill-Tree with a newly designed clustering technique and a signature method. Finally, we show that the proposed signature-based high-dimensional indexing scheme achieves better retrieval performance than the existing M-Tree and Hybrid Spill-Tree.

An Index Data Structure for String Search in External Memory (외부 메모리에서 문자열을 효율적으로 탐색하기 위한 인덱스 자료 구조)

  • Na, Joong-Chae;Park, Kun-Soo
    • Journal of KIISE:Computer Systems and Theory / v.32 no.11_12 / pp.598-607 / 2005
  • We propose a new external-memory index data structure, the Suffix B-tree. Like the String B-tree, the Suffix B-tree is a B-tree whose keys are strings. Whereas a node in the String B-tree is implemented with a Patricia trie, a node in the Suffix B-tree is implemented with an array, so the Suffix B-tree is simpler and easier to implement than the String B-tree. Nevertheless, the branching algorithm of the Suffix B-tree is as efficient as that of the String B-tree. Consequently, the Suffix B-tree requires the same number of worst-case disk accesses as the String B-tree to solve the string matching problem, which is fundamental and important in the area of string algorithms.
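
Searching an array of suffix keys can be sketched with a plain binary search. The code below is an in-memory illustration of array-based lookup over sorted suffixes, assuming the whole text fits in memory. It is not the paper's branching algorithm and does not model B-tree nodes or disk accesses, and the function names are hypothetical.

```python
def build_sorted_suffixes(text):
    """Return the start positions of all suffixes of text, in lexicographic order."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffixes, pattern):
    """Return every start position of pattern in text via two binary searches."""
    m = len(pattern)

    def prefix(k):
        # First m characters of the k-th smallest suffix.
        start = suffixes[k]
        return text[start:start + m]

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(suffixes)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(suffixes)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffixes[first:lo])


text = "banana"
suffixes = build_sorted_suffixes(text)
```

Because the suffixes are kept in sorted order, all occurrences of a pattern form one contiguous range of the array, which is what makes an array a workable replacement for a trie inside a node.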

Current Status of Tree Height Estimation from Airborne LiDAR Data

  • Hwang, Se-Ran;Lee, Im-Pyeong
    • Korean Journal of Remote Sensing / v.27 no.3 / pp.389-401 / 2011
  • Most nations have expressed significant concern about climate change caused by the rapid increase in greenhouse gases and have therefore reached an international agreement to control the total amount of these gases in order to mitigate global warming. As the most important absorber of carbon dioxide, one of the major greenhouse gases, forest resources should be managed more tightly, with a means of measuring their total amount (forest biomass) efficiently and accurately. Forest biomass is closely related to forest area and tree height. Airborne LiDAR data help extract biophysical properties of forest resources, such as tree height, more efficiently by providing detailed spatial information about the ground surface over a wide area. Many researchers have therefore developed methods to estimate tree height from LiDAR data, and these methods differ in performance and characteristics depending on the forest environment and the characteristics of the data. In this study, we survey these techniques for estimating tree height, describe their advantages and limitations, and suggest future research directions. We first examine the characteristics of LiDAR data applied to forest studies and then analyze methods for filtering, a preliminary step for tree height estimation. We classify the tree height estimation methods into two categories, individual tree-based and regression-based, and describe the representative methods in each category with a summary of their analysis results. Finally, we review techniques for fusing LiDAR with other remote sensing data as a direction for future work.

DISCRIMINATING MAJOR SPECIES OF TREE IN COMPARTMENT FROM OPTIC IMAGERY AND LIDAR DATA

  • Hong, Sung-Hoo;Lee, Seung-Ho;Cho, Hyun-Kook
    • Proceedings of the KSRS Conference / 2008.10a / pp.41-44 / 2008
  • In this paper, the major tree species in a forest compartment were discriminated using LiDAR data and optical imagery, which is an important task in forestry. The current digital stock map is created from aerial photographs and collected survey data. Unlike high-resolution imagery, LiDAR data are not influenced by topographic effects because LiDAR is an active sensor system, and a LiDAR system can measure three-dimensional information about individual trees. The main methods of this study were the reliable extraction of individual trees and analysis techniques that use the LiDAR data to calculate 2D tree crown parameters. Calculating these parameters requires estimating the forest inventory. The 2D parameters require area, perimeter, diameter, height, crown shape, and so on. Finally, the major tree species were determined from the tree parameters and compared with a digital stock map.
