• Title/Abstract/Keywords: Data Tree

Text Document Categorization using FP-Tree (FP-Tree를 이용한 문서 분류 방법)

  • Park, Yong-Ki;Kim, Hwang-Soo
    • Journal of KIISE:Software and Applications / v.34 no.11 / pp.984-990 / 2007
  • As the number of electronic documents increases explosively, automatic text categorization methods are needed to identify those of interest. Most methods use machine learning techniques based on a word set. This paper introduces a new method, called FPTC (FP-Tree based Text Classifier). The FP-Tree is a data structure used in data mining. We present a method of storing text sentence patterns in the FP-Tree structure and classifying text using those patterns. In the experiments, we combine our algorithm with a Mutual Information and Entropy approach to improve performance. We also present an analysis of the algorithm via an ordinary differential categorization method.
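To make the FP-Tree idea above concrete, here is a minimal sketch of how word lists might be stored as shared prefix paths. It illustrates the generic FP-tree insertion step only, not the FPTC classifier itself; the names `FPNode`, `build_fp_tree`, and `paths` are hypothetical and do not come from the paper.

```python
from collections import Counter


class FPNode:
    """One node of an FP-tree: an item, its count, and child links."""

    def __init__(self, item=None, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support=1):
    """Build an FP-tree from a list of item lists (e.g. words per sentence)."""
    # Count each item once per transaction.
    freq = Counter(item for t in transactions for item in set(t))
    root = FPNode()
    for t in transactions:
        # Keep frequent items, ordered by descending frequency (ties by name),
        # so that transactions with common items share a prefix path.
        items = sorted(
            (i for i in set(t) if freq[i] >= min_support),
            key=lambda i: (-freq[i], i),
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root


def paths(node, prefix=()):
    """Yield (path, count) for every node below `node`."""
    for child in node.children.values():
        path = prefix + (child.item,)
        yield path, child.count
        yield from paths(child, path)
```

For example, `dict(paths(build_fp_tree([["tree", "data"], ["tree", "query"]])))` contains one shared `("tree",)` prefix with count 2 and two single-count branches below it.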

Analysis of GPU-based Parallel Shifted Sort Algorithm by comparing with General GPU-based Tree Traversal (일반적인 GPU 트리 탐색과의 비교실험을 통한 GPU 기반 병렬 Shifted Sort 알고리즘 분석)

  • Kim, Heesu;Park, Taejung
    • Journal of Digital Contents Society / v.18 no.6 / pp.1151-1156 / 2017
  • Traversing tree data structures on a GPU often yields lower performance than expected. In this paper, we analyze the reason for this lower-than-expected performance in GPU tree traversal and show that warp divergence is caused by the branch instructions ("if $\ldots$ else") that commonly appear in CUDA tree traversal code. We also compare the parallel shifted sort algorithm, which can reduce the number of warp divergences, with a kd-tree CUDA implementation and show that the shifted sort algorithm can run faster than the kd-tree CUDA implementation thanks to fewer warp divergences. In our analysis, the shifted sort algorithm ran about 16 times faster than the kd-tree CUDA implementation for $2^{23}$ query points and $2^{23}$ data points in $R^3$ space. The performance gap tends to increase in proportion to the number of query points and data points.

Automated Individual Tree Detection and Crown Delineation Using High Spatial Resolution RGB Aerial Imagery

  • Park, Tae-Jin;Lee, Jong-Yeol;Lee, Woo-Kyun;Kwak, Doo-Ahn;Kwak, Han-Bin;Lee, Sang-Chul
    • Korean Journal of Remote Sensing / v.27 no.6 / pp.703-715 / 2011
  • Forests are considered one of the most important ecosystems on Earth, affecting life and the environment. Sustainable forest management requires accurate and timely information on forest and tree parameters. Appropriately interpreted remotely sensed imagery can provide quantitative data for deriving forest information temporally and spatially. In particular, individual tree detection and crown delineation is a significant issue, because individual trees are the basic units of forest management. Individual trees in aerial imagery have reflectance characteristics that depend on tree species, crown shape, and hierarchical status. This study proposes a method that identifies individual trees and delineates crown boundaries by applying a gradient method algorithm to amplified greenness data derived from the red and green bands of aerial imagery. Amplifying the specific band values improved the likelihood of detecting individual trees, and the gradient method algorithm was applied to identify individual tree tops. Additionally, tree crown boundaries were explored using the spectral intensity pattern created by the geometric characteristics of the tree crown shape. Finally, the accuracy of the results was evaluated by comparison with reference data on individual tree location, number, and crown boundary acquired by visual interpretation. The accuracy ($\hat{K}$) of the proposed method in identifying individual trees was 0.89, and the adequate window size for delineating crown boundaries was $19{\times}19$ (maximum crown size: 9.4 m), with an accuracy ($\hat{K}$) of 0.80.

Personalized Recommendation System using FP-tree Mining based on RFM (RFM기반 FP-tree 마이닝을 이용한 개인화 추천시스템)

  • Cho, Young-Sung;Ho, Ryu-Keun
    • Journal of the Korea Society of Computer and Information / v.17 no.2 / pp.197-206 / 2012
  • Existing recommendation systems using association rules suffer from problems such as slow processing caused by frequent scanning of large data, as well as limited scalability and accuracy. In this paper, using an implicit method that does not rely on user profiles for rating, we propose a personalized recommendation system based on a new method, FP-tree mining based on RFM. RFM analysis and FP-tree mining need to be maintained so as to reflect the attributes of customers and items based on the whole customer data and purchase data, in order to find items with high purchasability. The proposed system finds frequent items and creates association rules using RFM-based FP-tree mining without generating candidate sets. Items can be recommended efficiently, with recommendable items generated according to basic thresholds on the support, confidence, and lift of association rules. To estimate its performance, the proposed system is compared with an existing system. In an experiment with a dataset collected from an internet cosmetics shopping mall, the proposed system shows improvement when evaluated according to the criteria of logicality.
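The support, confidence, and lift thresholds mentioned above follow the standard association-rule definitions, which can be sketched as below. This is an illustration under assumptions: the function name `rule_metrics` and the sample baskets are hypothetical, and the paper's RFM weighting and FP-tree mining are not reproduced.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) of the rule antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    # Number of transactions containing the antecedent, the consequent, and both.
    n_a = sum(1 for t in transactions if a <= set(t))
    n_c = sum(1 for t in transactions if c <= set(t))
    n_ac = sum(1 for t in transactions if (a | c) <= set(t))
    if n == 0 or n_a == 0 or n_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    support = n_ac / n              # P(A and C)
    confidence = n_ac / n_a         # P(C | A)
    lift = confidence / (n_c / n)   # P(C | A) / P(C)
    return support, confidence, lift


# Hypothetical purchase baskets, for illustration only.
baskets = [
    {"toner", "lotion"},
    {"toner", "lotion", "cream"},
    {"toner"},
    {"cream"},
]
support, confidence, lift = rule_metrics(baskets, {"toner"}, {"lotion"})
```

A rule would then be kept as a recommendation candidate only if all three values clear their chosen thresholds; a lift above 1 indicates that the consequent is bought more often with the antecedent than on its own.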

Study on Developing Program for Efficient Landscape Woody Plants Management - Mainly Focused on the Development of a Tree Inventory System - (조경수목의 효율적 관리를 위한 프로그램 개발에 관한 연구 - 관리대장(Tree Inventory) 개발을 중심으로 -)

  • 조영환;곽행구
    • Journal of the Korean Institute of Landscape Architecture / v.24 no.4 / pp.1-22 / 1997
  • This paper focuses on the efficient management of landscape woody plants and their important role in the urban environment. Based on the philosophy that nothing can be done without an inventory, the purpose of this study was to develop an inventory system and apply it properly to a site in order to establish a management plan. Two approaches were used. The first was to create a newly structured inventory system by collecting, analyzing, and evaluating various types of inventories used in Korea, the U.S.A., and Japan. The second was to apply the newly designed inventory system to the case study area, using GIS as a tool of spatial analysis and statistics for decision making. The results can be summarized as follows. 1. In Korea, most landscape woody plant inventories contained data representing only the possession of trees and the work done on them according to traditional practice. There was no data on the condition, management needs, or site conditions of individual trees, which is essential information for organizing an inventory system. 2. The data needs to be balanced, containing both tree characteristics and site characteristics, so that management needs can be adjusted properly. The inventory list described in this paper was determined by botanical identity, placement condition, condition of the tree, and types of work for maintaining and improving the condition of each tree. One of the most important things was to determine the location data of each tree so that its data could be compared with that of other trees. The data gained from the field survey still had some problems, because of the lack of a scientific method for supporting objective views and because of actual field conditions, especially in evaluating site conditions and management needs. All data should be revised to fit a computer data management system, if possible. 3. The GIS (Geographic Information System) application showed good performance in handling inventory data for decision making. All the data used for the GIS application was divided into location and non-spatial data. Using the location data, it was easy to find the exact location of each tree, along with various attribute data, on the monitor and on computer-generated maps, even for the actual managed site. Therefore, the entire management plan should start from data on individual trees with their exact locations, in order to set concrete management goals through actual budget planning.

Development of a model to analyze the relationship between smart pig-farm environmental data and daily weight increase based on decision tree (의사결정트리를 이용한 돈사 환경데이터와 일당증체 간의 연관성 분석 모델 개발)

  • Han, KangHwi;Lee, Woongsup;Sung, Kil-Young
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.12 / pp.2348-2354 / 2016
  • Recently, IoT (Internet of Things) technology has been widely used in agriculture, enabling the collection of environmental and biometric data into databases. The availability of big data on agriculture has led to an increase in machine-learning-based analysis. Such analysis makes it possible to forecast agricultural production and livestock diseases, thus supporting efficient decision making in smart farm management. Herein, we use the environmental and biometric data of a smart pig farm to derive an accurate model of the relationship between environmental information and the daily weight increase of swine, and we verify the accuracy of the derived model. To this end, we applied the M5P tree algorithm from machine learning, which reveals that wind speed is the major factor affecting the daily weight increase of swine.

Design of the Node Decision Scheme for Processing Queries on Sensor Network Environments (센서 네트워크 환경에서 질의 처리를 위한 노드 선정 기법의 설계)

  • Kim, Dong Hyun
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.10 / pp.2224-2229 / 2012
  • Since sensor data are inserted into a data set continuously, continuous queries must be evaluated to search the data. To process continuous queries, it is necessary to build a query index on each sensor node and to transmit the result data that match the query predicates. However, if query predicates are transferred to all sensor nodes, a massive number of messages is required. In this paper, we propose a node decision scheme that uses a sensor node decision tree to reduce the number of messages. Each entry of a leaf node in the node decision tree represents a sensor node and defines the data region of that sensor node. When a user query is issued, sensor nodes are selected by intersecting the data regions in the tree with the query predicates of the user query, and the query predicates are then transmitted to the selected sensor nodes. We also implement the proposed sensor node decision tree and evaluate it experimentally.
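The selection step described above, intersecting each sensor node's data region with the query predicates, can be sketched as follows. This is a simplified illustration: it scans a flat dictionary of regions rather than descending a decision tree as the paper proposes, and the names `overlaps` and `select_nodes` and the data layout are assumptions.

```python
def overlaps(a, b):
    """True if closed intervals a=(lo, hi) and b=(lo, hi) share a point."""
    return a[0] <= b[1] and b[0] <= a[1]


def select_nodes(regions, query):
    """Return ids of sensor nodes whose data region intersects the query.

    regions: {node_id: {attribute: (lo, hi)}}   (hypothetical layout)
    query:   {attribute: (lo, hi)}              range predicates
    """
    selected = []
    for node_id, region in regions.items():
        # A node qualifies only if every predicate overlaps its region.
        if all(attr in region and overlaps(region[attr], rng)
               for attr, rng in query.items()):
            selected.append(node_id)
    return selected


# Hypothetical temperature ranges observed by three sensor nodes.
regions = {
    "s1": {"temp": (10, 20)},
    "s2": {"temp": (18, 30)},
    "s3": {"temp": (31, 40)},
}
targets = select_nodes(regions, {"temp": (19, 25)})
```

Only the nodes in `targets` would receive the query predicates, which is where the message savings come from; a tree over the regions would let whole groups of nodes be skipped without testing each one.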

Efficient Multicast Routing on BCube-Based Data Centers

  • Xie, Junjie;Guo, Deke;Xu, Jia;Luo, Lailong;Teng, Xiaoqiang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.12 / pp.4343-4355 / 2014
  • Multicast group communication has many advantages in data centers and is thus widely used by many applications. It can efficiently reduce network traffic and improve application throughput. For multicast applications in data centers, an essential problem is how to find a minimal multicast tree, which has been proved to be NP-hard. In this paper, we propose an approximate tree-building method for the minimal multicast problem, named the HD (Hamming Distance)-based multicast tree. Considering that many new network structures have been proposed for data centers, we choose three representative ones, BCube, FBFLY, and HyperX, whose topological structures can be regarded as generalized hypercubes. Given a multicast group in BCube, the HD-based method can jointly schedule the path from each receiver to the single sender among multiple disjoint paths; hence, it can quickly construct an efficient multicast tree at low cost. The experimental results demonstrate that our method takes less time to construct an efficient multicast tree while considerably reducing the cost of the multicast tree compared to representative methods. Our approach for BCube can also be adapted to other generalized hypercube network structures for data centers with minimal modifications.
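The Hamming distance that gives the method its name can be sketched as below, assuming each server is labeled by a fixed-length sequence of digits, as is usual for generalized hypercube topologies. The function name `hamming_distance` and the sample labels are hypothetical; the tree-building heuristic itself is not reproduced here.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length labels differ."""
    if len(a) != len(b):
        raise ValueError("labels must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical three-digit server labels.
sender = (0, 1, 2)
receivers = [(0, 3, 2), (1, 3, 2), (0, 1, 2)]
distances = [hamming_distance(sender, r) for r in receivers]
```

Under this labeling, two servers whose labels differ in exactly one digit are direct neighbors, so the distance gives a lower bound on the number of hops between a receiver and the sender.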

PdR-Tree : An Efficient Indexing Technique for the improvement of search performance in High-Dimensional Data (PdR-트리 : 고차원 데이터의 검색 성능 향상을 위한 효율적인 인덱스 기법)

  • Joh, Beom-Seok;Park, Young-Bae
    • The KIPS Transactions:PartD / v.8D no.2 / pp.145-153 / 2001
  • The Pyramid-Technique maps n-dimensional data to one-dimensional data and indexes it with a B-tree; by solving the problem of search time complexity, it also avoids the "curse of dimensionality" that arises when processing hypercube range queries in n-dimensional data space. The Spherical Pyramid-Technique applies the Pyramid-Technique's space division strategy, uses spherical range queries, and improves search performance to make it suitable for similarity search. However, depending on data size and dimensionality, both techniques show significantly inferior search performance for data sizes greater than one million and dimensions greater than sixteen. In this paper, we propose a new index structure, the PdR-Tree, to improve search performance for high-dimensional data such as multimedia data. Test results using simulated as well as real data demonstrate that the PdR-Tree surpasses both the Pyramid-Technique and the Spherical Pyramid-Technique in search performance.
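The mapping from n-dimensional points to one-dimensional keys can be sketched as follows, assuming the commonly described formulation of the Pyramid-Technique: data normalized to the unit hypercube, which is split into 2d pyramids that meet at the center. The function name `pyramid_value` is hypothetical, and the PdR-Tree's own mapping is not shown.

```python
def pyramid_value(point):
    """Map a point in [0, 1]^d to a one-dimensional pyramid value."""
    d = len(point)
    # Dimension in which the point is farthest from the center 0.5.
    j_max = max(range(d), key=lambda j: abs(0.5 - point[j]))
    # Pyramids 0..d-1 lie below the center in dimension j_max, d..2d-1 above.
    pyramid = j_max if point[j_max] < 0.5 else j_max + d
    # Height of the point inside its pyramid, in [0, 0.5].
    height = abs(0.5 - point[j_max])
    return pyramid + height


key_low = pyramid_value((0.1, 0.6))   # pyramid 0, height 0.4
key_high = pyramid_value((0.5, 0.9))  # pyramid 3, height 0.4
```

The resulting scalar keys can be stored in any ordered one-dimensional index; a range query then becomes a set of key intervals, one per pyramid it touches.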

Bit-Vector-Based Space Partitioning Indexing Scheme for Improving Node Utilization and Information Retrieval (노드 이용률과 검색 속도 개선을 위한 비트 벡터 기반 공간 분할 색인 기법)

  • Yeo, Myung-Ho;Seong, Dong-Ook;Yoo, Jae-Soo
    • Journal of KIISE:Computing Practices and Letters / v.16 no.7 / pp.799-803 / 2010
  • The KDB-tree is a traditional indexing scheme for retrieving multidimensional data. Much research on the KDB-tree family identifies low storage utilization and insufficient retrieval performance as its two bottlenecks. These bottlenecks arise from a number of unnecessary splits caused by data insertion order and data skew. In this paper, we propose a novel index structure, called the $KDB_{CS}^+$-tree, to process skewed data efficiently and improve retrieval performance. The $KDB_{CS}^+$-tree increases fan-out by using bit-vectors to represent splitting information and by eliminating pointers. It also improves storage utilization by representing entries as a hierarchical structure in each internal node.