• Title/Summary/Keyword: Tree data

Streaming Decision Tree for Continuity Data with Changed Pattern (패턴의 변화를 가지는 연속성 데이터를 위한 스트리밍 의사결정나무)

  • Yoon, Tae-Bok;Sim, Hak-Joon;Lee, Jee-Hyong;Choi, Young-Mee
    • Journal of the Korean Institute of Intelligent Systems / v.20 no.1 / pp.94-100 / 2010
  • Data mining is mainly used for pattern extraction and information discovery from collected data. However, previous methods have difficulty reflecting patterns that change over time. In this paper, we introduce the Streaming Decision Tree (SDT), which analyzes large-scale continuous data with changing patterns. SDT defines continuous data as blocks and extracts rules using a decision tree learning method. The extracted rules are combined considering time of occurrence, frequency, and contradiction. In experiments, we applied SDT to time series data and confirmed reasonable results.

IRFP-tree: Intersection Rule Based FP-tree (IRFP-tree(Intersection Rule Based FP-tree): 메모리 효율성을 향상시키기 위해 교집합 규칙 기반의 패러다임을 적용한 FP-tree)

  • Lee, Jung-Hun
    • KIPS Transactions on Software and Data Engineering / v.5 no.3 / pp.155-164 / 2016
  • For frequent pattern analysis of large databases, tree-based frequent pattern mining algorithms that compensate for the disadvantages of the Apriori method have been widely studied. In a frequent pattern tree, the number of nodes determines memory allocation and also affects memory consumption and the processing speed of tree growth. Reducing the number of nodes in the tree is therefore very important in frequent pattern mining. However, the absolute criteria used to order transaction items when constructing a frequent pattern tree lower the compression ratio of the tree nodes, and most frequency-based tree construction methods nevertheless adopt such criteria. The FP-tree is a typical frequent pattern tree, an extended prefix-tree structure that stores compressed, crucial information about frequent patterns. To construct the tree, all frequent items in each transaction are sorted according to an absolute criterion, frequency-descending order. CanTree also requires an absolute criterion, canonical order, to construct the tree. In this paper, we propose the IRFP-tree (Intersection Rule Based FP-tree) algorithm, a novel frequent pattern tree construction method that does not use absolute criteria. The IRFP-tree is built with a new paradigm, the intersection rule, instead. It increases the compression ratio of the tree nodes and reduces the tree construction time. Our method has the additional advantage that it supports incremental mining. The reported test results demonstrate the applicability and effectiveness of the proposed approach.
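For readers unfamiliar with the baseline that the abstract contrasts against, the following is a minimal sketch of a plain FP-tree built with frequency-descending item order. It is not the IRFP-tree, whose intersection rule the abstract does not specify. The names `FPNode`, `build_fp_tree`, and `count_nodes`, the sample transactions, and the tie-breaking by item name are all assumptions made for illustration.

```python
from collections import Counter


class FPNode:
    """One node of the prefix tree: an item, its count, and child nodes."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build a plain FP-tree using frequency-descending item order."""
    # First pass: count in how many transactions each item occurs.
    freq = Counter(item for t in transactions for item in set(t))
    frequent = {i for i, c in freq.items() if c >= min_support}

    root = FPNode()
    # Second pass: insert each transaction as a path of sorted frequent items.
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-freq[i], i))  # ties broken by item name (assumption)
        node = root
        for item in items:
            node = node.children.setdefault(item, FPNode(item))
            node.count += 1
    return root


def count_nodes(node):
    """Number of item nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


# Hypothetical sample data, not taken from the paper.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c"],
    ["a", "b", "d"],
]
tree = build_fp_tree(transactions, min_support=2)
total_items = sum(len(t) for t in transactions)
print(count_nodes(tree), "nodes for", total_items, "item occurrences")
```

Shared prefixes are what compress the tree: transactions that begin with the same items in the global order reuse the same nodes, so the node count stays below the number of item occurrences. The abstract's argument is that a fixed global order limits how much sharing is possible.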

Iceberg Query Evaluation Technical Using a Cuboid Prefix Tree (큐보이드 전위트리를 이용한 빙산질의 처리)

  • Han, Sang-Gil;Yang, Woo-Sock;Lee, Won-Suk
    • Journal of KIISE:Databases / v.36 no.3 / pp.226-234 / 2009
  • A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. Due to these characteristics, it is impossible to save all the data elements of a data stream. It is therefore necessary to define a new synopsis structure to store the summary information of a data stream. For this purpose, this paper proposes a cuboid prefix tree that can be effectively employed in evaluating an iceberg query over data streams. A cuboid prefix tree stores only those itemsets that consist of the grouping attributes used in a GROUP BY query. In addition, a cuboid prefix tree can compute multiple iceberg queries simultaneously by sharing their common sub-expressions. A cuboid prefix tree evaluates an iceberg query over an infinitely generated data stream while efficiently reducing memory usage and processing time, which is verified by a series of experiments.
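An iceberg query returns only those GROUP BY groups whose aggregate exceeds a threshold. The sketch below illustrates the general idea of counting grouping-attribute combinations in a prefix tree, so that one structure can answer queries over any prefix of the attribute list. It is an assumption-laden illustration, not the paper's cuboid prefix tree: the class names, the attribute names, and the sample stream are hypothetical, and it keeps exact counts with no pruning, so it does not bound memory the way a real stream synopsis must.

```python
class TrieNode:
    """A prefix-tree node holding the count of records that share its path."""

    def __init__(self):
        self.count = 0
        self.children = {}


class GroupPrefixTree:
    """Counts value combinations of an ordered list of grouping attributes."""

    def __init__(self, attributes):
        self.attributes = list(attributes)
        self.root = TrieNode()

    def insert(self, record):
        """Add one stream element (a dict); only grouping attributes are kept."""
        node = self.root
        for attr in self.attributes:
            node = node.children.setdefault(record[attr], TrieNode())
            node.count += 1

    def iceberg(self, depth, threshold):
        """Groups over the first `depth` attributes whose count >= threshold."""
        results = {}

        def walk(node, prefix):
            if len(prefix) == depth:
                if node.count >= threshold:
                    results[prefix] = node.count
                return
            for value, child in node.children.items():
                walk(child, prefix + (value,))

        walk(self.root, ())
        return results


# Hypothetical stream elements; "price" is not a grouping attribute and is ignored.
stream = [
    {"region": "east", "product": "pen", "price": 3},
    {"region": "east", "product": "pen", "price": 4},
    {"region": "east", "product": "ink", "price": 9},
    {"region": "west", "product": "pen", "price": 3},
]
tree = GroupPrefixTree(["region", "product"])
for record in stream:
    tree.insert(record)

by_region = tree.iceberg(depth=1, threshold=2)
by_region_product = tree.iceberg(depth=2, threshold=2)
print(by_region, by_region_product)
```

Here the query grouped by `region` and the query grouped by `region, product` read from the same paths, which is one simple way two queries can share a common sub-expression.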

Computerization for Management of Street Tree Using CAD (CAD를 이용한 가로수 관리 전산화에 관한 연구)

  • 허상현;심경구
    • Journal of the Korean Institute of Landscape Architecture / v.29 no.2 / pp.68-76 / 2001
  • The purpose of this study is to computerize street tree management using a CAD program in order to manage the drawing records of street trees systematically and concurrently. The program is composed of Reference Data, Data Inquiry, and Cost Assessment. The Reference Data includes characteristics of trees, monthly management records, damage by blight and insects, and usage of pesticides. The Data Inquiry includes an individual search of the tree index, simple searches, and multiple searches. The Cost Assessment includes two main components: data input of labor cost, manure cost, and pesticide cost, and assessment of the management cost for prevention of blight and insects, pruning, and fertilization. The results of this study are as follows: 1) When street trees are transplanted or removed, the records are updated immediately to reflect the change. By keeping the tree management system current, up-to-date information can be given to the manager for decision making. 2) To identify an individual tree at the site or in a drawing, street names and numbers were used instead of coordinates, and tree tags are attached to the street trees individually. This makes DB management simple and easy. 3) Simple or multiple searches of the constructed DB provide data quickly. 4) The results of such searches are useful for assessing the management cost of items such as pruning, pesticide spraying, and fertilization. 5) By using the AutoCAD software and existing PCs without purchasing new equipment, the cost of system implementation can be minimized.

Index method of using Rend 3DR-tree for Location-Based Service (위치 기반 서비스를 위한 Rend 3DR-tree를 이용한 색인 기법)

  • Nam, Ji-Yeun;Rim, Kee-Wook;Lee, Jeong-Bae;Lee, Jong-Woock;Shin, Hyun-Cheol
    • Convergence Security Journal / v.8 no.4 / pp.97-104 / 2008
  • Recently, wireless positioning and mobile computing techniques have developed rapidly, making it possible to use the location data of moving objects. The more moving objects there are and the more frequently their locations are sampled, the larger the volume of location data becomes. Hence the system should be able to efficiently manage massive location data, support various spatio-temporal queries for LBS, and solve the uncertainty problem of moving objects. In this paper, to manage the location data of moving objects effectively, we propose the Rend 3DR-tree method, which decreases dead space and compensates for the overlapping of nodes by using the 3DR-tree as an indexing structure that supports indexing of both current and historical data.

An Index Structure for Efficiently Handling Dynamic User Preferences and Multidimensional Data (다차원 데이터 및 동적 이용자 선호도를 위한 색인 구조의 연구)

  • Choi, Jong-Hyeok;Yoo, Kwan-Hee;Nasridinov, Aziz
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.7 / pp.925-934 / 2017
  • The R-tree is an index structure frequently used for handling spatial data. However, if the number of dimensions increases, or if only some of the dimensions are used for searching according to user preference, the indexing time increases greatly and the efficiency of the generated R-tree drops sharply. Hence, it is not suitable for multidimensional data whose dimensions are continuously increasing. In this paper, we propose a multidimensional hash index, a new multidimensional index structure based on a hash index. The multidimensional hash index classifies data into buckets of Euclidean space through a hash function and then, when an actual search is requested, generates a hash search tree for effective searching. The generated hash search tree is able to handle user preferences in the selected dimensional space. Experimental results show that the proposed method has better indexing performance than the R-tree while maintaining similar search performance.
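To make the bucket idea concrete, here is a minimal sketch of hashing points into grid cells and searching on only a subset of dimensions. The abstract does not define the paper's hash function or its hash search tree, so everything here is an assumption for illustration: the class `GridHashIndex`, the floor-based cell hash, the `cell_size` parameter, and the sample points. The sketch scans the buckets linearly at query time instead of building a search tree.

```python
from collections import defaultdict
from math import floor


class GridHashIndex:
    """Hashes each point to the grid cell that contains it."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.buckets = defaultdict(list)

    def _cell(self, point):
        # Assumed hash function: integer cell coordinate per dimension.
        return tuple(floor(x / self.cell_size) for x in point)

    def insert(self, point):
        self.buckets[self._cell(point)].append(tuple(point))

    def range_search(self, ranges):
        """`ranges` maps a dimension index to an inclusive (low, high) pair.

        Dimensions absent from `ranges` are unconstrained, which models a
        user preference that selects only some dimensions.
        """
        hits = []
        for cell, points in self.buckets.items():
            # Skip buckets whose cell cannot intersect the query on a selected dimension.
            if any(not (floor(lo / self.cell_size) <= cell[d] <= floor(hi / self.cell_size))
                   for d, (lo, hi) in ranges.items()):
                continue
            # Check the remaining candidates point by point.
            hits.extend(p for p in points
                        if all(lo <= p[d] <= hi for d, (lo, hi) in ranges.items()))
        return sorted(hits)


# Hypothetical three-dimensional points.
index = GridHashIndex(cell_size=10.0)
for p in [(1, 2, 3), (15, 2, 40), (12, 50, 7), (95, 5, 5)]:
    index.insert(p)

only_dim0 = index.range_search({0: (10, 20)})
dims_0_and_1 = index.range_search({0: (10, 20), 1: (0, 10)})
print(only_dim0, dims_0_and_1)
```

Insertion costs one hash computation per point regardless of how many dimensions a later query selects, which is the property that makes indexing cheaper than maintaining a balanced spatial tree.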

Decision Tree Classifier for Multiple Abstraction Levels of Data (다중 추상화 수준의 데이터를 위한 결정 트리 분류기)

  • Jeong, Min-A;Lee, Do-Heon
    • The KIPS Transactions:PartD / v.10D no.1 / pp.23-32 / 2003
  • Since data are collected from disparate sources in many actual data mining environments, it is common to have data values at different abstraction levels. This paper shows that such multiple abstraction levels of data can cause undesirable effects in decision tree classification. After explaining that equalizing abstraction levels by force cannot provide a satisfactory solution to this problem, it presents a method to utilize the data as it is. The proposed method accommodates the generalization/specialization relationship between data values in both the construction and the class assignment phases of decision tree classification. The experimental results show that the proposed method reduces classification error rates significantly when multiple abstraction levels of data are involved.

Research on improving correctness of cardiac disorder data classifier by applying Best-First decision tree method (Best-First decision tree 기법을 적용한 심전도 데이터 분류기의 정확도 향상에 관한 연구)

  • Lee, Hyun-Ju;Shin, Dong-Kyoo;Park, Hee-Won;Kim, Soo-Han;Shin, Dong-Il
    • Journal of Internet Computing and Services / v.12 no.6 / pp.63-71 / 2011
  • Cardiac disorder data are generally tested using a classifier, and the QRS complex and R-R interval used in this experiment are commonly extracted from ECG (Electrocardiogram) signals. Experiments on ECG data are generally performed with SVM (Support Vector Machine) and MLP (Multilayer Perceptron) classifiers, but to improve accuracy this study experimented with the Best-First Decision Tree (B-F Tree), derived from the Decision Tree, among Random Forest classifier algorithms. To compare and analyze accuracy, experiments with SVM, MLP, RBF (Radial Basis Function) Network, and Decision Tree classifiers were also performed, and the results were compared with those of published papers obtained under the same interval and data. Compared with the above four classifiers, the Random Forest classifier achieved the best accuracy. Although the R-R interval was extracted using a band-pass filter in the pre-processing of this experiment, further study of filters is needed to extract the interval accurately.

Clustering based on Dependence Tree in Massive Data Streams

  • Yun, Hong-Won
    • Journal of information and communication convergence engineering / v.6 no.2 / pp.182-186 / 2008
  • RFID systems generate huge amounts of data quickly. The data are associated with locations, timestamps, and containment relationships. Efficient queries and updates must be ensured for product tracking and monitoring. We propose a clustering technique for fast query processing. Our study presents state charts of temporal event flow, proposes dependence trees with data association, and uses them to cluster linked events. Our experimental evaluation shows the power of the proposed clustering technique based on the dependence tree.

Selection of Tree History Management System Items for Analyzing the Causes of Landscape Tree Defects in an Apartment Complex

  • Park, Sang Wook
    • Journal of People, Plants, and Environment / v.23 no.3 / pp.347-362 / 2020
  • Background and objective: It is difficult to conclusively determine the exact cause of tree defects since multiple causes are involved, such as climate change, plantation, tree quality and planting time, construction, planting base, drainage, sunshine conditions, maintenance, and microclimate. The data related to landscaping construction defects are scattered or fragmented by company and year and are not managed systematically by a defect information management system. Most earlier studies of tree defects in apartment complexes reported defect rates after examining tree defects at completed construction sites and proposed fragmentary and subjective conclusions about the causes of defects observed in trees with high defect rates. It has been proposed that studies continue on the establishment and analysis of systematic databases to identify the exact causes of tree defects and measures for improvement, and that systematic data be accumulated during the construction process, where many defects arise. This study was conducted to reduce the defects of trees planted in apartment complexes. Methods: Main factors related to tree defects were subdivided based on the results of a literature review and a defect investigation at the completion site, and tree history management items were selected and subdivided during the construction stage. Results: The criteria for the preparation of subdivided items were obtained, the tree history management checklist was written for a site under actual construction, and a systematic database was established. Items categorized according to the causes of defects include the location of nurseries, date, tree quality, site conditions, planting techniques, microclimates, and maintenance. Conclusion: This study suggested tree history management items based on the tree defects that can be identified at the construction stage and applied them to the selected study site, which differentiates this study from earlier studies. It will be necessary to conduct a comprehensive and objective time series analysis of tree defects that occur over time by continuously monitoring and collecting data after construction.