• Title/Summary/Keyword: Tree data


Construction of Energy-Efficient Data Aggregation Tree in Wireless Sensor Networks (무선 센서 네트워크에서 에너지 효율적인 데이터 병합 트리의 생성 방법)

  • Choi, Hyun-Ho
    • The Journal of Korean Institute of Communications and Information Sciences, v.41 no.9, pp.1057-1059, 2016
  • This paper proposes a method for constructing an energy-efficient data aggregation tree that accounts for the tradeoff between acquisition time and energy consumption in wireless sensor networks. The method builds the tree to minimize the link cost between connected nodes, which reduces energy consumption, while also minimizing the maximum distance between the sensor nodes and the sink node, which speeds up information gathering. Simulation results show that the proposed aggregation tree can be generated with low complexity and achieves high energy efficiency compared to conventional methods.

Industrial Waste Database Analysis Using Data Mining

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • 한국데이터정보과학회:학술대회논문집 (Korean Data and Information Science Society: Conference Proceedings), 2006.04a, pp.241-251, 2006
  • Data mining is a method for finding useful information in the large amounts of data held in a database. It is used to uncover hidden knowledge, unexpected patterns, relations, and new rules in massive data. Data mining methods include decision trees, association rules, clustering, neural networks, and so on. We analyze an industrial waste database using data mining techniques: the k-means algorithm for clustering, the C5.0 algorithm for decision trees, and the Apriori algorithm for association rules. The results of these analyses can be used for environmental preservation and improvement.


Query Optimization on Large Scale Nested Data with Service Tree and Frequent Trajectory

  • Wang, Li;Wang, Guodong
    • Journal of Information Processing Systems, v.17 no.1, pp.37-50, 2021
  • Query applications based on nested data, the most commonly used form of data representation on the web, are becoming more widely used, especially for precise queries. MapReduce, a distributed architecture with parallel computing power, provides a good solution for big data processing. In practice, however, query requests are usually concurrent, which causes bottlenecks in server processing. To solve this problem, this paper first combines a column storage structure and an inverted index to build an index for nested data on MapReduce. On this basis, the paper puts forward an optimization strategy that combines a query execution service tree with frequent sub-query trajectories to reduce the response time of frequent queries and further improve the efficiency of multi-user concurrent queries on large-scale nested data. Experiments show that this method greatly improves the efficiency of nested data queries.

An Efficient Disk Block Allocation Method for XML Data (XML 데이타를 위한 효율적인 디스크 블록 할당 방법)

  • Kim, Jung-Hoon;Son, Jin-Hyun;Chung, Yon-Dohn;Kim, Myoung-Ho
    • Journal of KIISE:Databases, v.34 no.5, pp.465-472, 2007
  • With the recent proliferation of semi-structured data such as XML, it has become more important to store and manage such data efficiently. XML data can be logically modelled as a rooted tree, e.g., the DOM tree. To process a query on the XML data, we traverse the tree structure. In this paper we present an algorithm that places the XML data on disk blocks. The proposed algorithm assigns a number to each node of the tree in a bottom-up fashion. The nodes are then allocated to disk blocks using the assigned numbers. The proposed algorithm does not need access pattern information and provides good performance for any access pattern. The characteristics of the proposed method are presented with analysis, and its performance is evaluated through experiments.
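
The two steps described above (bottom-up numbering, then allocation by number) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the actual numbering rule, so the sketch substitutes a plain post-order numbering and packs consecutive numbers into fixed-capacity blocks. The names `Node`, `number_bottom_up`, `allocate_blocks`, and `nodes_per_block` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)


def number_bottom_up(root):
    """Assign 0, 1, 2, ... in post-order, so each child is numbered before its parent.

    Assumption: post-order stands in for the paper's numbering scheme,
    which the abstract does not specify.
    """
    numbers = {}
    stack = [(root, False)]
    while stack:
        node, children_done = stack.pop()
        if children_done:
            numbers[node.name] = len(numbers)
        else:
            stack.append((node, True))
            for child in reversed(node.children):
                stack.append((child, False))
    return numbers


def allocate_blocks(numbers, nodes_per_block):
    """Place nodes with consecutive numbers into the same fixed-capacity block."""
    blocks = {}
    for name, number in numbers.items():
        blocks.setdefault(number // nodes_per_block, []).append(name)
    return blocks


# A tiny DOM-like tree: book -> (chapter1 -> title1, para1), (chapter2 -> title2)
tree = Node("book", [
    Node("chapter1", [Node("title1"), Node("para1")]),
    Node("chapter2", [Node("title2")]),
])
numbers = number_bottom_up(tree)
blocks = allocate_blocks(numbers, nodes_per_block=2)
print(numbers)
print(blocks)
```

With two nodes per block, sibling leaves such as `title1` and `para1` end up in the same block, which hints at why a bottom-up order can keep related nodes close together on disk.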

Prediction Model for the Risk of Scapular Winging in Young Women Based on the Decision Tree

  • Gwak, Gyeong-tae;Ahn, Sun-hee;Kim, Jun-hee;Weon, Young-soo;Kwon, Oh-yun
    • Physical Therapy Korea, v.27 no.2, pp.140-148, 2020
  • Background: Scapular winging (SW) can be caused by tightness or weakness of the periscapular muscles. Although data mining techniques are useful for classifying or predicting the risk of musculoskeletal disorders, predictive models that use the results of clinical tests or quantitative data are scarce. Objectives: This study aimed to (1) investigate the difference between young women with and without SW, (2) establish a predictive model for the presence of SW, and (3) determine the cutoff value of each variable for predicting the risk of SW using the decision tree method. Methods: Fifty young female subjects participated in this study. To classify the presence of SW as the outcome variable, scapular protractor strength, elbow flexor strength, shoulder internal rotation, and whether the scapula is on the dominant or nondominant side were determined. Results: The classification tree selected scapular protractor strength, shoulder internal rotation range of motion, and whether the scapula is on the dominant or nondominant side as predictor variables. The classification tree model correctly classified 78.79% (p = 0.02) of the training data set. The accuracy obtained by the classification tree on the test data set was 82.35% (p = 0.04). Conclusion: The classification tree showed acceptable accuracy (82.35%) and high specificity (95.65%) but low sensitivity (54.55%). Based on the predictive model in this study, we suggest that 20% of body weight in scapular protractor strength is a meaningful cutoff value for the presence of SW.
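
The accuracy, specificity, and sensitivity reported above are all derived from a single confusion matrix. The short sketch below shows that arithmetic. The counts passed in (6 true positives, 5 false negatives, 22 true negatives, 1 false positive) are an assumption: the abstract does not give them, and they were chosen only because they reproduce the reported percentages. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard definitions of accuracy, sensitivity, and specificity."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),      # share of true SW cases that were detected
        "specificity": tn / (tn + fp),      # share of non-SW cases correctly ruled out
    }


# Hypothetical counts, not taken from the paper; they merely match its percentages.
metrics = classification_metrics(tp=6, fn=5, tn=22, fp=1)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Under these assumed counts the model misses 5 of 11 SW cases, which is what a sensitivity of 54.55% means in practice even when overall accuracy looks acceptable.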

A Distributed High Dimensional Indexing Structure for Content-based Retrieval of Large Scale Data (대용량 데이터의 내용 기반 검색을 위한 분산 고차원 색인 구조)

  • Cho, Hyun-Hwa;Lee, Mi-Young;Kim, Young-Chang;Chang, Jae-Woo;Lee, Kyu-Chul
    • Journal of KIISE:Databases, v.37 no.5, pp.228-237, 2010
  • Although conventional index structures provide various nearest-neighbor search algorithms for high-dimensional data, large-scale data imposes additional requirements: higher search performance and index scalability. To meet these requirements, we propose a distributed high-dimensional indexing structure based on cluster systems, called the Distributed Vector Approximation-tree (DVA-tree), which is a two-level structure consisting of a hybrid spill-tree and VA-files. We also describe the algorithms used for constructing the DVA-tree over multiple machines and for performing distributed k-nearest neighbor (k-NN) searches. To evaluate the performance of the DVA-tree, we conduct an experimental study using both real and synthetic datasets. The results show that our proposed method yields significant performance advantages over existing index structures on different kinds of datasets.

On Efficient Processing of Multidimensional Temporal Aggregates In Temporal Databases (시간지원 데이타베이스에서 다차원 시간 집계 연산의 효율적인 처리 기법)

  • 강성탁;정연돈;김명호
    • Journal of KIISE:Databases, v.29 no.6, pp.429-440, 2002
  • Temporal databases manage time-evolving data. They provide built-in support for efficient recording and querying of temporal data. The temporal aggregate in temporal databases extends the conventional aggregate to include the concept of time in the domain and range of aggregation. This paper focuses on multidimensional temporal aggregation. A multidimensional temporal aggregate uses one or more general attributes as well as a time attribute in the range of aggregation, which makes it a useful operation for historical data warehouses, Call Data Records (CDR), etc. In this paper, we propose a structure for multidimensional temporal aggregation, called the PTA-tree, and an aggregate processing method based on the PTA-tree. Through analyses and performance experiments, we also compare the PTA-tree with a simple extension of the SB-tree, which was proposed for temporal aggregation.

Correlation Analysis of the Frequency and Death Rates in Arterial Intervention using C4.5

  • Jung, Yong Gyu;Jung, Sung-Jun;Cha, Byeong Heon
    • International journal of advanced smart convergence, v.6 no.3, pp.22-28, 2017
  • With the recent development of technologies to manage vast amounts of data, data mining technology has had a major impact on all industries. Data mining is the process of discovering useful correlations hidden in data, extracting actionable information for the future, and using it for decision making. In other words, it is a core process of Knowledge Discovery in Databases (KDD) that transforms input data and derives useful information. It extracts previously unknown information from a large database. The C4.5 decision tree algorithm was used to analyze the difference between frequency and mortality by region. In this paper, the frequency and mortality of percutaneous coronary intervention for patients with heart disease were broken down by region.

Update Propagation Protocol Using Tree of Replicated Data Items in Partially Replicated Databases

  • Bae, Misook;Hwang, Buhyun
    • Proceedings of the IEEK Conference, 2002.07c, pp.1859-1862, 2002
  • The replication of data is used to increase availability, improve system performance, and strengthen fault tolerance. The approach in this paper requires information about the location of the primary site for the replicas of each data item. In partially replicated databases, the replicas of each data item are organized hierarchically into a tree whose root is the primary replica. This eliminates useless propagation, since updates are propagated only to sites that hold replicas, following the hierarchy of the data. Our algorithm also uses timestamps to schedule transactions so that the execution order of updates at each primary site is identical at all sites. With our algorithm, consistent data are supplied, and the tree structure of the replicas of each data item improves the performance of read-only transactions.
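
The idea of propagating an update only along the replica tree of one data item can be sketched as below. This is a minimal illustration, not the protocol from the paper: it covers only the tree walk from the primary replica and leaves out the timestamp-based transaction scheduling. The site names, the `replica_tree` mapping, and the `propagate` function are all hypothetical.

```python
from collections import deque


def propagate(replica_tree, primary, apply_update):
    """Push an update from the primary site down the replica tree, level by level.

    replica_tree maps each site to the child sites that hold a replica of the
    same data item. Sites without a replica never appear in the tree, so they
    are never contacted.
    """
    visited = []
    queue = deque([primary])
    while queue:
        site = queue.popleft()
        apply_update(site)
        visited.append(site)
        queue.extend(replica_tree.get(site, []))
    return visited


# Hypothetical layout: item x has its primary at site_A; site_E holds no replica of x.
replica_tree_x = {"site_A": ["site_B", "site_C"], "site_B": ["site_D"]}
store = {}
order = propagate(replica_tree_x, "site_A", lambda site: store.update({site: "x=42"}))
print(order)
```

Because `site_E` is absent from the tree for item `x`, it receives no message at all, which is the "useless propagation" the abstract says the hierarchy avoids.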


A Hash based R-Tree for Fast Search of Mass Spatial Data (대용량 공간 데이터의 빠른 검색을 위한 해시 기반 R-Tree)

  • Kang, Hong-Koo;Kim, Joung-Joon;Shin, In-Su;Han, Ki-Joon
    • Proceedings of the Korean Association of Geographic Information Studies Conference, 2008.10a, pp.82-89, 2008
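  • Recently, in the GIS field, the volume of collected spatial data has grown greatly with the use of various GeoSensors that carry location and spatial data, such as RFID and GPS sensors, and spatial indexes for fast processing of mass spatial data have become increasingly important. In particular, research on improving search performance based on the R-Tree, a representative spatial index, is actively under way. However, although existing studies have somewhat improved search performance by reducing the overlap between node MBRs or the tree height in the R-Tree to some degree, they do not efficiently solve the cost of unnecessary node accesses incurred during tree search. This paper presents the HR-Tree (Hash based R-Tree), an index that addresses this problem and provides fast search of mass spatial data in the R-Tree. The HR-Tree improves R-Tree search performance by using a hash table that gives direct access to R-Tree leaf nodes without a tree search. The hash table consists of the MBRs of, and pointers to, the R-Tree leaf nodes corresponding to Partitions, which are obtained by repeatedly splitting the data region along its dimensions. Because each Partition receives a unique identification code when it is created, the matching record in the hash table can easily be accessed once a Partition code is given. In addition, the HR-Tree can easily be applied to various R-Tree variants without changing the R-Tree structure. Finally, experiments demonstrate the superiority of the HR-Tree.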
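
The hash-table lookup described above can be sketched as follows. This is a simplified illustration under several assumptions that are not in the abstract: the data region is 2-D, it is halved alternately along x and y to a fixed depth, the Partition code is the resulting bit string, and leaf nodes are represented by plain string labels instead of pointers. All function names are hypothetical.

```python
from itertools import product


def partition_code(point, region, depth):
    """Bit string obtained by halving the region alternately along x and y."""
    lo, hi = [region[0], region[1]], [region[2], region[3]]
    bits = []
    for level in range(depth):
        axis = level % 2
        mid = (lo[axis] + hi[axis]) / 2
        if point[axis] < mid:
            bits.append("0")
            hi[axis] = mid
        else:
            bits.append("1")
            lo[axis] = mid
    return "".join(bits)


def partition_rect(code, region):
    """Rectangle (xmin, ymin, xmax, ymax) covered by a Partition code."""
    lo, hi = [region[0], region[1]], [region[2], region[3]]
    for level, bit in enumerate(code):
        axis = level % 2
        mid = (lo[axis] + hi[axis]) / 2
        if bit == "0":
            hi[axis] = mid
        else:
            lo[axis] = mid
    return (lo[0], lo[1], hi[0], hi[1])


def overlaps(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def contains(mbr, point):
    return mbr[0] <= point[0] <= mbr[2] and mbr[1] <= point[1] <= mbr[3]


def build_hash_table(leaves, region, depth):
    """Map each Partition code to the (MBR, leaf) pairs whose MBR overlaps it."""
    table = {}
    for bits in product("01", repeat=depth):
        code = "".join(bits)
        rect = partition_rect(code, region)
        table[code] = [(mbr, leaf) for leaf, mbr in leaves.items() if overlaps(mbr, rect)]
    return table


def find_leaves(table, point, region, depth):
    """Jump straight to candidate leaves via the hash table, with no tree descent."""
    code = partition_code(point, region, depth)
    return [leaf for mbr, leaf in table[code] if contains(mbr, point)]


# Hypothetical leaf MBRs inside an 8 x 8 data region, split to depth 2 (four Partitions).
region = (0, 0, 8, 8)
leaves = {"leaf_A": (0, 0, 3, 3), "leaf_B": (5, 5, 7, 7), "leaf_C": (2, 2, 6, 6)}
table = build_hash_table(leaves, region, depth=2)
print(find_leaves(table, (1, 1), region, 2))
print(find_leaves(table, (5.5, 5.5), region, 2))
```

The R-Tree itself is untouched in this sketch: the table only stores references to existing leaves, which mirrors the abstract's claim that the approach can be layered onto R-Tree variants without changing their structure.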
