• Title/Summary/Keyword: Data Tree

A Study on the Implementation of SQL Primitives for Decision Tree Classification (판단 트리 분류를 위한 SQL 기초 기능의 구현에 관한 연구)

  • An, Hyoung Geun;Koh, Jae Jin
    • KIPS Transactions on Software and Data Engineering / v.2 no.12 / pp.855-864 / 2013
  • Decision tree classification is one of the important problems in data mining, and data mining has become an important task in large database technology. Efforts to couple data mining systems with database systems have therefore led to the development of database primitives that support data mining functions such as decision tree classification. These primitives are special database operations that support SQL implementations of decision tree classification algorithms, and they have become component modules of database systems for implementing specific algorithms. There are two aspects to the development of database primitives that support data mining functions. The first is identifying, by analysis, the common database primitives that support data mining functions. The other is providing an extension mechanism for implementing these primitives as an interface of the database system. Deciding which primitives should be stored in the DBMS is one of the difficult problems in data mining. In this paper, to address this problem, we describe database primitives that construct and apply optimized decision tree classifiers. We then identify operations that are useful for various classification algorithms and discuss the implementation of these primitives on a commercial DBMS. We implement these primitives on a commercial DBMS and present experimental results comparing their performance.
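
As a rough illustration of pushing part of decision tree construction into SQL, the sketch below computes per-attribute class counts with a `GROUP BY` query and derives a Gini impurity from them. It is an assumption about what such a primitive might look like, not the primitives defined in the paper; the table `train`, its columns, and the helpers `class_counts` and `gini` are hypothetical, and SQLite stands in for the commercial DBMS.

```python
import sqlite3

def class_counts(conn, table, attribute, label):
    """Return {(attribute_value, class_label): count}, computed inside the database."""
    # Identifiers are interpolated directly, so they must come from trusted code.
    query = (
        f"SELECT {attribute}, {label}, COUNT(*) "
        f"FROM {table} GROUP BY {attribute}, {label}"
    )
    return {(value, cls): n for value, cls, n in conn.execute(query)}

def gini(counts):
    """Gini impurity of a list of class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

# Hypothetical training table with one attribute and one class label.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE train (outlook TEXT, play TEXT)")
conn.executemany(
    "INSERT INTO train VALUES (?, ?)",
    [("sunny", "no"), ("sunny", "no"), ("sunny", "yes"),
     ("rain", "yes"), ("rain", "yes")],
)

counts = class_counts(conn, "train", "outlook", "play")
sunny_impurity = gini([counts[("sunny", "no")], counts[("sunny", "yes")]])
```

Only the aggregated counts leave the database here, which is the usual motivation for expressing split evaluation as a query instead of fetching every row.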

Efficient Searching Technique for Nearest Neighbor Object in High-Dimensional Data (고차원 데이터의 효율적인 최근접 객체 검색 기법)

  • Kim, Jin-Ho;Park, Young-Bae
    • The KIPS Transactions:PartD / v.11D no.2 / pp.269-280 / 2004
  • The Pyramid-Technique maps n-dimensional data to one-dimensional values and stores them in a B+-tree. By addressing the problem of search time complexity, it also avoids the "curse of dimensionality" that arises when hypercube range queries are processed in an n-dimensional data space. SPY-TEC applies the space partitioning strategy of the Pyramid-Technique and uses spherical range queries, which suit similarity search, to improve search performance. However, because it is difficult to specify a range in similarity search, a nearest neighbor query is more efficient than a range query. Previously proposed index methods perform well only for specific data distributions. In this paper, we propose an efficient nearest neighbor search technique using the PdR-Tree, which was suggested to improve search performance for high-dimensional data such as multimedia data. Test results on simulated data with various distributions, as well as on real data, demonstrate that the PdR-Tree surpasses both the Pyramid-Technique and SPY-TEC in search performance.
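
The n-dimensional to one-dimensional mapping mentioned in the abstract can be sketched as follows. This follows the commonly described form of the Pyramid-Technique mapping for points in the unit hypercube and is not taken from this paper; the function name `pyramid_value` is hypothetical, and the PdR-Tree itself is not shown.

```python
from typing import Sequence

def pyramid_value(point: Sequence[float]) -> float:
    """Map a point in the unit hypercube [0, 1]^d to a one-dimensional key."""
    d = len(point)
    # Dimension along which the point lies farthest from the center (0.5, ..., 0.5).
    j_max = max(range(d), key=lambda j: abs(0.5 - point[j]))
    # Pyramids 0..d-1 lie below the center in that dimension, d..2d-1 above it.
    pyramid = j_max if point[j_max] < 0.5 else j_max + d
    # Height: distance from the center along the pyramid's axis, in [0, 0.5].
    height = abs(0.5 - point[j_max])
    # The integer part identifies the pyramid, the fractional part the height.
    return pyramid + height

# Keys like these would be inserted into a B+-tree; a sorted list stands in here.
keys = sorted(pyramid_value(p) for p in [(0.1, 0.5), (0.5, 0.9), (0.3, 0.4)])
```

Because each key encodes both the pyramid number and the height within it, a range query in the original space turns into a small set of one-dimensional interval scans.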

Tree Based Cluster Analysis Using Reference Data (배경자료를 이용한 나무구조의 군집분석)

  • 최대우;구자용;최용석
    • The Korean Journal of Applied Statistics / v.17 no.3 / pp.535-545 / 2004
  • The clustering method suggested in this paper produces clusters based on the 'rules of variables' by merging the 'training' and the identically structured reference data and then by filtering it to obtain the clusters of the 'training data' through the use of the 'tree classification model'. The reference dataset is generated by spatially contrasting it to the 'training data' through the 'reverse arcing' algorithm to effectively identify the clusters. The strength of this method is that it can be applied even to the mixture of continuous and discrete types of 'training data' and the performance of this algorithm is illustrated by applying it to the simulated data as well as to the actual data.

A Comparative Study on the Performance of Intrusion Detection using Decision Tree and Artificial Neural Network Models (의사결정트리와 인공 신경망 기법을 이용한 침입탐지 효율성 비교 연구)

  • Jo, Seongrae;Sung, Haengnam;Ahn, Byunghyuk
    • Journal of Korea Society of Digital Industry and Information Management / v.11 no.4 / pp.33-45 / 2015
  • Currently, the Internet is used as an essential tool in business. Despite this importance, there is a risk of network attacks aimed at fraud, collection of private information, and cyber terrorism. Firewalls and intrusion detection systems (IDS) are tools against those attacks. An IDS is used to determine whether network data represents a network attack. It analyzes the network data using various techniques, including expert systems, data mining, and state transition analysis. This paper compares the performance of two data mining models in detecting network attacks: a decision tree (C4.5) and a neural network (FANN model). I trained and tested these models and measured their effectiveness in terms of detection accuracy, detection rate, and false alarm rate, in order to find out which model is more effective in intrusion detection. In the analysis, I used the KDD Cup 99 data, a benchmark dataset in intrusion detection research. I used the open source Weka software for the C4.5 model and available C++ code for the FANN model.
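
The three measures named in the abstract are typically derived from a confusion matrix. The sketch below uses the common definitions, treating "attack" as the positive class; the abstract does not state the exact formulas the paper uses, so these definitions and the function name `ids_metrics` are assumptions.

```python
def ids_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute common IDS evaluation measures from confusion-matrix counts.

    tp: attacks flagged as attacks      fn: attacks missed
    fp: normal traffic flagged as attack tn: normal traffic passed as normal
    """
    total = tp + fp + tn + fn
    return {
        # Fraction of all records classified correctly.
        "accuracy": (tp + tn) / total,
        # Fraction of actual attacks that were detected (recall).
        "detection_rate": tp / (tp + fn),
        # Fraction of normal records wrongly flagged as attacks.
        "false_alarm_rate": fp / (fp + tn),
    }

# Made-up counts, purely to show the arithmetic.
example = ids_metrics(tp=80, fp=10, tn=90, fn=20)
```

With these illustrative counts, accuracy is 170/200, the detection rate is 80/100, and the false alarm rate is 10/100.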

PM-MAC : An Efficient MAC Protocol for Periodic Traffic Monitoring In Wireless Sensor Networks (무선 센서 네트워크에서 주기적인 트래픽의 효율적인 모니터링을 위한 MAC 프로토콜)

  • Kim, Dong-Min;Kim, Seong-Cheol
    • Journal of the Korea Society of Computer and Information / v.13 no.7 / pp.157-164 / 2008
  • In this paper, we suggest a scheduling algorithm that transmits periodic traffic efficiently in tree-structured wireless sensor networks (WSNs). The related research [1] showed problems such as increasing energy consumption and decreasing data throughput as the depth of the tree increases. To solve these problems, we use idle time slots and avoid redundancy in data transmission. We also suggest an algorithm that transmits only a control packet when the measured data is similar to previously measured data. If emergency data occurs, the proposed algorithm transmits that data in the EDP (Emergency Data Period) to reduce the wait time. The proposed algorithm shows higher data throughput and lower energy consumption than the related research.

Plan to Construct Tree Belt around Saemangeum Reclaimed Land - Analysis of Initial Growth Amount of Pinus thunbergii and Quercus serrata - (새만금 간척지 수림대 조성 방안 - 곰솔과 졸참나무의 초기 생장량 분석 -)

  • Kim, Hyun
    • Journal of the Korean Society of Environmental Restoration Technology / v.20 no.1 / pp.117-129 / 2017
  • This research was conducted to construct a tree belt around the Saemangeum reclaimed land using various planting methods and to analyze the initial growth amount, in order to provide practical data for constructing tree belts for various purposes. The tree species used were Pinus thunbergii and Quercus serrata, and the main planting treatments were categorized by the presence of a wind fence and by mixed or un-mixed planting. Growth amounts in the different experimental groups were compared using ANOVA and Duncan's multiple range test. The analysis by planting method showed that, in areas around the Saemangeum reclaimed land that require a mixed-species tree belt, the statistically best-supported approach is to install a 50% porous wind fence on the side facing the wind and frost and then plant P. thunbergii and Q. serrata. In areas where an un-mixed tree belt is required, it was appropriate to use P. thunbergii alone without a wind fence. Lastly, if the purpose of the tree belt is limited to rapid growth, the best options were to plant P. thunbergii alone (without a wind fence) or to install a 50% porous wind fence on the side facing the wind and frost and then plant P. thunbergii and Q. serrata. This research is based on the initial growth amount of the tree belt, and long-term monitoring of tree belt growth is needed to increase the tree-planting success rate when establishing tree belts as the internal development of Saemangeum proceeds.

Tmr-Tree : An Efficient Spatial Index Technique in Main Memory Databases (Tmr-트리 : 주기억 데이터베이스에서 효율적인 공간 색인 기법)

  • Yun Suk-Woo;Kim Kyung-Chang
    • The KIPS Transactions:PartD / v.12D no.4 s.100 / pp.543-552 / 2005
  • As random access memory chips get cheaper, it becomes affordable to realize main memory-based database systems. Disk-based spatial indexing techniques, however, cannot be directly applied to main memory databases, because their main purpose is to reduce the number of disk accesses. In main memory-based indexing techniques, node access is much faster than in disk-based indexing techniques, because all index nodes reside in main memory. Unlike disk-based index techniques, main memory-based spatial indexing techniques must reduce key comparison time as well as node access time. In this paper, we propose an efficient spatial index structure for main memory-based databases, called the Tmr-tree. The Tmr-tree integrates the characteristics of the R-tree and the T-tree. Its nodes therefore consist of several entries for data objects, main memory pointers to the left and right children, and three additional fields. The first is the MBR (Minimum Bounding Rectangle) of the node itself, which tightly encloses all data MBRs in the current node; the second and third are the MBRs of the left and right sub-trees, respectively. Because the Tmr-tree need not visit all leaf nodes, it outperforms the R-tree in search time in our experiments. As node size increases, search time decreases drastically and then gradually increases. In terms of insertion time, however, the performance of the Tmr-tree was slightly lower than that of the R-tree.
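
The node layout described in the abstract can be sketched as a small data structure. The field list follows the abstract; the 2-D rectangle representation, the names `TmrNode`, `intersects`, and `search`, and the pruning logic in `search` are illustrative assumptions about how such fields could be used, not the authors' algorithm.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple

MBR = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), assumed 2-D

def intersects(a: MBR, b: MBR) -> bool:
    """True if two axis-aligned rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

@dataclass
class TmrNode:
    entries: List[Tuple[MBR, Any]] = field(default_factory=list)  # data objects
    left: Optional["TmrNode"] = None    # main memory pointer to left child
    right: Optional["TmrNode"] = None   # main memory pointer to right child
    self_mbr: Optional[MBR] = None      # encloses all data MBRs in this node
    left_mbr: Optional[MBR] = None      # MBR of the left sub-tree
    right_mbr: Optional[MBR] = None     # MBR of the right sub-tree

def search(node: Optional[TmrNode], query: MBR) -> List[Any]:
    """Collect objects whose MBR intersects the query, skipping pruned sub-trees."""
    if node is None:
        return []
    found: List[Any] = []
    if node.self_mbr is not None and intersects(node.self_mbr, query):
        found += [obj for mbr, obj in node.entries if intersects(mbr, query)]
    # Descend only into sub-trees whose bounding rectangle meets the query.
    if node.left_mbr is not None and intersects(node.left_mbr, query):
        found += search(node.left, query)
    if node.right_mbr is not None and intersects(node.right_mbr, query):
        found += search(node.right, query)
    return found

leaf = TmrNode(entries=[((5, 5, 6, 6), "b")], self_mbr=(5, 5, 6, 6))
root = TmrNode(entries=[((0, 0, 1, 1), "a")], self_mbr=(0, 0, 1, 1),
               left=leaf, left_mbr=(5, 5, 6, 6))
```

Keeping the sub-tree MBRs in the parent lets a search decide whether to follow a child pointer without touching the child node, which is one plausible reading of why not all leaf nodes need to be visited.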

Garbage Collection Method using Proxy Block considering Index Data Structure based on Flash Memory (플래시 메모리 기반 인덱스 구조에서 대리블록 이용한 가비지 컬렉션 기법)

  • Kim, Seon Hwan;Kwak, Jong Wook
    • Journal of the Korea Society of Computer and Information / v.20 no.6 / pp.1-11 / 2015
  • Recently, NAND flash memories have been used for storage devices because of their fast access speed and low power consumption. However, applying an FTL (flash translation layer) on low-power computing devices leads to heavy workloads, which impose memory requirements and implementation overhead. Consequently, studies of the B+-Tree on embedded devices without the FTL have been proposed. These studies optimize the performance of inserting and updating records, taking into account the disadvantage that NAND flash memory cannot support in-place updates. However, if a general garbage collection method is applied to these B+-Tree designs, the performance of the B+-Tree is reduced, because changing page positions on the NAND flash memory forces a rearrangement of the B+-Tree. Therefore, we propose a novel garbage collection method that can be applied to a B+-Tree on NAND flash memory without the FTL. By using a block information table and a proxy block, the proposed method does not cause a rearrangement of the B+-Tree. We implemented the B+-Tree and the μ-Tree with the proposed garbage collection on physical devices with NAND flash memory. In the experimental results, compared with a greedy garbage collection scheme, the proposed scheme increased the number of inserted keys by up to about 73% on the B+-Tree and decreased the elapsed time of garbage collection by up to about 39% on the μ-Tree.

The Aging Measurement of Water Tree Using AgNO$_3$ Solution (AgNO$_3$을 이용한 수트리의 실시간 열화계측)

  • Kim, Duck-Keun;Ooh, Soo-Hong;Lee, Jin;Lee, Eun-Hak;Kim, Tae-Sung
    • Proceedings of the Korean Institute of Electrical and Electronic Material Engineers Conference / 1997.11a / pp.409-412 / 1997
  • Water tree degradation in underground distribution power cables takes place in polymeric insulation materials in the presence of water and under applied electric stress. Water trees are not easy to observe, so their features in power cables are revealed only after cutting the sample and dyeing it with methylene blue. With this previous method, it is impossible to acquire continuous treeing data, and cutting the insulation material damages the micro crack (water tree). In this paper, to overcome these defects, an etching method is used to make a needle electrode about 170 μm in diameter, and AgNO$_3$ (silver nitrate) solution is used as a liquid electrode to accelerate the growth of water trees. As a result, the water tree is observed in real time with a microscope. An electrical tree caused by water treeing is initiated at a low electric field and grows discontinuously. That is, the water tree shows different tree growth characteristics.

A Scheduling Algorithm for Performance Enhancement of Science Data Center Network based on OpenFlow (오픈플로우 기반의 과학실험데이터센터 네트워크의 성능 향상을 위한 스케줄링 알고리즘)

  • Kong, Jong Uk;Min, Seok Hong;Lee, Jae Yong;Kim, Byung Chul
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.9 / pp.1655-1665 / 2017
  • Recently, data centers have been actively constructed by many cloud service providers, enterprises, research institutes, and others. Generally, they are built on a tree topology and use the ECMP data forwarding scheme for load balancing. In this paper, we examine data center network topologies such as the tree topology and the fat-tree topology, and load balancing technologies such as MLAG and ECMP. We then propose a scheduling algorithm that efficiently transmits particular files stored on hosts in the data center to a destination node outside the data center, using a fat-tree topology and the OpenFlow protocol between the infrastructure layer and the control layer. We carry out a numerical performance analysis and compare the results with those of ECMP. Through this comparison, we show that the proposed algorithm outperforms ECMP in terms of throughput and file transfer completion time.