Search | Korea Science

Interpretation of Data Mining Prediction Model Using Decision Tree

Kang, Hyuncheol;Han, Sang-Tae;Choi, Jong-Ho
- Communications for Statistical Applications and Methods / v.7 no.3 / pp.937-943 / 2000
Data mining usually deals with massive, undesigned data containing many variables whose characteristics and association rules are unknown, so the results of an analysis are often hard to interpret. Using two real examples, this paper shows that a decision tree can be very useful for interpreting a data mining prediction model.

The Transfer Technique among Decision Tree Models for Distributed Data Mining (분산형 데이터마이닝 구현을 위한 의사결정나무 모델 전송 기술)

Kim, Choong-Gon;Woo, Jung-Geun;Baik, Sung-Wook
- Journal of Digital Contents Society / v.8 no.3 / pp.309-314 / 2007
A decision tree algorithm must be modified to suit the distributed, collaborative environments used for distributed data mining. The distributed data mining system proposed in this paper consists of several agents and a mediator. Each agent performs local data mining on the data at its own site and communicates with the other agents to build the global decision tree model. The mediator helps the agents communicate with one another efficiently. One advantage of distributed data mining is that using several agents saves much of the time needed to analyze huge data sets. The paper focuses on a technique for transferring local decision tree models among agents so as to reduce the heavy communication overhead among them.

Development of a Real-Time Mobile GIS using the HBR-Tree (HBR-Tree를 이용한 실시간 모바일 GIS의 개발)

Lee, Ki-Yamg;Yun, Jae-Kwan;Han, Ki-Joon
- Journal of Korea Spatial Information System Society / v.6 no.1 s.11 / pp.73-85 / 2004
Recently, with the growth of the wireless Internet, PDAs, and HPCs, the focus of research and development related to GIS (Geographic Information System) has shifted to Real-Time Mobile GIS for providing LBS (location-based services). To offer LBS efficiently, there must be a Real-Time GIS platform that can handle the dynamic status of moving objects and a location index that can handle the characteristics of location data. Location data can use the same data types (e.g., point) as GIS data, but the management of location data is very different. Therefore, in this paper, we studied a Real-Time Mobile GIS that uses the HBR-tree to manage large volumes of location data efficiently. The Real-Time Mobile GIS developed in this paper consists of the HBR-tree and the Real-Time GIS platform. The HBR-tree, which we propose in this paper, is an index that combines the R-tree and the spatial hash. Although location data are updated frequently, update operations are done within the same hash table in the HBR-tree, so updates cost less than in other tree-based indexes. Since the HBR-tree uses the same search mechanism as the R-tree, it can search location data quickly. The Real-Time GIS platform consists of a Real-Time GIS engine extended from a main-memory database system, middleware that transfers spatial and aspatial data to clients and receives location data from clients, and a mobile client that runs on mobile devices. In particular, this paper describes performance evaluations conducted with practical tests of the HBR-tree and the Real-Time GIS engine, respectively.
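The abstract above gives only the outline of the HBR-tree: a spatial hash combined with R-tree-style search, with updates kept inside one hash table. The following is a minimal sketch of that general idea and not the paper's data structure. The class name `GridHashIndex`, the fixed `cell_size`, and the flat scan over bucket rectangles are all assumptions made for illustration; a real R-tree would arrange the bounding rectangles hierarchically.

```python
from collections import defaultdict


class GridHashIndex:
    """Hypothetical spatial-hash index with per-bucket bounding rectangles."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.buckets = defaultdict(dict)  # cell -> {obj_id: (x, y)}
        self.cell_of = {}                 # obj_id -> cell currently holding it

    def _cell(self, x, y):
        # Hash a position to its grid cell.
        return (int(x // self.cell_size), int(y // self.cell_size))

    def update(self, obj_id, x, y):
        """Insert or move an object; a move within one cell touches one bucket."""
        new_cell = self._cell(x, y)
        old_cell = self.cell_of.get(obj_id)
        if old_cell is not None and old_cell != new_cell:
            del self.buckets[old_cell][obj_id]
            if not self.buckets[old_cell]:
                del self.buckets[old_cell]
        self.buckets[new_cell][obj_id] = (x, y)
        self.cell_of[obj_id] = new_cell

    def _mbr(self, cell):
        # Minimum bounding rectangle of the points stored in one bucket.
        xs, ys = zip(*self.buckets[cell].values())
        return min(xs), min(ys), max(xs), max(ys)

    def range_query(self, xmin, ymin, xmax, ymax):
        """Return sorted ids of objects inside the query rectangle."""
        hits = []
        for cell in list(self.buckets):
            bx0, by0, bx1, by1 = self._mbr(cell)
            # R-tree-style pruning: skip buckets whose rectangle misses the query.
            if bx1 < xmin or bx0 > xmax or by1 < ymin or by0 > ymax:
                continue
            for obj_id, (x, y) in self.buckets[cell].items():
                if xmin <= x <= xmax and ymin <= y <= ymax:
                    hits.append(obj_id)
        return sorted(hits)
```

In this sketch a frequent location update that stays inside one cell only overwrites an entry in that cell's bucket, which is the cost advantage the abstract attributes to hash-based updates.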

Use of Tree Traversal Algorithms for Chain Formation in the PEGASIS Data Gathering Protocol for Wireless Sensor Networks

Meghanathan, Natarajan
- KSII Transactions on Internet and Information Systems (TIIS) / v.3 no.6 / pp.612-627 / 2009
The high-level contribution of this paper is to illustrate the effectiveness of using graph theory tree traversal algorithms (pre-order, in-order and post-order traversals) to generate the chain of sensor nodes in the classical Power Efficient-Gathering in Sensor Information Systems (PEGASIS) data aggregation protocol for wireless sensor networks. We first construct an undirected minimum-weight spanning tree (ud-MST) on a complete sensor network graph, wherein the weight of each edge is the Euclidean distance between the constituent nodes of the edge. A Breadth-First-Search of the ud-MST, starting with the node located closest to the center of the network, is then conducted to iteratively construct a rooted directed minimum-weight spanning tree (rd-MST). The three tree traversal algorithms are then executed on the rd-MST, and the node sequence resulting from each traversal is used as the chain of nodes for the PEGASIS protocol. Simulation studies on PEGASIS conducted for both TDMA and CDMA systems illustrate that, with the chain of nodes generated by the tree traversal algorithms, node lifetime can improve by as much as 19%-30% while the energy loss per node can be 19%-35% lower than that obtained with the currently used distance-based greedy heuristic.
https://doi.org/10.3837/tiis.2009.06.003
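The chain-formation steps described in this abstract (MST on the complete graph, BFS rooting at the node nearest the network center, then a tree traversal) can be sketched roughly as follows. This is an illustrative sketch only, not the author's implementation: the function names, the use of Prim's algorithm for the MST, and the `center` parameter are assumptions. In-order traversal is left out because the abstract does not say how it is defined for nodes with more than two children.

```python
import math
from collections import deque


def prim_mst(points):
    """Undirected MST (adjacency lists) on the complete graph of points;
    edge weight is the Euclidean distance. Prim's algorithm is an assumption."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    best[0] = 0.0
    adj = {i: [] for i in range(n)}
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            adj[u].append(parent[u])
            adj[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return adj


def root_by_bfs(adj, root):
    """Direct the undirected tree away from root using breadth-first search."""
    children = {u: [] for u in adj}
    seen = {root}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                children[u].append(v)
                queue.append(v)
    return children


def preorder(children, u):
    order = [u]
    for c in children[u]:
        order.extend(preorder(children, c))
    return order


def postorder(children, u):
    order = []
    for c in children[u]:
        order.extend(postorder(children, c))
    order.append(u)
    return order


def build_chain(points, center, traversal=preorder):
    """Return the node sequence to use as the chain (hypothetical helper)."""
    adj = prim_mst(points)
    root = min(range(len(points)), key=lambda i: math.dist(points[i], center))
    return traversal(root_by_bfs(adj, root), root)
```

For example, `build_chain([(0, 0), (1, 0), (2, 0), (1, 1)], center=(1, 0.25))` roots the tree at node 1, the node nearest the given center, and returns a pre-order sequence that visits every node once.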

Learning Algorithm for Multiple Distribution Data using Haar-like Feature and Decision Tree (다중 분포 학습 모델을 위한 Haar-like Feature와 Decision Tree를 이용한 학습 알고리즘)

Kwak, Ju-Hyun;Woen, Il-Young;Lee, Chang-Hoon
- KIPS Transactions on Software and Data Engineering / v.2 no.1 / pp.43-48 / 2013
AdaBoost is widely used as the boosting algorithm for Haar-like features in face detection. It performs very effectively on a single-distribution model. However, when detecting frontal and side face images at the same time, AdaBoost shows its limitations on multiple-distribution data because it uses a linear combination of basic classifiers. This paper suggests the HDCT, a modified decision tree algorithm for Haar-like features. We also tested the performance of the HDCT against AdaBoost on multiple-distribution image recognition.
https://doi.org/10.3745/KTSDE.2013.2.1.043

A Data Gathering Scheme using Dynamic Branch of Mobile Sink in Wireless Sensor Networks (무선 센서망에서 이동 싱크의 동적 브랜치를 통한 데이터 수집 방안)

Lee, Kil-Hung
- The Journal of The Korea Institute of Intelligent Transport Systems / v.11 no.1 / pp.92-97 / 2012
This paper suggests a data gathering scheme using a dynamic branch tree in wireless sensor networks. A mobile sink gathers data from each sensor node using a dynamic data gathering tree rooted at the mobile sink node. As the sink moves, a tree with multiple branches is formed and changes dynamically according to the position of the sink node. A hop-based scope filter and a restricted flooding scheme for the tree are also suggested. Simulation results show that the proposed data gathering scheme performs better than the previous scheme in data arrival rate, end-to-end delay, and energy saving.
https://doi.org/10.12815/kits.2012.11.1.092

A study on decision tree creation using intervening variable (매개 변수를 이용한 의사결정나무 생성에 관한 연구)

Cho, Kwang-Hyun;Park, Hee-Chang
- Journal of the Korean Data and Information Science Society / v.22 no.4 / pp.671-678 / 2011
Data mining searches for interesting relationships among items in a given database. Data mining methods include decision trees, association rules, clustering, neural networks, and so on. The decision tree approach is most useful in classification problems, where it divides the search space into rectangular regions. Decision tree algorithms are used extensively for data mining in many domains, such as retail target marketing and customer classification. When a decision tree model is created, the resulting model can be complicated, depending on the model creation criteria and the number of input variables. In particular, model creation and analysis are difficult when there are many input variables. In this study, we examine decision tree creation using an intervening variable. We apply the approach to real data to suggest a method that removes unnecessary input variables from the created model and to examine its efficiency.

Decision Tree Techniques with Feature Reduction for Network Anomaly Detection (네트워크 비정상 탐지를 위한 속성 축소를 반영한 의사결정나무 기술)

Kang, Koohong
- Journal of the Korea Institute of Information Security & Cryptology / v.29 no.4 / pp.795-805 / 2019
Recently, there has been growing interest in network anomaly detection technology for tackling unknown attacks. To this end, diverse studies have applied data mining, machine learning, and deep learning to detect network anomalies. In this paper, we evaluate the decision tree, one of the most popular data mining techniques for classification, to assess its feasibility for network anomaly detection on the NSL-KDD data set. To handle the over-fitting problem of the decision tree, we select 13 features from the original 41 features of the data set using the chi-square test and then model the decision tree using TensorFlow and scikit-learn, yielding binary classification accuracies of 84% and 70% on the KDDTest+ and KDDTest-21 sets of the NSL-KDD test data, respectively. These results are 3 and 6 percentage points higher than the 81% and 64% binary classification accuracies previously reported for decision tree techniques.
https://doi.org/10.13089/JKIISC.2019.29.4.795
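As a rough illustration of chi-square feature selection of the kind this abstract mentions, the sketch below scores each feature against the class label with Pearson's chi-square statistic and keeps the top `k`. It is a plain-Python sketch under stated assumptions, not the paper's pipeline: features are treated as categorical, and the names `chi_square`, `select_top_k`, and the column names in the usage note are hypothetical.

```python
from collections import Counter


def chi_square(feature, labels):
    """Pearson chi-square statistic of the feature-by-label contingency table."""
    n = len(labels)
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    observed = Counter(zip(feature, labels))
    stat = 0.0
    for f, fc in feature_counts.items():
        for l, lc in label_counts.items():
            expected = fc * lc / n  # count expected if feature and label were independent
            stat += (observed[(f, l)] - expected) ** 2 / expected
    return stat


def select_top_k(columns, labels, k):
    """Return the names of the k columns with the largest chi-square score."""
    scores = {name: chi_square(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

With labels `[0, 0, 1, 1]`, a hypothetical column `["tcp", "tcp", "udp", "udp"]` scores 4.0 (perfect association) and a column `["a", "b", "a", "b"]` scores 0.0, so `select_top_k` with `k=1` keeps the first. Selecting 13 of 41 features would correspond to `k=13` on the full feature table.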

Urban Sprawl prediction in 2030 using decision tree (의사결정나무를 활용한 2030년 도시 확장 예측)

Kim, Geun-Han;Choi, Hee-Sun;Kim, Dong-Beom;Jung, Yee-Rim;Jin, Dae-Yong
- Journal of the Korean Society of Environmental Restoration Technology / v.23 no.6 / pp.125-135 / 2020
Uncontrolled urban expansion causes various social and economic problems as well as natural and environmental problems. It is therefore necessary to forecast urban expansion by identifying the factors related to it. This study aims to forecast urban expansion using a decision tree, a method widely used in various areas. The study used geographic data such as use areas, topographic data such as elevation and slope, the environmental conservation value assessment map, and population density data for 2006 and 2018. It extracted the new urban expansion areas by comparing the residential, industrial, and commercial zones of the zoning in 2006 and 2018, and derived a decision tree using the 2006 data as independent variables. The intent is to forecast urban expansion in 2030 by applying the 2018 data to the derived decision tree. The analysis confirmed that the distance from green areas, the elevation, the grade on the environmental conservation value assessment map, and the distance from industrial areas were important factors in forecasting urban area expansion. In the ROC analysis performed to verify accuracy, the AUC of 0.95051 indicated excellent explanatory power. However, the urban area forecast for 2018 using the decision tree was 15,459.98㎢, which differed significantly from the actual 2018 urban area of 4,144.93㎢. Decision trees, which many regions use to forecast urban expansion, can therefore be useful for identifying which factors affect urban expansion, although they are not suitable for forecasting the expansion of urban regions in detail. Identifying such important factors is expected to provide information that can be used in future land, urban, and environmental planning.
https://doi.org/10.13087/kosert.2020.23.6.125
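For readers unfamiliar with the AUC figure quoted in this abstract, the sketch below computes the area under the ROC curve as the fraction of (positive, negative) pairs in which the positive case receives the higher score, counting ties as half. It is a generic illustration, not the study's code; the function name `auc` and the toy scores in the usage note are assumptions.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    labels are 1 (positive, e.g. a cell that became urban) or 0 (negative).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))
```

For instance, `auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])` gives 0.75, since three of the four positive-negative pairs are ranked correctly. A high AUC measures ranking quality only, which is consistent with the abstract's point that the forecast area can still differ greatly from the actual area.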

Tree-structured Clustering for Mixed Data (혼합형 데이터에 대한 나무형 군집화)

Yang Kyung-Sook;Huh Myung-Hoe
- The Korean Journal of Applied Statistics / v.19 no.2 / pp.271-282 / 2006
The aim of this study is to propose a tree-structured clustering method for mixed data. We suggest a scaling method to reduce the variable selection bias among categorical variables. In numerical examples, including credit data and German credit data, we note several differences between tree-structured clustering and K-means clustering.
https://doi.org/10.5351/KJAS.2006.19.2.271
