• Title/Abstract/Keywords: Data Tree

Search results: 3,320 items (processing time: 0.024 s)

Modeling of Environmental Survey by Decision Trees

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 15, No. 4
    • /
    • pp.759-771
    • /
    • 2004
  • The decision tree approach is most useful in classification problems, where it divides the search space into rectangular regions. Decision tree algorithms are used extensively for data mining in many domains, such as retail target marketing, fraud detection, data reduction and variable screening, and category merging. We analyze Gyeongnam social indicator survey data using decision tree techniques to extract environmental information. The resulting decision trees can be used for environmental preservation and improvement.

Comparison among Algorithms for Decision Tree based on Sasang Constitutional Clinical Data

  • 진희정;이수경;이시우
    • 한국한의학연구원논문집
    • /
    • Vol. 17, No. 2
    • /
    • pp.121-127
    • /
    • 2011
  • Objectives: In the clinical field, it is important to understand the factors that affect a given disease or symptom. To this end, many researchers apply data mining methods to the clinical data they have collected. One efficient data mining method is decision tree induction. Many researchers have sought the best split criterion for decision trees; however, various split criteria coexist. Methods: We applied several split criteria (Information Gain, Gini Index, Chi-Square) to Sasang constitutional clinical information and compared the resulting decision trees in order to find the optimal split criterion. Results & Conclusion: By analyzing the decision trees produced with the different split measures, we found that BMI and body measurement factors are important factors for Sasang constitution. The decision tree using information gain had the highest accuracy. However, which decision tree yields the highest accuracy depends on the given data, so researchers have to find a suitable split criterion for their data by understanding its attributes.
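
The three split criteria named in this abstract can be compared on a single candidate split. The following is a minimal sketch, not the authors' implementation: the function names and the toy labels are hypothetical, and the chi-square value is the plain Pearson statistic of the branch-by-class contingency table.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_scores(left, right):
    """Score a binary split given the class labels that fall in each branch."""
    if not left or not right:
        raise ValueError("both branches must contain at least one sample")
    parent = left + right
    n = len(parent)
    w_left, w_right = len(left) / n, len(right) / n

    # Impurity of the parent minus the weighted impurity of the children.
    info_gain = entropy(parent) - (w_left * entropy(left) + w_right * entropy(right))
    gini_decrease = gini(parent) - (w_left * gini(left) + w_right * gini(right))

    # Pearson chi-square: observed vs. expected class counts in each branch.
    parent_counts = Counter(parent)
    chi_square = 0.0
    for branch in (left, right):
        observed = Counter(branch)
        for cls, total in parent_counts.items():
            expected = len(branch) * total / n
            chi_square += (observed[cls] - expected) ** 2 / expected

    return {
        "information_gain": info_gain,
        "gini_decrease": gini_decrease,
        "chi_square": chi_square,
    }


# Hypothetical toy labels: a split that separates the two classes perfectly.
scores = split_scores(["A"] * 4, ["B"] * 4)
```

All three measures are largest for a clean separation and zero for a split that leaves the class proportions unchanged, but they rank intermediate splits differently, which is why the preferred criterion can change with the data.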

Feature-Based Image Retrieval using SOM-Based R*-Tree

  • Shin, Min-Hwa;Kwon, Chang-Hee;Bae, Sang-Hyun
    • 한국산학기술학회: Conference Proceedings
    • /
    • 한국산학기술학회 2003 Proceeding
    • /
    • pp.223-230
    • /
    • 2003
  • Feature-based similarity retrieval has become an important research issue in multimedia database systems. The features of multimedia data are useful for discriminating between multimedia objects (e.g., documents, images, video, music scores). For example, images are represented by their color histograms, texture vectors, and shape descriptors, and are usually high-dimensional data. The performance of conventional multidimensional data structures (e.g., the R-Tree family, K-D-B tree, grid file, TV-tree) tends to deteriorate as the number of dimensions of the feature vectors increases. The R*-tree is the most successful variant of the R-tree. In this paper, we propose a SOM-based R*-tree as a new indexing method for high-dimensional feature vectors. The SOM-based R*-tree combines a SOM and an R*-tree to achieve search performance that scales better to high dimensionalities. Self-Organizing Maps (SOMs) provide a mapping from high-dimensional feature vectors onto a two-dimensional space. The mapping preserves the topology of the feature vectors. The map is called a topological feature map; it preserves the mutual relationships (similarity) in the feature space of the input data, clustering mutually similar feature vectors in neighboring nodes. Each node of the topological feature map holds a codebook vector. A best-matching-image-list (BMIL) holds the similar images that are closest to each codebook vector. In a topological feature map, there are empty nodes in which no image is classified. When we build an R*-tree, we use the codebook vectors of the topological feature map with the empty nodes eliminated, since empty nodes cause unnecessary disk accesses and degrade retrieval performance. We experimentally compare the retrieval time cost of a SOM-based R*-tree with that of a SOM and an R*-tree using color feature vectors extracted from 40,000 images. The results show that the SOM-based R*-tree outperforms both the SOM and the R*-tree, owing to the reduction in the number of nodes required to build the R*-tree and in retrieval time cost.
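
The step of assigning images to codebook vectors and discarding empty nodes can be illustrated as follows. This is a rough sketch under stated assumptions: the SOM is taken as already trained, the function names and toy vectors are hypothetical, and the R*-tree itself is not implemented; the surviving codebook vectors are simply what would be inserted into it.

```python
import math


def best_matching_unit(codebook, vector):
    """Index of the codebook vector closest to `vector` (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda i: math.dist(codebook[i], vector))


def build_bmil(codebook, features):
    """Map each node index to the list of image ids whose best match is that node."""
    bmil = {i: [] for i in range(len(codebook))}
    for image_id, vector in features.items():
        bmil[best_matching_unit(codebook, vector)].append(image_id)
    return bmil


def non_empty_nodes(codebook, bmil):
    """Keep only nodes that classify at least one image; these would be indexed."""
    return [(i, codebook[i], bmil[i]) for i in range(len(codebook)) if bmil[i]]


# Hypothetical 2-D codebook and feature vectors; real color features have many more dimensions.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
features = {"img1": (0.1, 0.0), "img2": (0.9, 1.2), "img3": (0.0, 0.2)}

bmil = build_bmil(codebook, features)
to_index = non_empty_nodes(codebook, bmil)  # node 2 is empty and is dropped
```

A query would then search the index over the surviving codebook vectors and return the images listed in the BMIL of the matching node.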

A Decision Tree Approach for Identifying Defective Products in the Manufacturing Process

  • Choi, Sungsu;Battulga, Lkhagvadorj;Nasridinov, Aziz;Yoo, Kwan-Hee
    • International Journal of Contents
    • /
    • Vol. 13, No. 2
    • /
    • pp.57-65
    • /
    • 2017
  • Recently, due to the significance of Industry 4.0, the manufacturing industry is developing globally. Conventionally, the manufacturing industry generates a large volume of data that is often related to processes, lines, and products. In this paper, we analyze the causes of defective products in the manufacturing process using the decision tree technique, which is a well-known technique in data mining. We use data collected from the domestic manufacturing industry, including Manufacturing Execution System (MES) data, Point of Production (POP) data, equipment data accumulated directly in equipment, in-process/external air-conditioning sensor data, and static electricity data. We propose a model built with the C4.5 decision tree algorithm. Specifically, the proposed decision tree model is built from the components of a specific part. We propose to identify the state of the products where a defect occurred and compare it with the generated decision tree model to determine the cause of the defect.
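
C4.5 ranks candidate attributes by gain ratio, that is, information gain divided by the entropy of the attribute's own value distribution (the split information). The sketch below shows only that selection measure for a categorical attribute; the function names, attribute values, and labels are hypothetical and are not taken from the paper's data.

```python
import math
from collections import Counter, defaultdict


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(attribute_values, labels):
    """Gain ratio of splitting `labels` by a categorical attribute."""
    n = len(labels)
    groups = defaultdict(list)
    for value, label in zip(attribute_values, labels):
        groups[value].append(label)

    # Information gain: parent entropy minus weighted entropy of each group.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder

    # Split information penalizes attributes with many distinct values.
    split_info = entropy(attribute_values)
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical records: a temperature band versus a unique lot id.
labels = ["defect", "defect", "ok", "ok"]
by_temperature = gain_ratio(["hot", "hot", "cold", "cold"], labels)
by_lot_id = gain_ratio(["lot1", "lot2", "lot3", "lot4"], labels)
```

Both attributes separate the toy labels perfectly, yet the unique identifier scores lower because its split information is higher; this is the reason gain ratio is preferred over raw information gain for attributes such as lot or serial numbers.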

Improved Decision Tree Algorithms by Considering Variables Interaction

  • 권근섭;최경현
    • 대한산업공학회지
    • /
    • Vol. 30, No. 4
    • /
    • pp.267-276
    • /
    • 2004
  • Much previous research on decision trees has focused on splitting criteria and the optimization of tree size. Nowadays the quantity of data is increasing and the relationships among variables are becoming very complex. As a result, trees contain many unnecessary nodes and leaves, and confidence in the explanations and forecasts of the decision tree declines. In this paper, we propose decision tree algorithms that consider the interaction of predictor variables. We present a generic algorithm, the k-1 Algorithm, which handles interaction using combinations of all predictor variables, and then an extended version, the k-k Algorithm, which considers interaction at every k-depth using combinations of some predictor variables. We also present an improved algorithm that introduces a control parameter to these algorithms. The algorithms are tested on real-world credit card data, census data, bank data, etc.

Development of Decision Tree Program based on Web for Analyzing Clinical Information of Sasang Constitutional Medicine

  • 진희정;김명근;김종열
    • 한국한의학연구원논문집
    • /
    • Vol. 14, No. 3
    • /
    • pp.81-87
    • /
    • 2008
  • Sasang Constitutional Medicine (SCM) is a traditional Korean medical theory based on constitutional medicine. It is most important that a person's SCM type be determined accurately before any Sasang treatment is applied. To this end, many studies have attempted to diagnose the SCM type using constitutional clinical data. The decision tree is a tree-structured data mining methodology. Recently, in the Korean traditional medicine community, there have been several efforts to develop diagnostic tools using the decision tree method. We therefore developed a web-based decision tree program for analyzing constitutional clinical information. It accepts various clinical data as input and offers a filtering function for selecting the clinical data to be used. Using this program, we can find factors that influence SCM types.

An Application of Decision Tree Method for Fault Diagnosis of Induction Motors

  • Tran, Van Tung;Yang, Bo-Suk;Oh, Myung-Suck
    • 한국해양공학회: Conference Proceedings
    • /
    • 한국해양공학회 2006 20th Anniversary Regular Conference and International Workshop
    • /
    • pp.54-59
    • /
    • 2006
  • The decision tree is one of the most effective and widely used methods for building classification models. Researchers from various disciplines such as statistics, machine learning, pattern recognition, and data mining have considered the decision tree method an effective solution to problems in their fields. In this paper, we propose an application of the decision tree method to classify faults of induction motors. Features are calculated from the original experimental data to obtain useful information as attributes. These data are then assigned classes based on our experience before being used as inputs to the decision tree. A total of nine classes are defined. A decision tree implementation written in Matlab is applied to these data.

Performance of Spatial Join Operations using Multi-Attribute Access Methods

  • 황병연
    • Spatial Information Research
    • /
    • Vol. 7, No. 2
    • /
    • pp.271-282
    • /
    • 1999
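  • In this paper, we propose the SJ (Spatial Join) tree, an indexing method that efficiently handles multi-attribute data and spatial join operations. We also describe various existing algorithms for handling multi-attribute data, together with their computational complexity and I/O complexity. We show that the proposed SJ tree is a generalization of the B-tree, which is widely used as an indexing method in existing database systems. This means that the SJ tree can be easily implemented on most existing storage structures that use B-trees. We evaluate the performance of spatial join operations with spatial output on the R-tree, B-tree, K-D-B tree, and SJ tree. The evaluation shows that the proposed SJ tree performs relatively better than the other indexing methods for spatial join operations on point data.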

Analysis of Forest Structure Using LiDAR Data - A Case Study of Forest in Namchon-Dong, Osan -

  • 이동근;류지은;김은영;전성우
    • 환경영향평가
    • /
    • Vol. 17, No. 5
    • /
    • pp.279-288
    • /
    • 2008
  • Vertical forest distribution is one of the important factors for understanding various ecological mechanisms such as succession, disturbance, and environmental effects. LiDAR data provide information on both the horizontal and vertical distribution of forest structure. The laser scanner survey provided a point cloud in which the x, y, and z coordinates of the points are known. The objectives of this study were 1) to analyze factors of forest structure such as individual tree isolation, tree height, canopy closure, and tree density using LiDAR data and 2) to compare the forest structure between the outer and interior forest. Individual trees were extracted using a watershed algorithm, and the first returns of the LiDAR data were interpolated to yield a digital surface model (DSM). The results show that the forest edge has more isolated individual trees, higher density, lower canopy closure, and lower tree height than the interior forest. LiDAR data are useful for analyzing forest structure. Further study that takes species into account should be undertaken for more accurate results.
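
As a rough illustration of deriving a surface model from first returns, the sketch below bins first-return points into square grid cells and keeps the highest elevation in each cell. This is a simplification and an assumption on our part: the paper interpolates the first returns, whereas this sketch only rasterizes them, and the function name, tuple layout, and sample points are hypothetical.

```python
def rasterize_dsm(points, cell_size):
    """Build a coarse DSM from LiDAR points.

    points: iterable of (x, y, z, return_number) tuples.
    Returns a dict mapping (column, row) grid cells to the highest
    first-return elevation observed in that cell.
    """
    dsm = {}
    for x, y, z, return_number in points:
        if return_number != 1:
            continue  # only first returns describe the top surface
        cell = (int(x // cell_size), int(y // cell_size))
        if cell not in dsm or z > dsm[cell]:
            dsm[cell] = z
    return dsm


# Hypothetical points: three first returns and one second return.
points = [
    (0.2, 0.3, 12.0, 1),
    (0.8, 0.1, 15.5, 1),
    (0.5, 0.5, 2.0, 2),
    (1.4, 0.2, 9.0, 1),
]
dsm = rasterize_dsm(points, cell_size=1.0)
```

Cells with no first return stay absent from the result; filling them is where an interpolation step of the kind described in the abstract would come in.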

A Lifetime-Preserving and Delay-Constrained Data Gathering Tree for Unreliable Sensor Networks

  • Li, Yanjun;Shen, Yueyun;Chi, Kaikai
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 6, No. 12
    • /
    • pp.3219-3236
    • /
    • 2012
  • A tree routing structure is often adopted for many-to-one data gathering and aggregation in sensor networks. For real-time scenarios with lossy wireless links, constructing a maximum-lifetime data gathering tree under a delay constraint is an important issue. In this work, we study the problem of lifetime-preserving and delay-constrained tree construction in unreliable sensor networks. We prove that the problem is NP-complete and propose a greedy approximation algorithm. We use the expected transmission count (ETX) as the link quality indicator, as well as a measure of delay. Our algorithm starts from an arbitrary least-ETX tree and iteratively adjusts the hierarchy of the tree to reduce the load on bottleneck nodes by pruning and grafting their sub-trees. The complexity of the proposed algorithm is $O(N^4)$. Finally, extensive simulations are carried out to verify our approach. Simulation results show that our algorithm provides a longer lifetime in various situations compared to existing data gathering schemes.
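
The starting point of the algorithm, a least-ETX tree rooted at the sink, can be sketched with Dijkstra's algorithm. This covers only that initial tree, not the pruning and grafting or the lifetime and delay constraints. The sketch assumes, for illustration, that a link with packet delivery probability p has ETX 1/p and that path ETX is the sum of link ETX values; the function name and the sample topology are hypothetical.

```python
import heapq


def least_etx_tree(links, sink):
    """Shortest-path tree toward `sink` under additive ETX link costs.

    links: dict mapping undirected node pairs (u, v) to a delivery
    probability in (0, 1]. Returns (parent, etx), where parent[node] is
    the next hop toward the sink and etx[node] is the total path ETX.
    """
    graph = {}
    for (u, v), probability in links.items():
        cost = 1.0 / probability  # assumed ETX model for a single link
        graph.setdefault(u, []).append((v, cost))
        graph.setdefault(v, []).append((u, cost))

    etx = {sink: 0.0}
    parent = {sink: None}
    heap = [(0.0, sink)]
    while heap:
        dist, node = heapq.heappop(heap)
        if dist > etx[node]:
            continue  # stale heap entry; a better path was already found
        for neighbor, cost in graph.get(node, []):
            candidate = dist + cost
            if candidate < etx.get(neighbor, float("inf")):
                etx[neighbor] = candidate
                parent[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    return parent, etx


# Hypothetical topology: the direct link sink-a is poor, so a routes via b.
links = {("sink", "a"): 0.25, ("sink", "b"): 1.0, ("a", "b"): 1.0}
parent, etx = least_etx_tree(links, "sink")
```

In this toy topology node b ends up relaying for node a, which is the kind of load concentration on a bottleneck node that the adjustment phase described in the abstract is meant to relieve.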