• Title/Abstract/Keywords: Tree Compare

Search results: 403 items (processing time: 0.024 seconds)

A Decision Tree Approach for Identifying Defective Products in the Manufacturing Process

  • Choi, Sungsu;Battulga, Lkhagvadorj;Nasridinov, Aziz;Yoo, Kwan-Hee
    • International Journal of Contents
    • /
    • Vol. 13, No. 2
    • /
    • pp.57-65
    • /
    • 2017
  • Recently, owing to the significance of Industry 4.0, the manufacturing industry has been developing globally. The manufacturing industry conventionally generates a large volume of data related to processes, lines, and products. In this paper, we analyze the causes of defective products in the manufacturing process using the decision tree technique, a well-known data mining technique. We used data collected from the domestic manufacturing industry, including Manufacturing Execution System (MES) data, Point of Production (POP) data, equipment data accumulated directly in equipment, in-process/external air-conditioning sensor data, and static electricity data. We propose a model built with the C4.5 decision tree algorithm. Specifically, the proposed decision tree model is built from the components of a specific part. We propose to identify the state of the products in which a defect occurred and compare it with the generated decision tree model to determine the cause of the defect.
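The C4.5 algorithm named in the abstract above chooses split attributes by gain ratio. A minimal sketch of that criterion follows, assuming categorical attributes only; the record fields (`humidity`, `line`, `state`) and their values are hypothetical and are not taken from the paper's MES/POP data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of splitting `rows` (a list of dicts) on a categorical attribute."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    # Weighted entropy of the target after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy([row[target] for row in rows]) - remainder
    # Split information penalizes attributes with many small partitions.
    split_info = -sum(len(g) / total * math.log2(len(g) / total) for g in groups.values())
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical records, invented purely for illustration.
records = [
    {"humidity": "high", "line": "A", "state": "defect"},
    {"humidity": "high", "line": "B", "state": "defect"},
    {"humidity": "low", "line": "A", "state": "ok"},
    {"humidity": "low", "line": "B", "state": "ok"},
]
best = max(["humidity", "line"], key=lambda a: gain_ratio(records, a, "state"))
```

In this toy data the `humidity` attribute separates defective from non-defective records perfectly, so it would be picked as the root split. A full C4.5 implementation also handles continuous attributes, missing values, and pruning, none of which is sketched here.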

Comparison Architecture for Large Number of Genomic Sequences

  • Choi, Hae-won;Ryoo, Myung-Chun;Park, Joon-Ho
    • 정보화연구
    • /
    • Vol. 9, No. 1
    • /
    • pp.11-19
    • /
    • 2012
  • Generally, a suffix tree is an efficient data structure because it reveals the detailed internal structure of given sequences in linear time. However, it is difficult to implement a suffix tree for a large number of sequences because of memory size constraints. Therefore, to compare multi-megabase genomic sequence sets using suffix trees, the suffix tree algorithms need to be reworked. We introduce a new method for constructing a suffix tree on secondary storage for a large number of sequences. Our algorithm divides three files, in a designated sequence, into parts, storing references to the locations of edges in hash tables. In our experiments, we used 1,300,000 EST sequences (around 300 MB) to generate a suffix tree on disk.
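To illustrate how a suffix structure supports sequence comparison, here is a minimal in-memory sketch that finds the longest substring shared by all input sequences. It is a naive generalized suffix trie built with nested dictionaries (quadratic in sequence length), not the disk-based construction of the paper; the function names and the sample sequences are assumptions made for illustration.

```python
def build_suffix_trie(sequences):
    """Naive generalized suffix trie; each node records which sequences pass through it."""
    root = {"children": {}, "ids": set()}
    for seq_id, seq in enumerate(sequences):
        for start in range(len(seq)):
            node = root
            for ch in seq[start:]:
                node = node["children"].setdefault(ch, {"children": {}, "ids": set()})
                node["ids"].add(seq_id)
    return root


def longest_common_substring(sequences):
    """Longest substring that occurs in every sequence (ties broken arbitrarily)."""
    root = build_suffix_trie(sequences)
    everyone = set(range(len(sequences)))
    best = ""
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        if len(prefix) > len(best):
            best = prefix
        # Only descend into nodes reached by suffixes of all sequences.
        for ch, child in node["children"].items():
            if child["ids"] == everyone:
                stack.append((child, prefix + ch))
    return best


shared = longest_common_substring(["GATTACA", "TTACG", "ATTAC"])
```

A compressed suffix tree built in linear time would replace the trie for realistic inputs; the memory cost of even that structure is what motivates the secondary-storage approach described in the abstract.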

1H*-tree: An Improved Data Cube Structure for Multi-dimensional Analysis of Data Streams

  • 심상예;정우상;이연;신승선;이동욱;배혜영
    • 한국정보처리학회:학술대회논문집
    • /
    • 한국정보처리학회 2008년도 추계학술발표대회
    • /
    • pp.332-335
    • /
    • 2008
  • In this paper, we analyze the H-tree, which was proposed as the basic data cube structure for multi-dimensional data stream analysis. We find that the H-tree contains many redundant nodes and that the tree-building method can be improved to save both memory and the time spent inserting tuples. To support faster analysis of larger volumes of stream data, which is very important for stream research, we design and develop the H*-tree. Our performance study compares the proposed H*-tree with the H-tree and shows that the H*-tree uses less memory and time when inserting data stream tuples.

Design and Implementation of a Genetic Algorithm for Global Routing

  • 송호정;송기용
    • 융합신호처리학회논문지
    • /
    • Vol. 3, No. 2
    • /
    • pp.89-95
    • /
    • 2002
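  • Global routing is one step in the VLSI design process: it assigns each net to routing areas so that all nets in the netlist are connected. Maze routing algorithms, line-probe algorithms, shortest-path-based algorithms, and Steiner-tree-based algorithms are used to obtain optimal solutions in global routing. In this paper, we propose the weighted network heuristic (WNH), a method for finding a shortest-path Steiner tree in the routing graph, together with a genetic algorithm (GA) for global routing based on it, and we compare and analyze the proposed method against simulated annealing (SA).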


Tree size determination for classification ensemble

  • Choi, Sung Hoon;Kim, Hyunjoong
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 27, No. 1
    • /
    • pp.255-264
    • /
    • 2016
  • Classification is predictive modeling for a categorical target variable. Classification ensemble methods, which predict with better accuracy by combining multiple classifiers, have become a powerful machine learning and data mining paradigm. Well-known classification ensemble methodologies are boosting, bagging, and random forest. In this article, we assume that decision trees are used as the classifiers in the ensemble. Further, we hypothesize that tree size affects classification accuracy. To study how tree size influences accuracy, we performed experiments using twenty-eight data sets. We then compare the performance of four ensemble algorithms (bagging, double-bagging, boosting, and random forest) with different tree sizes.

Analysis of Forest Structure Using LiDAR Data - A Case Study of Forest in Namchon-Dong, Osan -

  • 이동근;류지은;김은영;전성우
    • 환경영향평가
    • /
    • Vol. 17, No. 5
    • /
    • pp.279-288
    • /
    • 2008
  • Vertical forest distribution is one of the important factors for understanding various ecological mechanisms such as succession, disturbance, and environmental effects. LiDAR data provide information on both the horizontal and vertical distribution of forest structure. The laser scanner survey provided a point cloud in which the x, y, and z coordinates of the points are known. The objectives of this study were 1) to analyze factors of forest structure such as individual tree isolation, tree height, canopy closure, and tree density using LiDAR data and 2) to compare the forest structure between outer and interior forest. Individual trees were extracted using a watershed algorithm, and the first returns of the LiDAR data were interpolated to yield a digital surface model (DSM). The results show that the forest edge has more isolated individual trees, higher density, lower canopy closure, and lower tree height than the interior forest. LiDAR data are useful for analyzing forest structure. Further study that takes species into account should be undertaken for more accurate results.

Plan to Construct Tree Belt around Saemangeum Reclaimed Land - Analysis of Initial Growth Amount of Pinus thunbergii and Quercus serrata -

  • 김현
    • 한국환경복원기술학회지
    • /
    • Vol. 20, No. 1
    • /
    • pp.117-129
    • /
    • 2017
  • This research was conducted to construct a tree belt around the Saemangeum reclaimed land using various planting methods and to analyze initial growth, in order to provide practical data for constructing tree belts for various purposes. The tree species used were Pinus thunbergii and Quercus serrata, and the main planting treatments were categorized by the presence of a wind fence and by mixed versus un-mixed planting. Growth was analyzed using ANOVA, to compare growth among the experimental groups, and Duncan's multiple range test. The analysis by planting method showed that, in areas around the Saemangeum reclaimed land that require a mixed-species tree belt, the statistically best-supported approach is to install a 50% porous wind fence facing the direction of wind and frost and then plant P. thunbergii and Q. serrata. In areas where an un-mixed tree belt is required, it was appropriate to use P. thunbergii alone without a wind fence. Lastly, if the purpose of the tree belt is limited to rapid growth, it was most effective either to plant P. thunbergii alone (without a wind fence) or to install a 50% porous wind fence facing the direction of wind and frost and then plant P. thunbergii and Q. serrata. This research is based on the initial growth of the tree belt, and long-term monitoring of tree belt growth is needed to increase the planting success rate when establishing tree belts as part of Saemangeum internal development.

Feature-Based Image Retrieval using SOM-Based R*-Tree

  • Shin, Min-Hwa;Kwon, Chang-Hee;Bae, Sang-Hyun
    • 한국산학기술학회:학술대회논문집
    • /
    • 한국산학기술학회 2003년도 Proceeding
    • /
    • pp.223-230
    • /
    • 2003
  • Feature-based similarity retrieval has become an important research issue in multimedia database systems. The features of multimedia data are useful for discriminating between multimedia objects (e.g., documents, images, video, music scores). For example, images are represented by their color histograms, texture vectors, and shape descriptors, which are usually high-dimensional data. The performance of conventional multidimensional data structures (e.g., the R-tree family, K-D-B tree, grid file, TV-tree) tends to deteriorate as the number of dimensions of the feature vectors increases. The R*-tree is the most successful variant of the R-tree. In this paper, we propose a SOM-based R*-tree as a new indexing method for high-dimensional feature vectors. The SOM-based R*-tree combines a SOM and an R*-tree to achieve search performance that scales better to high dimensionalities. Self-Organizing Maps (SOMs) provide a mapping from high-dimensional feature vectors onto a two-dimensional space. The mapping preserves the topology of the feature vectors. The map is called a topological feature map; it preserves the mutual relationships (similarity) in the feature spaces of the input data, clustering mutually similar feature vectors in neighboring nodes. Each node of the topological feature map holds a codebook vector. A best-matching-image-list (BMIL) holds the similar images that are closest to each codebook vector. In a topological feature map, there are empty nodes to which no image is classified. When we build an R*-tree, we use the codebook vectors of the topological feature map with the empty nodes eliminated, because empty nodes cause unnecessary disk access and degrade retrieval performance. We experimentally compare the retrieval time cost of a SOM-based R*-tree with that of a SOM and an R*-tree using color feature vectors extracted from 40,000 images. The results show that the SOM-based R*-tree outperforms both the SOM and the R*-tree because it reduces the number of nodes required to build the R*-tree and the retrieval time cost.
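The best-matching-image-list step described in the abstract above can be sketched as follows, assuming the codebook vectors of an already trained SOM are given. The function name `build_bmil`, the two-dimensional vectors, and the image identifiers are hypothetical; SOM training and R*-tree construction are not shown.

```python
def build_bmil(codebook, features):
    """Map each node index to the images whose feature vector is closest to its codebook vector.

    Nodes that attract no image never appear in the result, which is how
    empty nodes drop out before indexing.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    bmil = {}
    for image_id, vec in features.items():
        node = min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], vec))
        bmil.setdefault(node, []).append(image_id)
    return bmil


# Hypothetical codebook and feature vectors, invented for illustration.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
features = {"img_a": (0.1, 0.2), "img_b": (0.9, 1.1), "img_c": (0.0, 0.1)}

bmil = build_bmil(codebook, features)
# Only the codebook vectors of non-empty nodes would be inserted into the R*-tree.
vectors_to_index = [codebook[i] for i in sorted(bmil)]
```

Here the third node attracts no image, so its codebook vector is left out of `vectors_to_index`; a query would then search the R*-tree for the nearest codebook vector and return that node's list.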


Performance Analysis of Tag Identification Algorithm in RFID System

  • 최호승;김재현
    • 대한전자공학회논문지TC
    • /
    • Vol. 42, No. 5
    • /
    • pp.47-54
    • /
    • 2005
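  • This paper proposes and analyzes a tag anti-collision algorithm for RFID systems. We mathematically compare and analyze the proposed anti-collision algorithm against existing binary algorithms (the binary search algorithm, the slotted binary tree algorithm using time slots, and the bit-by-bit binary tree algorithm proposed by the Auto-ID Center). The mathematical analysis was verified through OPNET simulation. According to the analysis, compared with the bit-by-bit binary tree algorithm, which performs best among the existing anti-collision algorithms, the proposed Improved bit-by-bit binary tree algorithm improves performance by about 304% when 20 tags respond to the reader's transmission request and by 839% when 200 tags respond.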
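As background for the tree-based anti-collision family compared in the abstract above, here is a minimal sketch of a generic prefix-splitting (query-tree style) identification walk that counts reader queries. It is not the paper's Improved bit-by-bit algorithm, whose details the abstract does not give; the function name, the unique fixed-length binary tag IDs, and the idealized collision detection are all assumptions.

```python
def identify_tags(tag_ids):
    """Identify tags by splitting on ID prefixes; return (identified ids, reader query count).

    Assumes unique, equal-length binary ID strings and a reader that can tell
    apart no reply, exactly one reply, and a collision.
    """
    identified, queries = [], 0
    pending = [""]  # prefixes the reader still has to query
    while pending:
        prefix = pending.pop()
        queries += 1
        responders = [t for t in tag_ids if t.startswith(prefix)]
        if len(responders) == 1:
            identified.append(responders[0])
        elif len(responders) > 1:
            # Collision: split the subtree into its two child prefixes.
            pending.append(prefix + "1")
            pending.append(prefix + "0")
    return identified, queries


tags = ["000", "010", "011", "110"]
found, query_count = identify_tags(tags)
```

For these four hypothetical tags the walk needs seven queries. Counting queries or transmitted bits per identified tag in this way is one simple basis for comparing such algorithms, although the paper's own performance metric may be defined differently.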

KDBcs-Tree: An Efficient Cache Conscious KDB-Tree for Multidimensional Data

  • 여명호;민영수;유재수
    • 한국정보과학회논문지:데이타베이스
    • /
    • Vol. 34, No. 4
    • /
    • pp.328-342
    • /
    • 2007
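  • In this paper, we propose an index technique for efficiently processing data updates in environments where updates are frequent. The proposed index structure is based on the KDB-tree, a representative space-partitioning index technique, and we propose data compression and pointer elimination techniques to increase cache utilization. To demonstrate the superiority of the proposed technique, we compared its performance experimentally with the CR-tree, a representative existing cache-conscious index structure. The evaluation shows that the proposed index structure improves insertion performance, update performance, and cache utilization by 85%, 97%, and 86%, respectively, over the existing index technique.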