• Title/Summary/Keyword: Tree data

High Utility Pattern Mining using a Prefix-Tree (Prefix-Tree를 이용한 높은 유틸리티 패턴 마이닝 기법)

  • Jeong, Byeong-Soo;Ahmed, Chowdhury Farhan;Lee, In-Gi;Yong, Hwan-Seong
    • Journal of KIISE:Databases / v.36 no.5 / pp.341-351 / 2009
  • High utility pattern (HUP) mining has recently become one of the most important research issues in data mining because it can take the different weight values of items into account. However, existing mining algorithms suffer from performance degradation because they cannot easily apply the Apriori principle to pattern mining. In this paper, we introduce a new high utility pattern mining approach that uses a prefix-tree, as in the FP-Growth algorithm. Our approach stores the weight value of each item in a node and uses these values to prune unnecessary patterns. We compare the performance characteristics of three different prefix-tree structures. Through thorough experimentation, we also show that our approach yields some performance improvement.
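A minimal sketch of how utility values stored in prefix-tree nodes can support pruning, assuming transaction-weighted utility (TWU) as the pruning bound. The abstract does not say which bound or which of the three tree structures is used, so the TWU choice, the item ordering, the toy transactions, the weights, and all names (`Node`, `build_prefix_tree`, `min_utility`) are illustrative assumptions rather than the authors' method.

```python
from collections import defaultdict

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 2, "b": 1},
    {"a": 1, "c": 3},
    {"b": 2, "c": 1, "d": 1},
]
# Hypothetical per-item weights (for example, unit profits).
weights = {"a": 5, "b": 2, "c": 1, "d": 10}


def transaction_utility(tx):
    """Utility of one transaction: sum of quantity * weight over its items."""
    return sum(qty * weights[item] for item, qty in tx.items())


def twu_per_item(txs):
    """TWU of an item: total utility of all transactions that contain it."""
    twu = defaultdict(int)
    for tx in txs:
        tu = transaction_utility(tx)
        for item in tx:
            twu[item] += tu
    return dict(twu)


class Node:
    """Prefix-tree node holding an item and the utility accumulated through it."""

    def __init__(self, item=None):
        self.item = item
        self.twu = 0
        self.children = {}


def build_prefix_tree(txs, min_utility):
    """Drop items whose TWU is below min_utility, then insert each transaction."""
    twu = twu_per_item(txs)
    keep = {item for item, u in twu.items() if u >= min_utility}
    root = Node()
    for tx in txs:
        tu = transaction_utility(tx)
        # Assumed ordering: descending TWU, ties broken by item name.
        path = sorted((i for i in tx if i in keep), key=lambda i: (-twu[i], i))
        node = root
        for item in path:
            node = node.children.setdefault(item, Node(item))
            node.twu += tu
    return root, twu


root, twu = build_prefix_tree(transactions, min_utility=20)
print(twu)                   # per-item TWU values
print(sorted(root.children)) # items that start a path after pruning
```

With these toy values, item `d` falls below the threshold and never enters the tree, which is the kind of early pruning that the stored per-node values make possible.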

An Acceleration Technique of Terrain Rendering using GPU-based Chunk LOD (GPU 기반의 묶음 LOD 기법을 이용한 지형 렌더링의 가속화 기법)

  • Kim, Tae-Gwon;Lee, Eun-Seok;Shin, Byeong-Seok
    • Journal of Korea Multimedia Society / v.17 no.1 / pp.69-76 / 2014
  • It is hard to render massive terrain data in real time, even with recent graphics hardware. To process massive terrain data, mesh simplification methods such as continuous Level-of-Detail are commonly used. However, existing GPU-based methods that use a quad-tree structure, such as geometry splitting, produce many vertices while traversing the quad-tree and retransmit those vertices to the GPU in each tree traversal. They also have the disadvantage of increased tree size, since they construct the tree structure using textures. To solve these problems, we propose a GPU-based chunked LOD technique for real-time terrain rendering. We restrict the depth of the tree search and generate chunks with the tessellator on the GPU. With our method, the terrain can be rendered efficiently by generating the chunks on the GPU, and the computing time for tree traversal is reduced.

Effects of Packet-Scatter on TCP Performance in Fat-Tree (Fat-Tree에서의 패킷분산이 TCP 성능에 미치는 영향)

  • Lim, Chansook
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.12 no.6 / pp.215-221 / 2012
  • To address the bottleneck problem in data center networks, there have been several proposals for network architectures that provide high path diversity. In devising new schemes to utilize multiple paths, one must consider the effects on TCP performance, because packet reordering can make TCP perform poorly. Therefore, most schemes prevent packet reordering by sending the packets of a flow through only one of the multiple available paths. In this study, we show that when packets are scattered through all available paths between a pair of hosts in a Fat-Tree, packet reordering is not severe enough to have a significant impact on TCP performance. Simulation results imply that it is possible to find a low-cost solution to the TCP performance problem for Fat-Tree-like topologies.
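A small sketch contrasting single-path forwarding with per-packet scattering, assuming the standard k-ary fat-tree layout (k pods, k/2 edge and k/2 aggregation switches per pod, (k/2)^2 core switches). The abstract does not specify the topology parameters or the scattering policy, so the host tuple format, the CRC-based flow hash, the round-robin scatter, and all function names are hypothetical.

```python
import itertools
import zlib


def equal_cost_paths(k, src, dst):
    """Number of shortest paths between two hosts in a k-ary fat-tree.

    Hosts are (pod, edge_switch, port) tuples (an assumed addressing scheme).
    """
    if src[0] != dst[0]:
        return (k // 2) ** 2  # different pods: one path per core switch
    if src[1] != dst[1]:
        return k // 2         # same pod: one path per aggregation switch
    return 1                  # same edge switch


def flow_hash_path(flow_id, n_paths):
    """Single-path forwarding: every packet of a flow maps to the same path."""
    return zlib.crc32(flow_id.encode()) % n_paths


def scatter_paths(n_packets, n_paths):
    """Packet scatter (round-robin variant): successive packets use successive paths."""
    rr = itertools.cycle(range(n_paths))
    return [next(rr) for _ in range(n_packets)]


n = equal_cost_paths(4, (0, 0, 0), (1, 0, 0))
print(n)                                  # paths available between the two pods
print(flow_hash_path("10.0.0.2->10.1.0.2", n))
print(scatter_paths(6, n))
```

Scattering uses every available path but lets packets of one flow arrive out of order, which is the reordering effect the study measures.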

Breast Cancer Diagnosis using Naive Bayes Analysis Techniques (Naive Bayes 분석기법을 이용한 유방암 진단)

  • Park, Na-Young;Kim, Jang-Il;Jung, Yong-Gyu
    • Journal of Service Research and Studies / v.3 no.1 / pp.87-93 / 2013
  • Breast cancer is known as a disease that is common in developed countries. In recent years, however, its incidence among Korean women has increased steadily. Breast cancer is generally known to occur in women over 50, but in Korea the incidence among younger women in their 40s has been rising more steadily than in the West. It is therefore an urgent task to establish guidelines for the accurate diagnosis of breast cancer in adult women in Korea. In this paper, we show how data mining techniques can be used to predict breast cancer. Data mining refers to the process of finding regular patterns or relationships among variables in a database; by applying sophisticated analysis with a model, one can find useful information that is not easily revealed. In our experiments, the Decision Tree and Naive Bayes analysis techniques were compared for diagnosing breast cancer, with C4.5 applied as the decision tree algorithm. The Decision Tree classification accuracy was fairly good, but the Naive Bayes classification method showed better accuracy than the Decision Tree method.
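To illustrate the Naive Bayes technique named above, here is a minimal categorical classifier with Laplace smoothing. It is a generic textbook sketch, not the study's pipeline: the two features, their values, and the six training rows are invented for illustration and carry no clinical meaning.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    feature_counts = Counter()   # key: (feature_index, value, label)
    values = defaultdict(set)    # distinct values seen for each feature
    for row, label in zip(rows, labels):
        for j, v in enumerate(row):
            feature_counts[(j, v, label)] += 1
            values[j].add(v)
    return class_counts, feature_counts, values


def predict(model, row):
    """Return the class with the highest log posterior under the independence assumption."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)
        for j, v in enumerate(row):
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log((feature_counts[(j, v, label)] + 1) / (n + len(values[j])))
        if score > best_score:
            best, best_score = label, score
    return best


# Hypothetical toy data: (size, margin) -> label.
rows = [
    ("large", "irregular"), ("large", "irregular"), ("small", "smooth"),
    ("small", "smooth"), ("large", "smooth"), ("small", "irregular"),
]
labels = ["malignant", "malignant", "benign", "benign", "malignant", "benign"]

model = train_naive_bayes(rows, labels)
print(predict(model, ("large", "irregular")))
print(predict(model, ("small", "smooth")))
```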

Individual Tree Growth Models for Natural Mixed Forests in Changbai Mountains, Northeast China

  • Lu, Jun;Li, Fengri
    • Journal of Korean Society of Forest Science / v.96 no.2 / pp.160-169 / 2007
  • The data used to develop distance-independent individual-tree models for natural mixed forests were collected from 712 remeasured permanent sample plots (25,526 trees) over a 10-year period from 1990 to 2000 in the Baihe Forest Bureau of the Changbai Mountains, northeast China. Based on an analysis of the relationship between the diameter increment of individual trees and tree size, competitive status, and site condition, diameter growth models for individual trees of 15 species growing in mixed-species, uneven-aged forest stands were developed using stepwise regression; the models have a simple form, good predictive precision, and are easy to apply. The main variables influencing the diameter increment of individual trees were tree size and competition, whereas site conditions were not significantly related to diameter increment. The tree size variables (lnDBH and $DBH^2$) were the most significant and important predictors of diameter growth and appear in all 15 growth models. The diameter increment was directly proportional to tree diameter for each species. Among the competition factors in the growth models, relative diameter (RD), canopy closure (P), and the ratio of the diameter of the subject tree to the maximum diameter (DDM) contributed to the diameter increment to a certain extent. Other measures of stand density, such as stand basal area (G) and stand density index (SDI), did not significantly influence diameter increment. Site factors, such as site index, slope, and aspect, were not important to diameter increment and were excluded from the final models. The total variance explained ($R^2$) by the final models of squared diameter increment for all 15 species ranged from 35% to 72%, and these results compare quite closely with those of Wykoff (1990) for mixed conifer stands. Validation measures were evaluated on an independent data set for the diameter increment models developed in this study. The results indicated that the estimated precision was greater than 94% in all cases and that the models were suitable for describing diameter increment.

Environmental Factors Influencing Tree Species Regeneration in Different Forest Stands Growing on a Limestone Hill in Phrae Province, Northern Thailand

  • Asanok, Lamthai;Marod, Dokrak
    • Journal of Forest and Environmental Science / v.32 no.3 / pp.237-252 / 2016
  • Improved knowledge of the environmental factors affecting the natural regeneration of tree species in limestone forest is urgently required for species conservation. We examined the environmental factors and tree species characteristics that are important for colonization in diverse forest stands growing on a limestone hill in northern Thailand. Our analysis estimated the relative influence of forest structure and environmental factors on the regeneration traits of tree species. We established sixty-four $100-m^2$ plots in four forest stands on the limestone hill. We determined the species composition of canopy trees, regenerating seedlings, and saplings in relation to the physical environment. The relationships between environmental variables and tree species abundance were assessed by canonical correspondence analysis (CCA), and we used generalized linear mixed models to examine data on seedling/sapling abundances. The CCA ordination indicated that the abundance of tree species within the mixed deciduous forest was closely related to soil depth. The abundances of tree species growing within the sink-hole and hill-slope stands were positively related to the extent of rocky outcropping; light and soil moisture positively influenced the abundance of tree species in the hill-cliff stand. Physical factors had a greater effect on tree regeneration than did factors related to forest structure. Tree species, such as Ficus macleilandii, Dracaena cochinchinensis, and Phyllanthus mirabilis within the hill-cliff or sink-hole stand, colonized well on large rocky outcroppings that were well illuminated and had soft soils. These species regenerated well under conditions prevailing on the limestone hill. The colonization of several species in other stands was negatively influenced by environmental conditions at these sites. We found that natural regeneration of tree species on the limestone hill was difficult because of the prevailing combination of physical and biological factors. The influence of these factors was species dependent, and the magnitude of effects varied across forest stands.

Unusual data local access using inverse order tree (역순트리를 이용한 특이데이터 국소적 접근)

  • Rim, Kwang-Cheol;Seol, Jung-Ja
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.3 / pp.595-601 / 2014
  • With the advent of the smart information and communication era, the amount of data has increased exponentially. Accordingly, determining and analyzing the area and circumstances in which data were created has become one of the factors that enable prompt action. This paper describes how to analyze data by building a route from the lowest module to the highest one in inverse order, so that a partial judgment can be made for particular data. The script first performs cluster analysis, parallelizes the analysis in a tree structure using the sum of the factors of each cluster, and finally converts the answer into a number. It is also designed to give priority to a particular answer and then derive the desired answer in real time.

Development of Customized Strategy for Enhancing Automobile Repurchase Using Data Mining Techniques (자동차 재구매 증진을 위한 데이터 마이닝 기반의 맞춤형 전략 개발)

  • Lee, Dong-Wook;Choi, Keun-Ho;Yoo, Dong-Hee
    • The Journal of Information Systems / v.26 no.3 / pp.47-61 / 2017
  • Purpose: Although automobile production has increased with the development of the Korean automobile industry, the number of customers who can purchase automobiles has decreased in relative terms. Automobile companies therefore need to develop strategies to attract customers and promote repurchase behavior. To this end, this paper analyzed customer data from a Korean automobile company using data mining techniques to derive repurchase strategies. Design/methodology/approach: We conducted under-sampling to balance the collected data and generated 10 datasets. We then implemented prediction models by applying decision tree, naive Bayesian, and artificial neural network algorithms to each of the datasets. As a result, we derived 10 patterns consisting of 11 variables affecting customers' repurchase decisions from the decision tree algorithm, which yielded the best accuracy. Using the derived patterns, we proposed strategies for improving repurchase rates. Findings: From the top 10 repurchase patterns, we found that 1) repurchases in January are associated with a specific residential region, 2) repurchases in spring or autumn are associated with whether it is a weekend or not, 3) repurchases in summer are associated with whether the automobile is equipped with a sunroof or not, and 4) a customized promotion for a specific occupation increases the number of repurchases.
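The under-sampling step described above can be sketched as follows, assuming that each of the 10 datasets is an independent random draw in which the majority class is cut down to the size of the minority class. The abstract does not state how the 10 datasets were generated, so the seeding scheme, the record layout, the 90/10 class split, and the function name `undersample` are hypothetical.

```python
import random


def undersample(records, label_of, seed):
    """Return a class-balanced sample by randomly dropping majority-class records."""
    rng = random.Random(seed)
    by_class = {}
    for record in records:
        by_class.setdefault(label_of(record), []).append(record)
    # Every class is reduced to the size of the smallest class.
    n = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, n))
    rng.shuffle(balanced)
    return balanced


# Hypothetical data: 10 repurchasing customers and 90 non-repurchasing ones.
records = [{"id": i, "repurchase": i < 10} for i in range(100)]

# Ten balanced datasets, one per seed (an assumed way to obtain 10 datasets).
datasets = [undersample(records, lambda r: r["repurchase"], seed) for seed in range(10)]
print(len(datasets), len(datasets[0]))
```

Training a model on each balanced dataset and comparing results across them reduces the chance that one unlucky draw of majority-class records drives the conclusions.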

Development of an Expert System for Prevention of Industrial Accidents in Manufacturing Industries (제조업에서의 산업재해 예방을 위한 전문가 시스템 개발)

  • Leem, Young-Moon;Choi, Yo-Han
    • Journal of the Korea Safety Management & Science / v.8 no.1 / pp.53-64 / 2006
  • Much research and analysis has focused on industrial accidents in order to predict and reduce them. As a similar endeavor, this paper aims to develop an expert system for the prevention of industrial accidents. Although various previous studies have been performed to prevent industrial accidents, they only provide managerial and educational policies using frequency analysis and comparative analysis based on data from past industrial accidents. As an initial step toward the purpose of this study, this paper provides a comparative analysis of four algorithms: CHAID, CART, C4.5, and QUEST. Decision tree algorithms, a typical data mining technique, are used to predict results from objective and quantified data. Enterprise Miner of SAS and Answer Tree of SPSS will be used to evaluate the validity of the results of the four algorithms. The sample for this work was chosen from 10,536 records related to manufacturing industries over three years (2002-2004) in Korea. The initial sample includes a range of different businesses, including the construction and manufacturing industries, which are typically vulnerable to industrial accidents.

A Feature Analysis of Industrial Accidents Using C4.5 Algorithm (C4.5 알고리즘을 이용한 산업 재해의 특성 분석)

  • Leem, Young-Moon;Kwag, Jun-Koo;Hwang, Young-Seob
    • Journal of the Korean Society of Safety / v.20 no.4 s.72 / pp.130-137 / 2005
  • The decision tree algorithm is a data mining technique that groups or predicts by dividing a group of interest into several sub-groups. This technique can analyze the characteristics of each group and can be used to detect differences between types of industrial accidents. This paper uses the C4.5 algorithm for the feature analysis. The data set consists of 24,887 records selected from a total of 25,159 taken from two years of observation of industrial accidents in Korea. For the purpose of this paper, one target value and eight independent variables are specified by type of industrial accident. After grouping, there are 222 tree nodes in total and 151 leaf nodes. This paper reports an acceptable level of accuracy (%) and error rate (%) as measures of the accuracy of the created trees. The objective of this paper is to analyze the efficiency of the C4.5 algorithm in classifying types of industrial accident data and thereby identify potential weak points in disaster risk grouping.
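C4.5 chooses each split by gain ratio, that is, information gain divided by the entropy of the split itself. The sketch below computes that criterion for one categorical attribute; the attribute values and accident-type labels are invented toy data, not the study's variables, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of the value distribution in a list."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the categorical attribute `values`."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many small partitions.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records: machine type vs. accident type.
machine = ["press", "press", "lathe", "lathe"]
shift = ["day", "night", "day", "night"]
accident = ["caught", "caught", "cut", "cut"]

print(gain_ratio(machine, accident))  # attribute that separates the classes
print(gain_ratio(shift, accident))    # attribute that carries no information
```

A tree builder would evaluate this score for each of the independent variables at every node and split on the highest-scoring one.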