• Title/Summary/Keyword: Data Tree

On the Tree Model grown by one-sided purity (단측 순수성에 의한 나무모형의 성장에 대하여)

  • 김용대;최대우
    • Journal of Intelligence and Information Systems / v.7 no.1 / pp.17-25 / 2001
  • Tree models are the most popular classification algorithms in data mining because their results are easy to interpret. In CART (Breiman et al., 1984) and C4.5 (Quinlan, 1993), which are representative tree algorithms, splitting for classification proceeds so as to attain terminal nodes that are homogeneous with respect to the composition of levels of the target variable. However, in churn prediction modeling for CRM (Customer Relationship Management), for instance, the churn rate is generally very low even though the churners are the group of interest. It is therefore difficult to obtain accurate prediction models using a tree model based on a traditional split rule such as Gini or deviance. Buja and Lee (1999) introduced a new split rule, one-sided purity, for classifying a minority group of interest. In this paper, we compare one-sided purity with the traditional deviance split rule by analyzing churning vs. non-churning data from an ISP company. We also review the results of tree models based on one-sided purity on simulated data and discuss problems and topics for further research.
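
The split rules named in this abstract can be illustrated with a small sketch. The Gini impurity and the deviance below follow their standard two-class definitions. The `one_sided_purity` function is only an assumed simplification (it scores a split by the higher churner share among the two children) and is not taken from the paper or from Buja and Lee; all function names and the counts in the example are hypothetical.

```python
import math


def gini(pos, neg):
    """Gini impurity 2p(1-p) of a node holding pos/neg class counts."""
    n = pos + neg
    if n == 0:
        return 0.0
    p = pos / n
    return 2.0 * p * (1.0 - p)


def deviance(pos, neg):
    """Node deviance: -2 * sum over classes of n_k * log(n_k / n)."""
    n = pos + neg
    total = 0.0
    for k in (pos, neg):
        if k > 0:  # an empty class contributes nothing
            total -= 2.0 * k * math.log(k / n)
    return total


def weighted_gini(left, right):
    """Size-weighted average Gini impurity of the two children (lower is better)."""
    n_left, n_right = sum(left), sum(right)
    return (n_left * gini(*left) + n_right * gini(*right)) / (n_left + n_right)


def split_deviance(left, right):
    """Total deviance of the two children (lower is better)."""
    return deviance(*left) + deviance(*right)


def one_sided_purity(left, right):
    """ASSUMPTION: score a split by the purer child only, i.e. the higher
    share of the class of interest (first count) among the two children."""
    return max(left[0] / sum(left), right[0] / sum(right))


if __name__ == "__main__":
    # Hypothetical node with 50 churners and 950 non-churners.
    # Each child is written as (churners, non_churners).
    split_a = ((40, 460), (10, 490))  # two large children
    split_b = ((20, 5), (30, 945))    # one small, churner-rich child
    for name, (left, right) in (("A", split_a), ("B", split_b)):
        print(name,
              round(weighted_gini(left, right), 4),
              round(split_deviance(left, right), 2),
              round(one_sided_purity(left, right), 3))
```

Averaged criteria such as `weighted_gini` reward splits that improve both children, whereas a one-sided score looks only at the best child, which is why it can favor small pockets of the rare class.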

Case Study of CRM Application Using Improvement Method of Fuzzy Decision Tree Analysis (퍼지의사결정나무 개선방법을 이용한 CRM 적용 사례)

  • Yang, Seung-Jeong;Rhee, Jong-Tae
    • The Journal of the Korea Contents Association / v.7 no.8 / pp.13-20 / 2007
  • The decision tree is one of the most useful analysis methods for various data mining functions on massive data, including prediction and classification. A decision tree grows by splitting nodes, and purity increases with each split. Splitting should stop when purity no longer increases effectively or when new leaves do not contain a meaningful number of records. A branch is pruned if it does not reach a certain level of performance. Pruning changes the structure of the decision tree and implies that the earlier split of the parent node was not effective. It also implies that the splits of the ancestor nodes were not effective and that the attributes and criteria chosen to split them were not successful. New attributes or criteria might therefore be selected to split such nodes in a better attempt. In this paper, we suggest a procedure that modifies the decision tree by integrating fuzzy theory with splitting.

Garbage Collection Technique for Non-volatile Memory by Using Tree Data Structure (트리 자료구조를 이용한 비 휘발성 메모리의 가비지 수집 기법)

  • Lee, Dokeun;Won, Youjip
    • Journal of KIISE / v.43 no.2 / pp.152-162 / 2016
  • Most traditional garbage collectors use language-level metadata designed for pointer-type searching. However, because it is difficult to use this metadata in non-volatile memory allocation platforms, a new garbage collection technique is essential for non-volatile memory utilization. In this paper, we design new metadata, called the "Allocation Tree", for managing information about non-volatile memory allocation. This metadata consists of a tree data structure for fast information lookup, in which each node holds an allocation address and object ID pair in key-value form. The garbage collector starts collecting when non-volatile memory space is insufficient, and it compares user data with the allocation tree to detect garbage. We implement this algorithm as a demonstration in a persistent-heap-based non-volatile memory allocation platform called "HEAPO".
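
A minimal sketch of the idea in this abstract, under stated assumptions: the tree is a plain unbalanced binary search tree, the allocation address is taken as the key and the object ID as the value, and "user data" is reduced to a collection of addresses that are still referenced. None of these choices, nor the names `AllocationTree` and `find_garbage`, come from the paper or from HEAPO.

```python
class Node:
    """One allocation record: address is the key, object_id the value (assumed)."""

    def __init__(self, address, object_id):
        self.address = address
        self.object_id = object_id
        self.left = None
        self.right = None


class AllocationTree:
    """Unbalanced binary search tree keyed by allocation address."""

    def __init__(self):
        self.root = None

    def insert(self, address, object_id):
        if self.root is None:
            self.root = Node(address, object_id)
            return
        node = self.root
        while True:
            if address == node.address:
                node.object_id = object_id  # overwrite an existing record
                return
            side = "left" if address < node.address else "right"
            child = getattr(node, side)
            if child is None:
                setattr(node, side, Node(address, object_id))
                return
            node = child

    def lookup(self, address):
        """Return the object ID stored for address, or None if absent."""
        node = self.root
        while node is not None:
            if address == node.address:
                return node.object_id
            node = node.left if address < node.address else node.right
        return None

    def items(self):
        """Yield (address, object_id) pairs in ascending address order."""
        stack, node = [], self.root
        while stack or node is not None:
            while node is not None:
                stack.append(node)
                node = node.left
            node = stack.pop()
            yield node.address, node.object_id
            node = node.right


def find_garbage(tree, live_addresses):
    """Return records in the tree whose address is not referenced by user data."""
    live = set(live_addresses)
    return [(addr, oid) for addr, oid in tree.items() if addr not in live]


if __name__ == "__main__":
    tree = AllocationTree()
    for addr, oid in ((0x3000, 3), (0x1000, 1), (0x2000, 2)):
        tree.insert(addr, oid)
    print(find_garbage(tree, [0x1000, 0x3000]))
```

A production design would likely use a balanced tree and persist the nodes in non-volatile memory; the sketch only shows the comparison between recorded allocations and live references.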

Intelligent Production Management System with the Enhanced PathTree (개선된 패스트리를 이용한 지능형 생산관리 시스템)

  • Kwon, Kyung-Lag;Ryu, Jae-Hwan;Sohn, Jong-Soo;Chung, In-Jeong
    • The KIPS Transactions:PartD / v.16D no.4 / pp.621-630 / 2009
  • In recent years, there have been many attempts to connect the latest RFID (Radio Frequency Identification) technology with EIS (Enterprise Information System) and to utilize them together. In most cases, however, the focus is only on the simultaneous multiple-reading capability of RFID technology, and the management of the massive data created by the readers is neglected. As a result, it is difficult to obtain time-related information, such as flow prediction and analysis, in process control. In this paper, we suggest a new method called 'procedure tree', an enhanced and complementary version of PathTree, which is one of the RFID data mining techniques, to manage massive RFID data sets effectively and to perform real-time process control efficiently. We evaluate the efficiency of the proposed system after applying it to a real-time process management system connected with the RFID-based EIS. With the suggested method, tasks such as prediction or tracking of process flow for real-time process control and inventory management can be performed efficiently, which existing RFID-based production systems could not do.

Analysis of employee's satisfaction factor in working environment using data mining algorithm (데이터 마이닝 기법을 이용한 피고용자의 근로환경 만족도 요인 분석)

  • Lee, Dong Ryeol;Kim, Tae Ho;Lee, HongChul
    • Journal of the Korea Safety Management & Science / v.16 no.4 / pp.275-284 / 2014
  • The decision tree is an analysis technique that groups a population of interest into several subgroups and makes predictions for them. Because its results are easy to read, researchers can understand and explain the process more easily than with other techniques. This paper applies the CART algorithm, a data mining technique, to the working environment survey conducted by the Korea Occupational Safety and Health Agency (KOSHA), which contains 273 variables and 70,094 records (2010-2011). After refinement, the final data set has 12 variables and 35,447 records. To find satisfaction factors in the working environment, this paper divides employees into three age groups (under 30, 30 to 49, and over 50) and analyzes the factors for each. The CART algorithm is used to find the best grouping variables in 155 data. 'Comfortable in organization' and 'proper reward' appeared to be the best grouping factors.

Cyber Shopping Mall Customer Segmentation

  • Koh, Bong-Sung;Kim, Yeon-Hyong
    • Journal of the Korean Data and Information Science Society / v.13 no.1 / pp.121-127 / 2002
  • The volume of electronic commerce based on the Internet and network traffic is increasing rapidly. The objective of this study is to examine the current status of the exponentially multiplying cyber-shopping mall phenomenon. To this end, data obtained from a single cyber-shopping mall were used to illustrate customer purchasing behavior, and decision tree and correspondence analyses were applied to derive a segmentation of customers and merchandise.

Development of Discriminant Analysis System by Graphical User Interface of Visual Basic

  • Lee, Yong-Kyun;Shin, Young-Jae;Cha, Kyung-Joon
    • Journal of the Korean Data and Information Science Society / v.18 no.2 / pp.447-456 / 2007
  • Recently, multivariate statistical analysis has been used to extract meaningful information from various kinds of data. In this paper, we develop a multivariate statistical analysis system that combines Fisher discriminant analysis, logistic regression, neural networks, and decision trees, using Visual Basic 6.0.

Symbolic tree based model for HCC using SNP data (악성간암환자의 유전체자료 심볼릭 나무구조 모형연구)

  • Lee, Tae Rim
    • Journal of the Korean Data and Information Science Society / v.25 no.5 / pp.1095-1106 / 2014
  • Symbolic data analysis (SDA) extends data mining and exploratory data analysis to knowledge mining. We propose an SDA tree model for clinical and genomic data based on this new knowledge mining approach. Applying SDA to large genomic SNP data, we obtain correlations and show that the hidden structure of HCC data can be understood. We confirm the validity of applying SDA to a tree-structured progression model and to quantifying clinical lab data and SNP data for early diagnosis of HCC. The proposed model is a representative model of HCC survival time and its causal association with SNP gene data. The aim is to fit a simple, easily interpreted tree-structured survival model that can be reduced from large clinical and genomic data under the new statistical theory of knowledge mining with SDA.

Development of Boolean Operations for CAD System Kernel Supporting Non-manifold Models (비다양체 모델을 수용하는 CAD 시스템 커널을 위한 불리안 조직의 개발)

  • 김성환;이건우;김영진
    • Korean Journal of Computational Design and Engineering / v.1 no.1 / pp.20-32 / 1996
  • A boundary evaluation technique has been developed for Boolean operations on non-manifold models; Boolean operations are regarded as the most popular and powerful method for creating and modifying 3-D CAD models. This technique adopts the concept of Merge and Selection, in which the CSG tree for a Boolean operation can be edited quickly and easily. In this method, the merged set, which contains complete information about the primitive models involved, is created by merging primitives one by one; the alive entities are then selected following the given CSG tree. This technique can support the hybrid representation of B-rep (Boundary Representation) and CSG (Constructive Solid Geometry) tree in a unified non-manifold model data structure, and it is expected to serve as a basic method for many modeling problems, such as data representation of form features and the interference between them, and data representation of conceptual models in the design process.

Environmental Predictors of Atopic Dermatitis in Children - Using Answer Tree Analysis - (아동 아토피 피부염을 예측하는 환경적 요인들 - 의사결정 나무분석의 적용 -)

  • Lee, Ju-Lie
    • Korean Journal of Child Studies / v.31 no.2 / pp.183-195 / 2010
  • This study sought to investigate the environmental predictors of atopic dermatitis in children. The participants were 1,050 children (ages 3-5) drawn from data from the Ministry for Health, Welfare and Family Affairs. A data mining decision tree model revealed that medical neglect, breakfast, attachment to mother, and mother's depression influenced atopic dermatitis in children. Among these factors, medical neglect had the greatest influence on atopic dermatitis in children.