• Title/Summary/Keyword: Data Mining Process

Development of a Knowledge Discovery System using Hierarchical Self-Organizing Map and Fuzzy Rule Generation

  • Koo, Taehoon;Rhee, Jongtae
    • Proceedings of the Korea Intelligent Information System Society Conference / 2001.01a / pp.431-434 / 2001
  • Knowledge discovery in databases (KDD) is the process of extracting valid, novel, potentially useful, and understandable knowledge from real data. There are many academic and industrial activities involving new technologies and application areas. In particular, data mining is the core step of the KDD process and comprises many algorithms that perform clustering, pattern recognition, and rule induction. The main goals of these algorithms are prediction and description. Prediction means estimating unknown variables. Description is concerned with presenting understandable results in a format compatible with human users. We introduce an efficient data mining algorithm that takes both predictive and descriptive capability into account. Reasonable patterns are derived from real-world data by a revised neural network model, and a proposed fuzzy rule extraction technique is applied to obtain understandable knowledge. The proposed neural network model is a hierarchical self-organizing system. The rule base is compatible with decision makers' perception because the generated fuzzy rule set reflects the human information process. Results from a real-world application are analyzed to evaluate the system's performance.

Analyzing Customer Management Data by Data Mining: Case Study on Churn Prediction Models for Insurance Company in Korea

  • Cho, Mee-Hye;Park, Eun-Sik
    • Journal of the Korean Data and Information Science Society / v.19 no.4 / pp.1007-1018 / 2008
  • The purpose of this case study is to demonstrate database-marketing management. First, we explore the original variables in the insurance customer data, modify them where necessary, and perform variable selection before analysis. Then we develop churn prediction models using logistic regression, a neural network, and an SVM. Finally, we compare these three data mining models in terms of misclassification rate.

A Study on the Method for Extracting the Purpose-Specific Customized Information from Online Product Reviews based on Text Mining (텍스트 마이닝 기반의 온라인 상품 리뷰 추출을 통한 목적별 맞춤화 정보 도출 방법론 연구)

  • Kim, Joo Young;Kim, Dong soo
    • The Journal of Society for e-Business Studies / v.21 no.2 / pp.151-161 / 2016
  • In the era of Web 2.0, which is characterized by openness, sharing, and participation, it is easy for internet users to produce and share data. The amount of unstructured data, which makes up most of the digital world's data, has increased exponentially. Personal online product reviews are one kind of unstructured data, and they matter both to the companies that produce the products and to potential customers who are interested in them. Extracting useful information from large amounts of scattered review data requires a process of collecting, storing, preprocessing, and analyzing the data and then drawing conclusions. We therefore introduce a text-mining methodology, implemented in R, that applies natural language processing to text data such as product reviews in order to extract structured data. We also introduce a data-mining step that derives purpose-specific customized information from the structured review information produced by the text mining.
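The preprocessing step described in this abstract, turning free-text reviews into structured data, can be illustrated with a minimal sketch. The paper itself uses R; the Python code below is only an illustration, and the sample reviews, the stop-word list, and the function names `tokenize`, `term_frequencies`, and `corpus_counts` are all hypothetical rather than taken from the paper.

```python
import re
from collections import Counter

# Hypothetical review snippets; the paper's data are not shown in the abstract.
REVIEWS = [
    "Battery life is great, but the screen is dim.",
    "Great screen and great battery!",
]

# Small illustrative stop-word list (an assumption, not from the paper).
STOP_WORDS = {"is", "the", "but", "and", "a", "an"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def term_frequencies(reviews):
    """Return one Counter of term counts per review (a tiny document-term table)."""
    return [Counter(tokenize(review)) for review in reviews]


def corpus_counts(reviews):
    """Sum the per-review term counts over the whole collection."""
    total = Counter()
    for counts in term_frequencies(reviews):
        total.update(counts)
    return total


if __name__ == "__main__":
    for term, count in corpus_counts(REVIEWS).most_common(3):
        print(term, count)
```

The per-review counters play the role of the structured data that a later data-mining step could consume; a real pipeline would likely add stemming and a fuller stop-word list.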

Implementation of Data Preparation System for Data Mining on Heterogeneous Distributed Environment (이기종 분산환경에서 데이터마이닝을 위한 데이터준비 시스템 구현)

  • Lee sang hee;Lee won sup
    • Journal of the Korea Society of Computer and Information / v.9 no.3 / pp.109-113 / 2004
  • This paper investigates the efficiency of the data preparation process in existing data mining tools and presents a design principle for a new, efficient data preparation system. We compare commonly used data mining tools in terms of how they access local and remote databases and how they exchange information resources between different computers. The tools compared are Answer Tree, Clementine, Enterprise Miner, and Weka. We propose a design principle for an efficient data preparation system for data mining on distributed networks.

Development of Evaluation Model in Business Incubator Using Data Mining Process (데이터마이닝을 이용한 창업보육센터의 평가모델 개발)

  • Lee, Dong-Youb;Kim, Jin-Wook
    • IE interfaces / v.20 no.3 / pp.387-394 / 2007
  • Numerous countries promote business programs to revitalize the local economy, increase employment, and nurture high-tech industries. Recently, a number of business incubators have been established and operated in Korea with the aim of adapting to a changing environment and increasing economic competitiveness. For governmental policy to produce satisfactory results, there is a growing need for an evaluation model that supports the effective operation of business incubators using objective and rational criteria. The purpose of this study is to develop an evaluation model for business incubators using the data mining process. We propose an evaluation model for business incubators, 'Score-5 RS', which consists of a process that builds evaluation factors using a weighted sum and 5-grade classification, and an analysis process that uses a decision tree algorithm.
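The weighted-sum and 5-grade classification step mentioned in this abstract can be sketched as follows. This is a minimal illustration only: the factor names, weights, and grade cutoffs are hypothetical, the abstract does not give the actual 'Score-5 RS' criteria, and the decision tree analysis is not shown.

```python
def weighted_score(factors, weights):
    """Combine factor values (assumed to be on a 0-100 scale) into one weighted sum."""
    if factors.keys() != weights.keys():
        raise ValueError("factors and weights must use the same names")
    return sum(factors[name] * weights[name] for name in factors)


def five_grade(score, cutoffs=(20, 40, 60, 80)):
    """Map a score to a grade from 1 (lowest) to 5 (highest).

    The cutoffs are illustrative assumptions, not values from the paper.
    """
    return 1 + sum(score >= cutoff for cutoff in cutoffs)


if __name__ == "__main__":
    # Hypothetical evaluation factors for one business incubator.
    factors = {"occupancy": 80, "graduation": 60, "funding": 50}
    weights = {"occupancy": 0.5, "graduation": 0.3, "funding": 0.2}
    score = weighted_score(factors, weights)
    print(round(score, 2), five_grade(score))
```

Grades produced this way could then serve as the class label for a decision tree, which is how the abstract describes the analysis stage.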

Dynamic knowledge mapping guided by data mining: Application on Healthcare

  • Brahami, Menaouer;Atmani, Baghdad;Matta, Nada
    • Journal of Information Processing Systems / v.9 no.1 / pp.1-30 / 2013
  • The capitalization of know-how, knowledge management, and the control of the constantly growing mass of information have become the new strategic challenge for organizations that aim to capture their entire wealth of knowledge (tacit and explicit). Knowledge mapping is thus a means of (cognitive) navigation for accessing the resources of an organization's strategic knowledge heritage. In this paper, we present a new mapping approach based on the Boolean modeling of critical domain knowledge and on the use of different data sources via data mining techniques, in order to improve the process of acquiring knowledge explicitly. To evaluate our approach, we have initiated a mapping process guided by machine learning that operates in two stages: data mining and automatic mapping. Data mining is initially run from an induction of Boolean case studies (explicit). The mapping rules are then used to automatically improve the Boolean model of the mapping of critical knowledge.

The study of a full cycle semi-automated business process re-engineering: A comprehensive framework

  • Lee, Sanghwa;Sutrisnowati, Riska A.;Won, Seokrae;Woo, Jong Seong;Bae, Hyerim
    • Journal of the Korea Society of Computer and Information / v.23 no.11 / pp.103-109 / 2018
  • This paper presents an idea and a framework for automating full-cycle business process management and re-engineering by integrating traditional business process management systems, process mining, data mining, machine learning, and simulation. We build our framework on a cloud-based platform so that various data sources can be incorporated. We design our systems to be extensible so that they benefit not only practitioners of BPM but also researchers. Our framework can be used as a test bed by researchers without the complication of system integration. The automation of the redesign phase and of the selection of a baseline process model for deployment are the two main contributions of this study. In the redesign phase, we deal with both the analysis of the existing process model and what-if analysis of how to improve the process at the same time. Additionally, improving a business process can be applied on a case-by-case basis, which requires a lot of trial and error and a large amount of data. In selecting the baseline process model, we need to compare many probable routes of business execution and calculate the most efficient one with respect to production cost and execution time. We also discuss the challenges and limitations of the framework, including the system's adoptability, technical difficulties, and human factors.

Effective eCRM using prediction function of Data Mining (Data Mining의 예측기능을 이용한 효과적인 eCRM)

  • Kang Rae-Goo;Kim Seung-Eon;Jung Chai-Yeoung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2006.05a / pp.1039-1042 / 2006
  • As many corporations computerize their processes and rapidly introduce eCRM, the conventional methods that were mainly used in the past to systematically analyze customer information and to detect, analyze, and forecast customer patterns are gradually being replaced by data mining, which can easily derive and forecast good-quality results. eCRM is a representative field in which data mining is used. In this paper, we take the customer data and one year of sales data of discount store A, forecast each customer's contribution for the following year through data mining, and show how effective data mining is for eCRM by comparing the actual data with the predicted data.

Development of Data Mining System for Ship Design using Combined Genetic Programming with Self Organizing Map (유전적 프로그래밍과 SOM을 결합한 개선된 선박 설계용 데이터 마이닝 시스템 개발)

  • Lee, Kyung-Ho;Park, Jong-Hoon;Han, Young-Soo;Choi, Si-Young
    • Korean Journal of Computational Design and Engineering / v.14 no.6 / pp.382-389 / 2009
  • Recently, companies have come to need knowledge management as a tool for competitiveness. Companies have built Enterprise Resource Planning (ERP) systems in order to manage large amounts of knowledge, but it is not easy to formalize knowledge within an organization. We focus on a data mining system based on genetic programming (GP). Such a system can be a useful tool for deriving and extracting the necessary information and knowledge from large volumes of accumulated data. However, when there is not enough data to carry out the learning process of genetic programming, we have to either reduce the number of input parameters or increase the amount of learning or training data. In this study, we propose an enhanced data mining method that combines genetic programming with a self-organizing map to reduce the number of input parameters. Experimental results from a prototype implementation are also discussed.

A Study on the Data Fusion for Data Enrichment (데이터 보강을 위한 데이터 통합기법에 관한 연구)

  • 정성석;김순영;김현진
    • The Korean Journal of Applied Statistics / v.17 no.3 / pp.605-617 / 2004
  • One of the most important things in the data mining process is the quality of the data used. When mining is performed on high-quality data, the potential value of data mining increases. In this paper, we propose a data fusion technique for data enrichment, a phase that can improve data quality in the KDD process. We add the k-NN technique to the regression technique to improve the performance of fusion by reducing the loss of information. Simulations were performed to compare the proposed data fusion technique with the regression technique. The results show that the proposed technique yields a low MSE for continuous fusion variables.
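One way to combine regression with k-NN for data fusion is sketched below. The abstract does not specify how the two techniques are combined, so this is an assumed scheme rather than the authors' method: a line is fitted on the donor file, and each recipient's regression prediction is corrected by the mean residual of its k nearest donors. The function names, the single common variable, and the sample values are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fuse(donor_x, donor_y, recipient_x, k=2):
    """Estimate the fusion variable for recipients (assumed scheme, see lead-in).

    Each estimate is the regression prediction plus the mean residual of the
    k donors closest to the recipient on the common variable x.
    """
    slope, intercept = fit_line(donor_x, donor_y)
    residuals = [y - (slope * x + intercept) for x, y in zip(donor_x, donor_y)]
    fused = []
    for rx in recipient_x:
        nearest = sorted(range(len(donor_x)), key=lambda i: abs(donor_x[i] - rx))[:k]
        correction = sum(residuals[i] for i in nearest) / k
        fused.append(slope * rx + intercept + correction)
    return fused


def mse(predicted, actual):
    """Mean squared error between two equal-length sequences."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


if __name__ == "__main__":
    # Hypothetical donor file (x observed with y) and recipient file (x only).
    donor_x, donor_y = [1, 2, 3, 4], [3.2, 4.9, 7.1, 8.8]
    print(fuse(donor_x, donor_y, [2.5, 3.5]))
```

The `mse` helper mirrors the evaluation criterion named in the abstract: in a simulation where the true recipient values are known, it can be used to compare fused estimates against plain regression predictions.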