• Title/Summary/Keyword: Fuzzy data mining

Emerging Data Management Tools and Their Implications for Decision Support

  • Eorm, Sean B.;Novikova, Elena;Yoo, Sangjin
    • Journal of Korea Society of Industrial Information Systems / v.2 no.2 / pp.189-207 / 1997
  • Recently, we have witnessed a host of emerging tools in the management support systems (MSS) area, including data warehouses and multidimensional databases (MDDB), data mining, on-line analytical processing (OLAP), intelligent agents, World Wide Web (WWW) technologies, the Internet, and corporate intranets. These tools are reshaping MSS development in organizations. This article reviews a set of emerging data management technologies in the knowledge discovery in databases (KDD) process and analyzes their implications for decision support. Furthermore, today's MSS are equipped with a plethora of AI techniques (artificial neural networks, genetic algorithms, etc.), fuzzy sets, modeling by example, geographical information systems (GIS), logic modeling, and visual interactive modeling (VIM). All these developments suggest that we are shifting the corporate decision-making paradigm from information-driven decision making in the 1980s to knowledge-driven decision making in the 1990s.

Improvement of SOM using Stratification

  • Jun, Sung-Hae
    • International Journal of Fuzzy Logic and Intelligent Systems / v.9 no.1 / pp.36-41 / 2009
  • The self-organizing map (SOM) is an unsupervised method based on competitive learning. Many clustering studies have used SOM, which also visualizes the data according to its result. The visualized result has been used in the decision process of descriptive data mining as a form of exploratory data analysis. In this paper, we propose an improvement of SOM using stratified sampling from statistics. The stratification improves the performance of SOM. To verify the improvement, we perform comparative experiments using data sets from the UCI Machine Learning Repository and simulation data.
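
The abstract does not say how the strata are formed or how the sample feeds the SOM, so the following is only a minimal sketch of stratified sampling itself, assuming proportional allocation. The names `stratified_sample`, `stratum_of`, and `fraction`, as well as the toy data, are hypothetical and not taken from the paper.

```python
import random
from collections import defaultdict

def stratified_sample(records, stratum_of, fraction, seed=0):
    """Draw roughly `fraction` of the records from every stratum."""
    rng = random.Random(seed)
    # Group the records by their stratum key.
    strata = defaultdict(list)
    for rec in records:
        strata[stratum_of(rec)].append(rec)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        # Proportional allocation, keeping at least one record per stratum.
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical data: (feature, stratum label) pairs, 80 in "a" and 20 in "b".
data = [(i, "a") for i in range(80)] + [(i, "b") for i in range(20)]
subset = stratified_sample(data, stratum_of=lambda r: r[1], fraction=0.1)
```

Because each stratum is sampled separately, a small stratum such as "b" keeps its share of the subset instead of being left to chance, which is the usual motivation for stratifying training data.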

Case Study of CRM Application Using Improvement Method of Fuzzy Decision Tree Analysis (퍼지의사결정나무 개선방법을 이용한 CRM 적용 사례)

  • Yang, Seung-Jeong;Rhee, Jong-Tae
    • The Journal of the Korea Contents Association / v.7 no.8 / pp.13-20 / 2007
  • The decision tree is one of the most useful analysis methods for various data mining functions, including prediction and classification, on massive data. A decision tree grows by splitting nodes, and purity increases with each split. Splitting must stop when the purity no longer increases effectively or when new leaves do not contain a meaningful number of records. A branch is pruned if it does not reach a certain level of performance. Pruning changes the structure of the decision tree and implies that the previous split of the parent node was not effective. It also implies that the splits of the ancestor nodes were not effective and that the attributes and criteria chosen for those splits were not successful. New attributes or criteria might therefore be selected to split such nodes again. In this paper, we suggest a procedure that modifies the decision tree by integrating fuzzy theory with splitting.
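
The two stopping conditions named in the abstract (too little purity gain, too few records in a new leaf) can be illustrated with a short sketch. It assumes Gini impurity as the purity measure, which the abstract does not specify; the thresholds `min_gain` and `min_leaf` are hypothetical values, and the paper's fuzzy modification is not reproduced here.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def purity_gain(parent, left, right):
    """Impurity reduction achieved by splitting `parent` into two children."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted

def should_split(parent, left, right, min_gain=0.01, min_leaf=5):
    """Refuse the split if a leaf is too small or the gain is too low."""
    if len(left) < min_leaf or len(right) < min_leaf:
        return False
    return purity_gain(parent, left, right) >= min_gain

parent = ["yes"] * 10 + ["no"] * 10
good = should_split(parent, ["yes"] * 10, ["no"] * 10)  # clean separation
tiny = should_split(parent, ["yes"] * 2, parent[2:])    # left leaf has 2 records
```

Here `good` is accepted because the split removes all impurity, while `tiny` is rejected on leaf size alone, before the gain is even computed.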

Similarity Pattern Analysis of Web Log Data using Multidimensional FCM (다차원 FCM을 이용한 웹 로그 데이터의 유사 패턴 분석)

  • 김미라;조동섭
    • Proceedings of the Korean Information Science Society Conference / 2002.10d / pp.190-192 / 2002
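  • Data mining is the process of discovering diverse, valuable information in large volumes of stored data by means of statistical and mathematical analysis. Data clustering is one important technique for data mining. In this paper, we study a method for clustering web log data, which records the behavior of web users, using the Fuzzy C-Means algorithm. The Fuzzy C-Means clustering algorithm optimizes an objective function based on a similarity measure that takes into account the distance between each data point and each cluster center. Among the fields of the web log data, the user IP, time, and web page fields are processed into WLDF (Web Log Data for FCM), and multidimensional Fuzzy C-Means clustering is then performed. We then use the result to analyze similar patterns between sample data and arbitrary data.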

Cluster Merging Using Enhanced Density based Fuzzy C-Means Clustering Algorithm (개선된 밀도 기반의 퍼지 C-Means 알고리즘을 이용한 클러스터 합병)

  • Han, Jin-Woo;Jun, Sung-Hae;Oh, Kyung-Whan
    • Journal of the Korean Institute of Intelligent Systems / v.14 no.5 / pp.517-524 / 2004
  • Fuzzy set theory has been widely used for clustering in machine learning and data mining since fuzzy theory was introduced in the 1960s. In particular, the fuzzy C-means algorithm remains a popular fuzzy clustering algorithm. With fuzzy C-means, each element is assigned to every cluster with a membership value. Like general clustering algorithms such as K-means, this algorithm is affected by the location of the initial cluster centers and by the choice of cluster size. This initial setup is subjective, so improper results may be obtained depending on the circumstances. In this paper, we propose cluster merging using an enhanced density-based fuzzy C-means clustering algorithm to solve this problem. Our algorithm determines the initial cluster size and centers from the properties of the training data, using a grid to decide them. For the experiments, objective machine learning data sets are used to compare the performance of our algorithm with that of others.
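
For orientation, the membership assignment and the sensitivity to initial centers described above can be seen in a minimal sketch of the standard fuzzy C-means iteration. This is the textbook algorithm with random initial centers, not the enhanced density-based variant proposed in the paper; the function name `fcm`, its parameters, and the toy points are illustrative assumptions.

```python
import math
import random

def fcm(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means on a list of equal-length tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, c)  # random initial centers (the subjective step)
    dim = len(points[0])
    u = []
    for _ in range(iters):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2 / (m - 1)).
        u = []
        for p in points:
            # Floor the distance to avoid dividing by zero at a center.
            d = [max(math.dist(p, ctr), 1e-12) for ctr in centers]
            u.append([
                1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0)) for j in range(c))
                for i in range(c)
            ])
        # Center update: mean of all points weighted by u_ik^m.
        new_centers = []
        for i in range(c):
            w = [u[k][i] ** m for k in range(len(points))]
            total = sum(w)
            new_centers.append(tuple(
                sum(w[k] * points[k][a] for k in range(len(points))) / total
                for a in range(dim)
            ))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    # `u` holds the memberships computed in the last iteration.
    return centers, u

# Hypothetical toy data: two well-separated groups of three points.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
centers, memberships = fcm(points, c=2)
```

Each row of `memberships` sums to one, so a point belongs to every cluster to some degree. Changing `seed` changes the starting centers, which is the dependence that the grid-based initialization in the paper is meant to remove.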

Intelligent Methods to Extract Knowledge from Process Data in the Industrial Applications

  • Woo, Young-Kwang;Bae, Hyeon;Kim, Sung-Shin;Woo, Kwang-Bang
    • International Journal of Fuzzy Logic and Intelligent Systems / v.3 no.2 / pp.194-199 / 2003
  • Data are expressions, in language or numerical values, that show some features. Information is extracted from data for specific purposes. Knowledge is information used to construct rules that recognize patterns or make decisions. Today, knowledge extraction and its application are widely carried out in several industrial fields to make systems easier to understand and to improve their performance. Knowledge extraction proceeds through steps that include knowledge acquisition, expression, and implementation. The extracted knowledge is expressed as rules derived with data mining techniques. Clustering (CL), input space partition (ISP), neuro-fuzzy (NF), neural network (NN), extension matrix (EM), and other methods are employed for rule-based knowledge expression. In this paper, the various approaches to knowledge extraction are surveyed and categorized by methodology and by applied industrial field. The trends and examples of each approach are also shown in tables and graphs using categories such as CL, ISP, NF, NN, and EM.

Improved TI-FCM Clustering Algorithm in Big Data (빅데이터에서 개선된 TI-FCM 클러스터링 알고리즘)

  • Lee, Kwang-Kyug
    • Journal of IKEEE / v.23 no.2 / pp.419-424 / 2019
  • The FCM algorithm finds the optimal solution through an iterative optimization technique. In particular, its execution time varies with the initial cluster centers, the location of noise, and the location and number of dense regions. However, this method updates the center points gradually, and the initial cluster centers can be shifted to one side. In this paper, we propose a TI-FCM (Triangular Inequality-Fuzzy C-Means) clustering algorithm that determines the cluster center density by maximizing the distance between clusters using the triangle inequality. The proposed method converges to the real clusters more effectively than FCM, even on large data sets. Experiments show that execution time is reduced compared with existing FCM.

CADICA: Diagnosis of Coronary Artery Disease Using the Imperialist Competitive Algorithm

  • Mahmoodabadi, Zahra;Abadeh, Mohammad Saniee
    • Journal of Computing Science and Engineering / v.8 no.2 / pp.87-93 / 2014
  • Coronary artery disease (CAD) is currently a prevalent disease from which many people suffer. Early detection and treatment could reduce the risk of heart attack. Currently, the gold standard for the diagnosis of CAD is angiography, which is an invasive procedure. In this article, we propose an algorithm that uses data mining techniques, a fuzzy expert system, and the imperialist competitive algorithm (ICA) to diagnose CAD by a non-invasive procedure. The ICA is used to adjust the fuzzy membership functions. The proposed method has been evaluated with the Cleveland and Hungarian datasets. The advantage of this method, compared with others, is its interpretability. The proposed method achieves an accuracy of 94.92% with 11 rules and an average rule length of 4. To compare the ICA with other metaheuristic algorithms, the proposed method has also been implemented with the particle swarm optimization (PSO) algorithm. The results indicate that the ICA is more efficient than the PSO algorithm.
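
The abstract does not state the shape of the membership functions that the ICA adjusts. Assuming triangular ones purely for illustration, the quantities a metaheuristic would tune are the three breakpoints `a`, `b`, and `c` below; the function name and the numeric values are hypothetical and carry no clinical meaning.

```python
def triangular(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# The same input receives a different degree of membership once the
# breakpoints are moved, which is what parameter tuning exploits.
before = triangular(4.0, 0.0, 5.0, 10.0)  # peak at 5
after = triangular(4.0, 0.0, 4.0, 10.0)   # peak moved to 4
```

Keeping each fuzzy set to a few interpretable breakpoints is one reason rule bases of this kind stay readable, in line with the interpretability the abstract emphasizes.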

Two-Phased Fuzzy Partitions with Fuzzy Equalization (퍼지 균등화조건을 갖는 2단 퍼지분할)

  • Kyeongtaek Kim;Chongsu Kim
    • Journal of Korean Society of Industrial and Systems Engineering / v.25 no.6 / pp.54-58 / 2002
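  • Fuzzy equalization is a condition that allows linguistic labels to be attached that are both semantically and experimentally meaningful. The algorithms published so far for generating fuzzy partitions that satisfy the fuzzy equalization condition could produce only one fuzzy partition for a given data set. If the generated fuzzy partition turns out to be no longer useful, such an algorithm cannot generate another fuzzy partition that satisfies the fuzzy equalization condition for the same data. In data mining, where exploratory discovery is carried out using the generated fuzzy partition, this means that the process cannot proceed any further. In this study, we prove what relationship holds between two different fuzzy partitions that satisfy the fuzzy equalization condition for given data, if such partitions exist, and describe it as a positional property. We also present an algorithm that uses this property to generate as many fuzzy partitions satisfying the fuzzy equalization condition as desired, and illustrate it with an example.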