• Title/Summary/Keyword: Fuzzy Mining

Fuzzy category based transaction analysis for web usage mining (웹 사용 마이닝을 위한 퍼지 카테고리 기반의 트랜잭션 분석 기법)

  • 이시헌;이지형
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2004.04a / pp.341-344 / 2004
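  • Web usage mining is a research field that discovers meaningful information in web log files and other web usage data. The web log files commonly used in web usage mining are simple lists of the pages that users visited. Methods that rely on web log files alone are therefore limited in how well their analysis can reflect the content of the pages that users visited. To address this, this paper presents a new fuzzy-category-based web usage mining technique that considers the content (items) contained in web pages rather than the pages themselves. It also presents a method for tracking how a user's interests change over time, in order to understand the user better.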

Intelligent Distributed Platform using Mobile Agent based on Dynamic Group Binding (동적 그룹 바인딩 기반의 모바일 에이전트를 이용한 인텔리전트 분산 플랫폼)

  • Mateo, Romeo Mark A.;Lee, Jae-Wan
    • Journal of Internet Computing and Services / v.8 no.3 / pp.131-143 / 2007
  • Current trends in information technology and intelligent systems use data mining techniques to discover patterns and extract rules from distributed databases. In a distributed environment, the rules extracted by data mining can be used in dynamic replication, adaptive load balancing, and other schemes. However, transmitting large volumes of data through the system can cause errors and unreliable results. This paper proposes an intelligent distributed platform based on dynamic group binding using mobile agents, which addresses the use of intelligence in a distributed environment. The proposed grouping service implements a classification scheme for objects. A data miner agent and a data compressor agent extract rules and compress data, respectively, from the service node databases. The proposed algorithm performs a preprocessing step in which it merges the less frequent datasets using a neuro-fuzzy classifier before sending the data. Object group classification, data mining of the service node database, the data compression method, and rule extraction were simulated. Experimental results on data compression efficiency and rule extraction reliability show that the proposed algorithm performs better than the other methods compared.

CADICA: Diagnosis of Coronary Artery Disease Using the Imperialist Competitive Algorithm

  • Mahmoodabadi, Zahra;Abadeh, Mohammad Saniee
    • Journal of Computing Science and Engineering / v.8 no.2 / pp.87-93 / 2014
  • Coronary artery disease (CAD) is a prevalent disease from which many people suffer. Early detection and treatment could reduce the risk of heart attack. Currently, the gold standard for the diagnosis of CAD is angiography, which is an invasive procedure. In this article, we propose an algorithm that uses data mining techniques, a fuzzy expert system, and the imperialist competitive algorithm (ICA) to diagnose CAD by a non-invasive procedure. The ICA is used to adjust the fuzzy membership functions. The proposed method has been evaluated with the Cleveland and Hungarian datasets. The advantage of this method, compared with others, is its interpretability. The accuracy of the proposed method is 94.92% with 11 rules of average length 4. To compare the ICA (also called the colonial competitive algorithm) with other metaheuristic algorithms, the proposed method has also been implemented with the particle swarm optimization (PSO) algorithm. The results indicate that the ICA is more efficient than the PSO algorithm.
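
To make "adjusting the fuzzy membership functions" concrete, here is a minimal Python sketch. It assumes triangular membership functions, which the abstract does not specify; the function name `triangular` and all parameter values are hypothetical. An optimizer such as the ICA would search over the triple (a, b, c) for each fuzzy set.

```python
def triangular(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set.

    a and c are the feet, b is the peak; assumes a < b < c.
    """
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)   # rising edge
    return (c - x) / (c - b)       # falling edge


# Hypothetical parameters for a "high" fuzzy set on some clinical measurement.
before_tuning = (120.0, 160.0, 200.0)
after_tuning = (130.0, 170.0, 200.0)   # a candidate an optimizer might propose

x = 140.0
print(triangular(x, *before_tuning))   # 0.5
print(triangular(x, *after_tuning))    # 0.25
```

Shifting the feet and the peak changes how strongly the same measurement fires a rule, which is the quantity a metaheuristic would tune against classification accuracy.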

Two-Phased Fuzzy Partitions with Fuzzy Equalization (퍼지 균등화조건을 갖는 2단 퍼지분할)

  • Kyeongtaek Kim;Chongsu Kim
    • Journal of Korean Society of Industrial and Systems Engineering / v.25 no.6 / pp.54-58 / 2002
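  • Fuzzy equalization is a condition that ensures linguistic labels are both semantically and experimentally meaningful. The algorithms published so far for generating fuzzy partitions that satisfy the fuzzy equalization condition could produce only one fuzzy partition for a given data set. If the generated partition turns out to be no longer useful, such an algorithm cannot generate another fuzzy partition that satisfies the condition for the same data. In data mining, where the generated fuzzy partition is used for exploratory discovery, this means the process cannot proceed any further. This study proves what relationship holds between two different fuzzy partitions that satisfy the fuzzy equalization condition for the same data, if such partitions exist, and describes it as a positional property. Using this property, it presents an algorithm that can generate as many fuzzy partitions satisfying the fuzzy equalization condition as desired, and illustrates it with an example.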
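
The abstract does not state the equalization condition itself. The Python sketch below assumes the common formulation in which each of the c fuzzy sets in a partition has the same probability, 1/c, as a fuzzy event, estimated here as the mean membership over the data. The function names, the two-set partition, and the sample data are all hypothetical.

```python
def event_probabilities(data, partition):
    """Estimate the probability of each fuzzy set as its mean membership over the data."""
    n = len(data)
    return [sum(mu(x) for x in data) / n for mu in partition]


def is_equalized(data, partition, tol=1e-9):
    """Check the assumed condition: every fuzzy set has probability 1/c."""
    target = 1.0 / len(partition)
    return all(abs(p - target) <= tol for p in event_probabilities(data, partition))


# Hypothetical two-set partition of the unit interval: "low" and "high".
low = lambda x: 1.0 - x
high = lambda x: x

data = [0.0, 0.25, 0.5, 0.75, 1.0]
print(event_probabilities(data, [low, high]))   # [0.5, 0.5]
print(is_equalized(data, [low, high]))          # True
```

Skewed data would break the balance for this fixed partition, which is why an algorithm has to place the fuzzy sets according to the data.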

Mining Generalized Fuzzy Quantitative Association Rules with Fuzzy Generalization Hierarchies (퍼지 일반화 계층을 이용한 일반화된 퍼지 정량 연관규칙 마이닝)

  • 한상훈;손봉기;이건명
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2001.05a / pp.8-11 / 2001
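  • Association rule mining is a field of data mining that discovers latent dependencies among the items that make up transaction data. A quantitative association rule is an association rule that involves both categorical and quantitative attributes. Earlier research has covered the application of fuzzy techniques to quantitative association rule mining, generalized association rule mining for quantitative association rules, and association rule mining with importance weights that reflect user interest. This paper proposes a new method for mining generalized fuzzy quantitative association rules with importance weights. The method uses fuzzy concept hierarchies for categorical attributes and fuzzy linguistic-term generalization hierarchies for quantitative attributes to extract generalized rules. By using level-wise generalization hierarchies and importance weights of attributes, it lets users mine more flexible association rules.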

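
As a rough illustration of how support and confidence carry over to fuzzy items, the Python sketch below uses the minimum as the combination operator and ignores the importance weights and generalization hierarchies that the paper adds. These are simplifying assumptions, not the authors' method; the item names and membership degrees are invented.

```python
def fuzzy_support(transactions, itemset):
    """Average, over all transactions, of the smallest membership among the items."""
    total = sum(min(t.get(item, 0.0) for item in itemset) for t in transactions)
    return total / len(transactions)


def fuzzy_confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    sup_a = fuzzy_support(transactions, antecedent)
    if sup_a == 0.0:
        return 0.0
    return fuzzy_support(transactions, antecedent + consequent) / sup_a


# Each transaction maps a fuzzy item ("attribute=linguistic term") to a membership degree.
transactions = [
    {"age=young": 0.8, "income=high": 0.6},
    {"age=young": 0.4, "income=high": 0.9},
    {"age=young": 0.0, "income=high": 0.5},
]

print(fuzzy_support(transactions, ["age=young", "income=high"]))
print(fuzzy_confidence(transactions, ["age=young"], ["income=high"]))
```

With these numbers the rule "age=young implies income=high" has support of about 0.33 and confidence of about 0.83.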

A Comparative Study of Estimation by Analogy using Data Mining Techniques

  • Nagpal, Geeta;Uddin, Moin;Kaur, Arvinder
    • Journal of Information Processing Systems / v.8 no.4 / pp.621-652 / 2012
  • Software estimation provides a set of guidelines that help software developers, project managers, and management produce more realistic estimates from deficient, uncertain, and noisy data. A range of estimation models is being explored in industry and in academia, but choosing the best model is difficult. Estimation by Analogy (EbA) is a form of case-based reasoning that uses fuzzy logic, grey system theory, or machine-learning techniques for optimization. This research compares the estimation accuracy of several conventional data mining models with that of a hybrid model. The models considered include linear regression models such as ordinary least squares and ridge regression, and nonlinear models such as neural networks, support vector machines, and multivariate adaptive regression splines. A precise and comprehensible predictive model that integrates grey relational analysis (GRA) with regression is introduced and compared. Empirical results show that regression combined with GRA gives outstanding results, indicating that the methodology has great potential as a candidate approach for software effort estimation.

Visualizing Fuzzy Set Based on Venn Diagram (벤 다이어그램 기반 퍼지 집합 시각화)

  • Park, Ye-Seul;Park, Jin-Ah
    • 한국HCI학회:학술대회논문집 / 2009.02a / pp.15-20 / 2009
  • The large volumes of data handled by fuzzy information systems call for analysis through fuzzy set visualization. This study therefore proposes a way to visualize fuzzy data sets using a variation of the Venn diagram. For fuzzy data that relate to many topics with ranked degrees of relation, the method gives users the results they want by visualizing intersections, unions, and complements. That is, it visualizes the set of fuzzy data related to several topics at once, the set of all fuzzy data related to any of the topics, or the set of fuzzy data not related to a topic. Users control these sets by overlapping or stacking them in a user-oriented Venn diagram view. One distinct advantage of this visualization is that it delivers the web documents that search engine users and web developers want much more quickly. Furthermore, it can be extended to other purposes in information retrieval.

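
The three set operations named in the abstract can be sketched in Python with the standard min, max, and one-minus definitions of fuzzy intersection, union, and complement. The abstract does not say which definitions the authors use, so this choice is an assumption, and the document names and membership degrees are invented.

```python
def fuzzy_intersection(a, b):
    """Documents related to both topics: pointwise minimum."""
    return {d: min(a.get(d, 0.0), b.get(d, 0.0)) for d in a.keys() | b.keys()}


def fuzzy_union(a, b):
    """Documents related to either topic: pointwise maximum."""
    return {d: max(a.get(d, 0.0), b.get(d, 0.0)) for d in a.keys() | b.keys()}


def fuzzy_complement(a, universe):
    """Documents not related to the topic: one minus the membership degree."""
    return {d: 1.0 - a.get(d, 0.0) for d in universe}


# Membership degree of each document in two hypothetical topics.
topic_a = {"doc1": 0.75, "doc2": 0.25}
topic_b = {"doc2": 0.5, "doc3": 1.0}
universe = {"doc1", "doc2", "doc3"}

print(fuzzy_intersection(topic_a, topic_b))
print(fuzzy_union(topic_a, topic_b))
print(fuzzy_complement(topic_a, universe))
```

A Venn-style view could then map each membership degree to opacity or position inside the overlapping regions.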

Case Study of CRM Application Using Improvement Method of Fuzzy Decision Tree Analysis (퍼지의사결정나무 개선방법을 이용한 CRM 적용 사례)

  • Yang, Seung-Jeong;Rhee, Jong-Tae
    • The Journal of the Korea Contents Association / v.7 no.8 / pp.13-20 / 2007
  • The decision tree is one of the most useful analysis methods for data mining tasks such as prediction and classification on massive data. A decision tree grows by splitting nodes, and each split increases purity. Splitting should stop when purity no longer increases effectively or when the new leaves do not contain a meaningful number of records. A branch is pruned if it does not reach a certain level of performance. Pruning changes the structure of the tree and implies that the earlier split of the parent node was not effective. It also implies that the splits of the ancestor nodes were not effective and that the attributes and criteria chosen for them were not successful. New attributes or criteria might therefore be selected to split such nodes again with better results. In this paper, we suggest a procedure that modifies the decision tree by combining fuzzy theory and splitting in an integrated approach.
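
The two stopping conditions described above (too little purity gain, too few records in a leaf) can be sketched in Python as follows. The sketch uses crisp Gini impurity rather than the paper's fuzzy procedure, and the thresholds `min_gain` and `min_leaf` are hypothetical defaults.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a node: 0.0 means the node is pure."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gain(parent, left, right):
    """Impurity decrease achieved by splitting parent into left and right."""
    n = len(parent)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(parent) - weighted


def should_split(parent, left, right, min_gain=0.01, min_leaf=2):
    """Reject a split if a leaf is too small or the purity gain is too low."""
    if len(left) < min_leaf or len(right) < min_leaf:
        return False
    return split_gain(parent, left, right) >= min_gain


parent = ["a", "a", "b", "b"]
print(split_gain(parent, ["a", "a"], ["b", "b"]))        # 0.5
print(should_split(parent, ["a", "a"], ["b", "b"]))      # True
print(should_split(parent, ["a"], ["a", "b", "b"]))      # False
```

The last call is rejected because one leaf would hold a single record, which is the "meaningful number of records" check from the abstract.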

Slope stability prediction using ANFIS models optimized with metaheuristic science

  • Gu, Yu-tian;Xu, Yong-xuan;Moayedi, Hossein;Zhao, Jian-wei;Le, Binh Nguyen
    • Geomechanics and Engineering / v.31 no.4 / pp.339-352 / 2022
  • Studying slope stability is an important branch of civil engineering. For this purpose, engineers have employed machine learning models because of their high efficiency in complex calculations. This paper examines the robustness of several novel optimization schemes, namely the equilibrium optimizer (EO), Harris hawks optimization (HHO), water cycle algorithm (WCA), biogeography-based optimization (BBO), dragonfly algorithm (DA), grey wolf optimization (GWO), and teaching learning-based optimization (TLBO), for enhancing the performance of the adaptive neuro-fuzzy inference system (ANFIS) in slope stability prediction. The hybrid models estimate the factor of safety (FS) of a cohesive soil-footing system. The role of these algorithms is to find the optimal parameters of the membership functions in the fuzzy system. The best population sizes are selected by examining the convergence behavior of the proposed hybrids, and the corresponding results are compared with those of the typical ANFIS. Accuracy assessments via root mean square error, mean absolute error, mean absolute percentage error, and the Pearson correlation coefficient showed that all models can reliably capture and reproduce the FS behavior. Moreover, applying the WCA, EO, GWO, and TLBO reduced both the learning and prediction errors of the ANFIS. An efficiency comparison showed WCA-ANFIS to be the most accurate hybrid, while GWO-ANFIS was the fastest of the promising models. Overall, the findings confirm the suitability of improved intelligent models for practical slope stability evaluation.
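
The four accuracy measures named in the abstract have standard definitions, sketched below in plain Python. The function names and sample values are illustrative only and are not taken from the paper; `mape` assumes no observed value is zero, and `pearson` assumes neither series is constant.

```python
import math


def rmse(observed, predicted):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed values must be nonzero)."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / len(observed)


def pearson(observed, predicted):
    """Pearson correlation coefficient (neither series may be constant)."""
    mo = sum(observed) / len(observed)
    mp = sum(predicted) / len(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Illustrative factor-of-safety values: observed versus model output.
observed = [2.0, 4.0]
predicted = [1.0, 5.0]
print(rmse(observed, predicted), mae(observed, predicted), mape(observed, predicted))
print(pearson(observed, predicted))
```

Lower RMSE, MAE, and MAPE and a Pearson coefficient closer to 1 indicate a better fit, which is how hybrid models such as those in the abstract are typically ranked.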