Generation of Efficient Fuzzy Classification Rules Using Evolutionary Algorithm with Data Partition Evaluation

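  • 류정우 (Intelligent Robot Research Division, Electronics and Telecommunications Research Institute) ;
  • 김성은 (Information and Communication Research Institute, Future Systems Inc.) ;
  • 김명원 (School of Computing, Soongsil University Graduate School)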
  • Published : 2008.02.25

Abstract

Fuzzy rules are a useful and efficient way to express classification rules, especially when attribute values are continuous and inherently fuzzy. However, it is generally difficult to determine the membership functions needed to generate efficient fuzzy classification rules. In this paper, we propose a method for automatically generating efficient fuzzy classification rules using an evolutionary algorithm. The method generates a set of initial membership functions, following the class distribution, by supervised clustering of the training data set, and then evolves these membership functions to produce fuzzy classification rules that take both classification accuracy and rule comprehensibility into account. To reduce the time needed to evaluate an individual, we also propose an evolutionary algorithm with data partition evaluation, in which the training data set is partitioned into a number of subsets and each individual is evaluated on a randomly selected subset instead of the whole training data set. Experiments on the UCI benchmark data sets showed that, on average, our method was more efficient than existing algorithms. For the evolutionary algorithm with data partition evaluation, experiments on the KDD'99 Cup intrusion detection data confirmed that individual evaluation time was reduced by about 70%. Compared with the KDD'99 Cup winner, accuracy increased by 1.54% and detection cost was reduced by 20.8%.
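
The data partition evaluation idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. As stated assumptions, an individual here is a single numeric threshold for a toy one-dimensional classifier rather than a set of fuzzy membership functions, fitness is plain accuracy, a fresh random subset is drawn for each individual evaluation, and the final individual is chosen on the whole training set. All function names (`make_toy_data`, `partition`, `accuracy`, `evolve`) and parameter values are hypothetical.

```python
import random


def make_toy_data(n, seed):
    """Hypothetical toy data: x in [0, 1), labelled True when x >= 0.6."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    return [(x, x >= 0.6) for x in xs]


def partition(data, k, rng):
    """Shuffle the training data and split it into k roughly equal subsets."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def accuracy(threshold, samples):
    """Toy fitness: fraction of samples classified correctly by x >= threshold."""
    correct = sum((x >= threshold) == label for x, label in samples)
    return correct / len(samples)


def evolve(data, k=5, pop_size=20, generations=30, seed=0):
    """Evolve a threshold, scoring each individual on one random subset."""
    rng = random.Random(seed)
    subsets = partition(data, k, rng)
    population = [rng.uniform(0.0, 1.0) for _ in range(pop_size)]
    for _ in range(generations):
        # Data partition evaluation: each individual sees only one randomly
        # selected subset, so one evaluation costs about 1/k of a full pass.
        scored = [(accuracy(ind, rng.choice(subsets)), ind) for ind in population]
        scored.sort(reverse=True)
        parents = [ind for _, ind in scored[: pop_size // 2]]
        # Gaussian mutation of the surviving half produces the children.
        children = [p + rng.gauss(0.0, 0.05) for p in parents]
        population = parents + children
    # Assumption of this sketch: the final pick uses the whole training set.
    return max(population, key=lambda ind: accuracy(ind, data))
```

Because each evaluation touches only one subset, the fitness values are noisy estimates of accuracy on the full data. The sketch relies on repeated re-evaluation across generations to average that noise out, which is one plausible reading of the abstract rather than a confirmed detail of the paper.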
