A Mixed Co-clustering Algorithm Based on Information Bottleneck

  • Liu, Yongli (School of Computer Science and Technology, Henan Polytechnic University)
  • Duan, Tianyi (School of Computer Science and Technology, Henan Polytechnic University)
  • Wan, Xing (School of Computer Science and Technology, Henan Polytechnic University)
  • Chao, Hao (School of Computer Science and Technology, Henan Polytechnic University)
  • Received: 2017.01.06
  • Accepted: 2017.03.04
  • Published: 2017.12.31

Abstract

Fuzzy co-clustering is sensitive to noisy data. To overcome this sensitivity, possibilistic clustering relaxes the constraints imposed in FCM-type fuzzy (co-)clustering. In this paper, we introduce a new possibilistic fuzzy co-clustering algorithm based on information bottleneck (ibPFCC). The algorithm combines fuzzy co-clustering with possibilistic clustering and formulates an objective function whose distance function uses information bottleneck theory to measure the distance between a feature data point and a feature cluster centroid. Experiments were conducted on three datasets and one artificial dataset. The results show that ibPFCC outperforms prominent fuzzy (co-)clustering algorithms such as FCM, FCCM, RFCC, and FCCI in terms of accuracy and robustness.
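
To make the idea of an information-bottleneck distance concrete, the following minimal Python sketch computes the merge cost used in the agglomerative information bottleneck literature: the combined prior mass multiplied by a prior-weighted Jensen-Shannon divergence between two conditional distributions. This is an illustration under stated assumptions, not the exact distance function of ibPFCC, which the abstract does not spell out. The function names (`kl_divergence`, `ib_merge_cost`) and the argument layout are hypothetical.

```python
import numpy as np


def kl_divergence(p, q):
    """KL(p || q) in nats; entries where p == 0 contribute nothing."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def ib_merge_cost(p_x, p_y_given_x, p_c, p_y_given_c):
    """Information lost by merging a point x into a centroid c.

    p_x, p_c         : positive prior masses of the point and the centroid (assumed inputs)
    p_y_given_x/_c   : conditional distributions over the same feature set
    Returns (p_x + p_c) * JS_pi(p(y|x), p(y|c)), with weights pi set by the priors.
    """
    p_y_given_x = np.asarray(p_y_given_x, dtype=float)
    p_y_given_c = np.asarray(p_y_given_c, dtype=float)
    total = p_x + p_c
    pi_x, pi_c = p_x / total, p_c / total
    # Mixture distribution that the merged cluster would have.
    mix = pi_x * p_y_given_x + pi_c * p_y_given_c
    js = pi_x * kl_divergence(p_y_given_x, mix) + pi_c * kl_divergence(p_y_given_c, mix)
    return total * js


if __name__ == "__main__":
    # A point close to the centroid costs less to merge than a distant one.
    near = ib_merge_cost(0.1, [0.6, 0.4], 0.9, [0.5, 0.5])
    far = ib_merge_cost(0.1, [1.0, 0.0], 0.9, [0.0, 1.0])
    print(near < far)
```

In a co-clustering setting, a cost of this kind could stand in for a Euclidean distance inside the objective function, so that points whose feature distributions resemble a centroid's are treated as close to it; how ibPFCC weights and combines it with the fuzzy and possibilistic memberships is defined in the full paper.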

Acknowledgement

Supported by: Natural Science Foundation of China
