http://dx.doi.org/10.9717/kmms.2018.21.8.952

Performance Improvement of Deep Clustering Networks for Multi Dimensional Data  

Lee, Hyunjin (Division of ICT Engineering, Korea Soongsil Cyber University)
Abstract
Clustering is one of the most fundamental techniques in machine learning. Its performance depends on the distribution of the data and degrades as the amount of data or the number of dimensions grows. For this reason, we use a stacked auto encoder, a deep learning algorithm, to reduce the dimensionality of the data and generate a feature vector that best represents the input data. We use k-means, a widely used algorithm, for clustering. Since the reduced feature vectors are still multi-dimensional, we use the cosine similarity in addition to the Euclidean distance, treating the cluster center and each data point as vectors when computing their similarity, to improve performance. A deep clustering network that combines a stacked auto encoder with k-means re-trains the network whenever the k-means result changes. During re-training, the loss function of the stacked auto encoder and the loss function of k-means are combined to improve the performance and stability of the network. Experiments on benchmark image and document datasets empirically validated the effectiveness of the proposed algorithm.
Keywords
Deep Learning; Clustering; Auto Encoder; K-means; Dimension Reduction; Deep Clustering Networks;
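
The abstract does not state how the Euclidean distance and the cosine similarity are combined, or how the two loss functions are weighted. The following is a minimal sketch under assumptions, not the author's implementation: the mixing weight `alpha`, the loss weight `lam`, and all function names are hypothetical, and `z` stands in for the feature vectors produced by the stacked auto encoder's encoder.

```python
import numpy as np

def combined_distance(z, centers, alpha=0.5):
    """Blend squared Euclidean distance with cosine distance (1 - cosine similarity).

    alpha is a hypothetical mixing weight; the abstract does not specify one.
    Returns an array of shape (n_samples, n_clusters).
    """
    # Pairwise squared Euclidean distances between features and cluster centers.
    sq_euclid = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Cosine similarity between unit-normalized features and centers.
    z_unit = z / (np.linalg.norm(z, axis=1, keepdims=True) + 1e-12)
    c_unit = centers / (np.linalg.norm(centers, axis=1, keepdims=True) + 1e-12)
    cos_dist = 1.0 - z_unit @ c_unit.T
    return alpha * sq_euclid + (1.0 - alpha) * cos_dist

def assign_clusters(z, centers, alpha=0.5):
    """k-means style assignment step: pick the nearest center under the blended distance."""
    return combined_distance(z, centers, alpha).argmin(axis=1)

def joint_loss(x, x_rec, z, centers, labels, lam=0.1, alpha=0.5):
    """Reconstruction loss of the auto encoder plus a weighted clustering loss.

    lam is a hypothetical trade-off weight between the two terms.
    """
    recon = np.mean(np.sum((x - x_rec) ** 2, axis=1))
    dist = combined_distance(z, centers, alpha)
    cluster = np.mean(dist[np.arange(len(z)), labels])
    return recon + lam * cluster

# Toy usage: three 2-D feature vectors and two cluster centers.
z = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]])
centers = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = assign_clusters(z, centers)
loss = joint_loss(z, z, z, centers, labels)  # perfect reconstruction, so only the clustering term remains
```

In a full training loop, the assignment step would be repeated after each update of the encoder, and re-training would be triggered when `labels` changes, as the abstract describes.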