http://dx.doi.org/10.9717/kmms.2020.23.7.883

Variational Expectation-Maximization Algorithm in Posterior Distribution of a Latent Dirichlet Allocation Model for Research Topic Analysis  

Kim, Jong Nam (Dept. of IT Convergence & Application Eng. Pukyong National University)
Abstract
In this paper, we propose a variational expectation-maximization algorithm that computes posterior probabilities from a Latent Dirichlet Allocation (LDA) model. The algorithm approximates the intractable posterior distribution for a document-term matrix generated from a corpus of 50 papers. It does so by searching for local optima of a lower bound on the log-likelihood. Maximizing this lower bound is equivalent to minimizing the relative entropy, known as the KL divergence, between the approximating distribution and the true posterior. The experimental results indicate that documents clustered under image classification and segmentation have a correlation of 0.79, while those clustered under object detection and image segmentation are highly correlated at 0.96. The proposed variational inference algorithm runs efficiently and is faster than Gibbs sampling, with a computation time of 0.029 s.
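The variational step summarized in the abstract can be illustrated with a minimal sketch of the standard mean-field E-step for a single LDA document. This is not the paper's implementation: the function names `digamma` and `e_step`, the toy vocabulary, and all parameter values are assumptions chosen for illustration. Each word's topic responsibilities `phi` are updated from the topic-word probabilities `beta` and the current Dirichlet parameters `gamma`, and `gamma` is then recomputed from `phi`.

```python
import math


def digamma(x):
    """Digamma function for x > 0, via recurrence plus an asymptotic series."""
    result = 0.0
    # Shift x upward until the asymptotic series is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def e_step(word_ids, alpha, beta, n_iter=50):
    """Mean-field E-step for one document (illustrative sketch).

    word_ids: vocabulary index of each word in the document
    alpha:    list of K Dirichlet hyperparameters
    beta:     K x V nested list of strictly positive topic-word probabilities
    Returns (gamma, phi): variational Dirichlet and per-word topic weights.
    """
    K, N = len(alpha), len(word_ids)
    gamma = [a + N / K for a in alpha]
    phi = [[1.0 / K] * K for _ in word_ids]
    for _ in range(n_iter):
        exp_psi = [math.exp(digamma(g)) for g in gamma]
        for n, w in enumerate(word_ids):
            # phi[n][k] is proportional to beta[k][w] * exp(digamma(gamma[k])).
            weights = [beta[k][w] * exp_psi[k] for k in range(K)]
            total = sum(weights)
            phi[n] = [wt / total for wt in weights]
        # gamma[k] = alpha[k] + expected number of words assigned to topic k.
        gamma = [alpha[k] + sum(phi[n][k] for n in range(N)) for k in range(K)]
    return gamma, phi


# Toy example (assumed values): 2 topics, 4 vocabulary words, 4-word document.
toy_beta = [[0.45, 0.45, 0.05, 0.05],
            [0.05, 0.05, 0.45, 0.45]]
toy_gamma, toy_phi = e_step([0, 1, 0, 2], [0.1, 0.1], toy_beta)
print(toy_gamma)
```

In a full variational EM loop, an M-step would re-estimate `beta` (and optionally `alpha`) from the `phi` values collected over all documents. That step is omitted here to keep the sketch short.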
Keywords
Variational Inference; KL-Divergence; Expectation-Maximization; Likelihood