Phrase-based Topic and Sentiment Detection and Tracking Model using Incremental HDP

  • Chen, YongHeng (College of Computer Science, Minnan Normal University)
  • Lin, YaoJin (College of Computer Science, Minnan Normal University)
  • Zuo, WanLi (College of Computer Science and Technology, Jilin University)
  • Received : 2017.04.28
  • Accepted : 2017.08.25
  • Published : 2017.12.31

Abstract

Sentiments can profoundly affect individual behavior and decision-making. Given the ever-increasing amount of review information available online, it is desirable to provide an effective sentiment model that detects and organizes this information, improves understanding, and presents the information to consumers in a more constructive way. This study develops a unified phrase-based topic and sentiment detection model, combined with a tracking model using incremental hierarchical Dirichlet allocation (PTSM_IHDP), to discover the evolutionary trend of topic-based sentiments in online reviews. First, PTSM_IHDP assumes that each review document is composed of a series of independent phrases, each of which can be represented by both topic information and sentiment information. Second, PTSM_IHDP relies on an improved time-dependent non-parametric Bayesian model, integrating incremental hierarchical Dirichlet allocation, to estimate the optimal number of topics by incrementally building an up-to-date model. To evaluate its effectiveness, we tested the model on a collected dataset and compared the results with the predictions of traditional models. The results demonstrate the effectiveness and advantages of our model compared to several state-of-the-art methods.
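
As a rough illustration of how a non-parametric Bayesian prior can let the number of topics grow with the data instead of being fixed in advance, the sketch below simulates a Chinese restaurant process, the clustering scheme that underlies Dirichlet-process models. It is not the PTSM_IHDP model itself: the function name `crp_update`, the concentration parameter `alpha`, and the batch sizes are assumptions made for this example, and the sketch ignores words, phrases, sentiment, and time dependency entirely.

```python
import random

def crp_update(counts, n_new, alpha, rng):
    """Assign n_new items to clusters under a Chinese restaurant process.

    counts is a list of current cluster sizes and is updated in place, so
    the same list can be carried from one batch to the next (a toy stand-in
    for incremental updating; hypothetical helper, not the paper's method).
    """
    assignments = []
    for _ in range(n_new):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new cluster
        else:
            counts[k] += 1     # join an existing cluster
        assignments.append(k)
    return assignments

rng = random.Random(0)
counts = []                                          # no clusters before any data
first_batch = crp_update(counts, 100, 1.0, rng)      # first batch of items
clusters_after_first = len(counts)
second_batch = crp_update(counts, 100, 1.0, rng)     # later batch reuses the state
clusters_after_second = len(counts)
```

Because `counts` persists between calls, the second batch can either join clusters found in the first batch or open new ones, so the number of clusters never shrinks and is determined by the data and `alpha` rather than set by hand.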
