Phrase-Chunk Level Hierarchical Attention Networks for Arabic Sentiment Analysis

  • Abdelmawgoud M. Meabed (Faculty of Statistical Studies and Research, Cairo University)
  • Sherif Mahdy Abdou (Information Technology Department, Faculty of Computers and Information, Cairo University)
  • Mervat Hassan Gheith (Faculty of Statistical Studies and Research, Cairo University)
  • Received : 2023.09.05
  • Published : 2023.09.30

Abstract

In this work, we present ATSA, a hierarchical attention deep learning model for Arabic sentiment analysis. ATSA addresses several challenges and limitations that arise when classical models are applied to opinion mining in Arabic. Arabic-specific challenges, including morphological complexity and language sparsity, are addressed by modeling semantic composition at the level of Arabic morphological analysis, after tokenization. ATSA performs phrase-chunk sentiment embedding to provide a broader set of features covering syntactic, semantic, and sentiment information. We use a phrase structure parser to generate syntactic parse trees that serve as a reference for ATSA. This allows semantic and sentiment composition to be modeled following the natural order in which words and phrase-chunks are combined in a sentence. The proposed model was evaluated on three Arabic corpora that correspond to different genres (newswire, online comments, and tweets) and different writing styles (Modern Standard Arabic (MSA) and dialectal Arabic). Experiments showed that each of the proposed contributions in ATSA achieved a significant improvement. The combination of all contributions, which makes up the complete ATSA model, improved classification accuracy by 3% and 2% on the Tweets and Hotel reviews datasets, respectively, compared to existing models.
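
The two-level composition described in the abstract (words combined into phrase-chunks, phrase-chunks combined into a sentence) can be illustrated with a minimal sketch of hierarchical attention pooling. This is not the authors' implementation. It assumes that word vectors are already available, omits the recurrent encoders and learned projections that hierarchical attention networks normally use, and scores each item by a plain dot product with a context vector. All names (`attention_pool`, `encode_sentence`, `word_context`, `chunk_context`) are hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors: Sequence[Vector], context: Vector) -> Vector:
    """Attention-weighted average of `vectors`.

    Assumption: the score of each vector is its dot product with `context`.
    A full model would usually apply a learned projection and tanh first.
    """
    scores = [sum(v * c for v, c in zip(vec, context)) for vec in vectors]
    weights = softmax(scores)
    dim = len(context)
    return [sum(w * vec[d] for w, vec in zip(weights, vectors)) for d in range(dim)]


def encode_sentence(
    chunks: Sequence[Sequence[Vector]],
    word_context: Vector,
    chunk_context: Vector,
) -> Vector:
    """Pool words into phrase-chunk vectors, then chunks into a sentence vector."""
    chunk_vectors = [attention_pool(words, word_context) for words in chunks]
    return attention_pool(chunk_vectors, chunk_context)


# Toy input: one sentence with two phrase-chunks of 2-dimensional word vectors.
toy_chunks = [
    [[1.0, 0.0], [0.0, 1.0]],  # first chunk: two words
    [[1.0, 1.0]],              # second chunk: one word
]
sentence_vector = encode_sentence(toy_chunks, [0.0, 0.0], [0.0, 0.0])
```

With all-zero context vectors every item gets the same weight, so each level reduces to a plain average. In a trained model the context vectors would be learned, and the resulting sentence vector would feed a sentiment classifier.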
