Sentiment Analysis of COVID-19 Tweets: Impact of Pre-processing Step

  • Ayadi, Rami (Department of Computer Sciences, College of Science and Arts of Gurayyat, Jouf University)
  • Shahin, Osama R. (Department of Computer Sciences, College of Science and Arts of Gurayyat, Jouf University)
  • Ghorbel, Osama (Department of Computer Sciences, College of Science and Arts of Gurayyat, Jouf University)
  • Alanazi, Rayan (Department of Computer Sciences, College of Science and Arts of Gurayyat, Jouf University)
  • Saidi, Anouar (Department of Mathematics, College of Science and Arts of Gurayyat, Jouf University)
  • Received: 2021.03.05
  • Published: 2021.03.30


Internet users are increasingly invited to express their opinions on various subjects on social networks, e-commerce sites, news sites, forums, and similar platforms. Much of this information, which expresses feelings, has become a subject of study in research areas such as opinion mining and sentiment analysis. Sentiment analysis is the process of identifying the polarity of the sentiment expressed in the opinions that Internet users post on the web and classifying it as positive, negative, or neutral. In this article, we propose a sentiment analysis tool that detects the polarity of opinions about COVID-19 expressed in Arabic on social media (Twitter), and we examine the impact of the pre-processing phase on opinion classification. The results reveal gaps in this area of research: first, the lack of resources for data collection; second, the greater complexity of pre-processing Arabic, especially its dialects. Nevertheless, the results obtained are promising.
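
As an illustration of what a pre-processing phase for Arabic tweets might involve, the following is a minimal sketch in Python. The abstract does not list the actual steps used in the study, so every step here (removing URLs and mentions, stripping diacritics and elongation characters, unifying alef variants) is an assumption based on common practice, and the function name `preprocess_tweet` is hypothetical.

```python
import re

# Assumed cleaning steps; the abstract does not specify the authors' pipeline.
URL_RE = re.compile(r"https?://\S+|www\.\S+")    # web links
MENTION_RE = re.compile(r"@\w+")                 # Twitter user mentions
DIACRITICS_RE = re.compile(r"[\u064B-\u0652]")   # Arabic short-vowel marks (tashkeel)
TATWEEL = "\u0640"                               # elongation character (kashida)
ALEF_VARIANTS_RE = re.compile("[\u0622\u0623\u0625]")  # alef with madda / hamza
BARE_ALEF = "\u0627"


def preprocess_tweet(text: str) -> str:
    """Return a normalized version of an Arabic tweet (illustrative only)."""
    text = URL_RE.sub(" ", text)                   # drop links
    text = MENTION_RE.sub(" ", text)               # drop @user mentions
    text = text.replace("#", " ")                  # keep hashtag words, drop the sign
    text = DIACRITICS_RE.sub("", text)             # strip diacritics
    text = text.replace(TATWEEL, "")               # strip elongation
    text = ALEF_VARIANTS_RE.sub(BARE_ALEF, text)   # unify alef forms
    return " ".join(text.split())                  # collapse whitespace


if __name__ == "__main__":
    raw = "@user \u0623\u064E\u0647\u0652\u0644\u0627\u064B https://t.co/abc #\u0643\u0648\u0631\u0648\u0646\u0627"
    print(preprocess_tweet(raw))
```

Comparing a classifier trained on raw tweets with one trained on the output of such a function is one way the impact of pre-processing on classification could be measured.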


