Classifying Social Media Users' Stance: Exploring Diverse Feature Sets Using Machine Learning Algorithms

  • Kashif Ayyub (Department of Computer Science, COMSATS University Islamabad, Wah Campus)
  • Muhammad Wasif Nisar (Department of Computer Science, COMSATS University Islamabad, Wah Campus)
  • Ehsan Ullah Munir (Department of Computer Science, COMSATS University Islamabad, Wah Campus)
  • Muhammad Ramzan (Department of Computer Science and Information Technology, University of Sargodha)
  • Received : 2024.02.05
  • Published : 2024.02.29

Abstract

Social media has become part of daily life. Social web channels let users generate content and share their views, opinions, and experiences on particular topics, and researchers use this content across many research areas. Sentiment analysis, one of the most active research areas of the last decade, is the process of extracting people's reviews, opinions, and sentiments. It is applied in diverse sub-areas such as subjectivity analysis, polarity detection, and emotion detection. Stance classification has emerged as a new and interesting research area; it aims to determine whether the author of a piece of content is in favor of, against, or neutral towards a target topic or issue. Stance classification is significant because it has many research applications, including rumor stance classification, stance classification in public forums, claim stance classification, neural attention stance classification, online debate stance classification, and dialogic properties stance classification. This study explores different feature sets, such as lexical, sentiment-specific, and dialog-based features, extracted from the standard datasets in the area. Supervised learning approaches are applied: the generative algorithm Naïve Bayes; the discriminative algorithms Support Vector Machine, Decision Tree, and k-Nearest Neighbor; and the ensemble-based algorithms Random Forest and AdaBoost. The empirical results are evaluated using the standard performance measures of accuracy, precision, recall, and F-measure.
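As an illustration of the performance measures named in the abstract, the following is a minimal sketch of how accuracy, precision, recall, and F-measure could be computed for a three-class stance task. The label names, the `evaluate` helper, the toy data, and the choice of macro-averaging across classes are all assumptions made for this example; the abstract does not state which averaging scheme the study uses.

```python
from collections import Counter

# Hypothetical label set for illustration: in favor, against, neutral.
LABELS = ("favor", "against", "neutral")


def evaluate(gold, predicted, labels=LABELS):
    """Return accuracy and macro-averaged precision, recall, and F-measure."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and of equal length")

    # Per-class true positives, false positives, and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[g] += 1  # gold class g was missed

    precisions, recalls, f_measures = [], [], []
    for label in labels:
        predicted_count = tp[label] + fp[label]
        gold_count = tp[label] + fn[label]
        precision = tp[label] / predicted_count if predicted_count else 0.0
        recall = tp[label] / gold_count if gold_count else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f_measures.append(f_measure)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(gold),
        "precision": sum(precisions) / n,  # macro average (an assumption)
        "recall": sum(recalls) / n,
        "f_measure": sum(f_measures) / n,
    }


# Toy data, invented for this example only.
gold = ["favor", "favor", "against", "against", "neutral", "neutral"]
predicted = ["favor", "against", "against", "against", "neutral", "favor"]
scores = evaluate(gold, predicted)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Macro-averaging weights each stance class equally, which can matter when one class (often "neutral") dominates a dataset; a micro or class-weighted average would be an equally plausible choice.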
