[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.22937/IJCSNS.2021.21.1.1

Comparison of Machine Learning Techniques for Cyberbullying Detection on YouTube Arabic Comments

Alsubait, Tahani (College of Computer and Information Systems, Umm Al-Qura University)
Alfageh, Danyah (College of Computer and Information Systems, Umm Al-Qura University)

Publication Information

International Journal of Computer Science & Network Security / v.21, no.1, 2021, pp. 1-5

Abstract

Cyberbullying is a problem faced in many cultures. Because of their popularity and interactive nature, social media platforms have become a venue for cyberbullying, and social media users from Arab countries have also reported being targets of it. Machine learning techniques have been a prominent approach used by scientists to detect and combat this phenomenon. In this paper, we compare the performance of different machine learning algorithms in cyberbullying detection on a labeled dataset of Arabic YouTube comments. Three machine learning models are considered: Multinomial Naïve Bayes (MNB), Complement Naïve Bayes (CNB), and Logistic Regression (LR). In addition, we experiment with two feature extraction methods: Count Vectorizer and Tfidf Vectorizer. Our results show that, with Count Vectorizer feature extraction, the Logistic Regression model can outperform both the Multinomial and Complement Naïve Bayes models. With Tfidf Vectorizer feature extraction, however, the Complement Naïve Bayes model can outperform the other two models.
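
The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes the scikit-learn implementations of the named vectorizers and classifiers, uses accuracy as the metric (the abstract does not say which metrics were reported), and substitutes a handful of made-up English placeholder comments for the labeled Arabic dataset. The name `compare_models` and all data below are hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import ComplementNB, MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical placeholder data (1 = bullying, 0 = not bullying).
# The real study uses labeled Arabic YouTube comments instead.
TRAIN_TEXTS = [
    "you are stupid and ugly",
    "nobody likes you loser",
    "shut up idiot",
    "you are a worthless loser",
    "great video thanks for sharing",
    "I love this song",
    "very helpful explanation thank you",
    "nice work keep it up",
]
TRAIN_LABELS = [1, 1, 1, 1, 0, 0, 0, 0]
TEST_TEXTS = ["what a stupid loser", "thanks for the helpful video"]
TEST_LABELS = [1, 0]

# The two feature extraction methods and three models named in the abstract.
VECTORIZERS = {"count": CountVectorizer, "tfidf": TfidfVectorizer}
MODELS = {
    "MNB": MultinomialNB,
    "CNB": ComplementNB,
    "LR": lambda: LogisticRegression(max_iter=1000),
}


def compare_models(train_texts, train_labels, test_texts, test_labels):
    """Fit every vectorizer/model pair and return accuracy per pair."""
    results: dict[tuple[str, str], float] = {}
    for vec_name, make_vectorizer in VECTORIZERS.items():
        for model_name, make_model in MODELS.items():
            # A pipeline fits the vectorizer on training text only,
            # then applies the same vocabulary to the test text.
            pipeline = make_pipeline(make_vectorizer(), make_model())
            pipeline.fit(train_texts, train_labels)
            predictions = pipeline.predict(test_texts)
            results[(vec_name, model_name)] = accuracy_score(test_labels, predictions)
    return results


results = compare_models(TRAIN_TEXTS, TRAIN_LABELS, TEST_TEXTS, TEST_LABELS)
for (vec_name, model_name), accuracy in sorted(results.items()):
    print(f"{vec_name:>5} + {model_name}: accuracy = {accuracy:.2f}")
```

On a toy set this small the scores say nothing about which model is better; the sketch only shows the shape of a six-way comparison (two feature extractors by three classifiers). Reproducing the reported ranking would require the actual dataset and the authors' preprocessing and evaluation setup.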

Keywords

Cyberbullying; Arabic dataset; Machine Learning; YouTube

Citations & Related Records

Reference

1	Djedjiga Mouheb, Masa Hilal Abushamleh, Maya Hilal Abushamleh, Zaher Al Aghbari, and Ibrahim Kamel. "Real-time detection of cyberbullying in Arabic twitter streams". In: 2019 10th IFIP International Conference on New Technologies, Mobility and Security (NTMS). IEEE. 2019, pp. 1-5.
2	Ted Feinberg and Nicole Robey. "Cyberbullying". In: The Education Digest 74.7 (2009), p. 26.
3	Batoul Haidar, Maroun Chamoun, and Ahmed Serhrouchni. "Arabic Cyberbullying Detection: Enhancing Performance by Using Ensemble Machine Learning". In: 2019 International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (Smart-Data). IEEE. 2019, pp. 323-327.
4	Peter K Smith, Jess Mahdavi, Manuel Carvalho, Sonja Fisher, Shanette Russell, and Neil Tippett. "Cyberbullying: Its nature and impact in secondary school pupils". In: Journal of child psychology and psychiatry 49.4 (2008), pp. 376-385.
5	Djedjiga Mouheb, Rutana Ismail, Shaheen Al Qaraghuli, Zaher Al Aghbari, and Ibrahim Kamel. "Detection of Offensive Messages in Arabic Social Media Communications". In: 2018 International Conference on Innovations in Information Technology (IIT). IEEE. 2018, pp. 24-29.
6	Azalden Alakrot, Liam Murray, and Nikola S Nikolov. "Dataset Construction for the Detection of Anti-Social Behaviour in Online Communication in Arabic". In: Procedia Computer Science 142 (2018), pp. 174-181.
7	Batoul Haidar, Maroun Chamoun, and Ahmed Serhrouchni. "A multilingual system for cyberbullying detection: Arabic content detection using machine learning". In: Advances in Science, Technology and Engineering Systems Journal 2.6 (2017), pp. 275-284.
8	A. Prabhat and V. Khullar. "Sentiment classification on big data using Naive bayes and logistic regression". In: 2017 International Conference on Computer Communication and Informatics (ICCCI). 2017, pp. 1-5. doi:10.1109/ICCCI.2017.8117734.
9	Azalden Alakrot, Liam Murray, and Nikola S Nikolov. "Towards accurate detection of offensive language in online communication in Arabic". In: Procedia Computer Science 142 (2018), pp. 315-320.
10	Jason D Rennie, Lawrence Shih, Jaime Teevan, and David R Karger. "Tackling the poor assumptions of naive bayes text classifiers". In: Proceedings of the 20th international conference on machine learning (ICML-03). 2003, pp. 616-623.
11	Benaissa Azzeddine Rachid, Harbaoui Azza, and Hajjami Henda Ben Ghezala. "Classification of Cyberbullying Text in Arabic". In: 2020 International Joint Conference on Neural Networks (IJCNN). IEEE. 2020, pp. 1-7.
12	G. Singh, B. Kumar, L. Gaur, and A. Tyagi. "Comparison between Multinomial and Bernoulli Naive Bayes for Text Classification". In: 2019 International Conference on Automation, Computational and Technology Management (ICACTM). 2019, pp. 593-596. doi:10.1109/ICACTM.2019.8776800.
13	UNICEF. Cyberbullying: What is it and how to stop it. Feb. 2020.
14	Ghada M Abaido. "Cyberbullying on social media platforms among university students in the United Arab Emirates". In: International Journal of Adolescence and Youth 25.1 (2020), pp. 407-420.
15	Djedjiga Mouheb, Raghad Albarghash, Mohamad Fouzi Mowakeh, Zaher Al Aghbari, and Ibrahim Kamel. "Detection of Arabic Cyberbullying on Social Networks using Machine Learning". In: 2019 IEEE/ACS 16th International Conference on Computer Systems and Applications (AICCSA). IEEE. 2019, pp. 1-5.
16	Alaa Tharwat. "Classification assessment methods". In: Applied Computing and Informatics (2020).