An Enhanced Text Mining Approach using Ensemble Algorithm for Detecting Cyber Bullying

  • Z. Sunitha Bai (Department of Computer Science and Engineering, R.V.R. and J.C. College of Engineering, Acharya Nagarjuna University)
  • Sreelatha Malempati (Department of Computer Science and Engineering, R.V.R. and J.C. College of Engineering)
  • Received : 2023.05.05
  • Published : 2023.05.30

Abstract

Text mining (TM) is widely used to process unstructured text documents and the data they contain across many domains. Text classification is one of its central tasks. It is applied to movie reviews, product reviews on e-commerce websites, sentiment analysis, topic modeling, and the detection of cyberbullying in social media messages. Cyberbullying is the abuse of a person with insulting language; it includes personal abuse, sexual harassment, and other forms of abuse. Several existing systems detect bullying words based on their context in social networking sites (SNS), which have become a platform for bullying. In this paper, an enhanced text mining approach using an ensemble algorithm (ETMA) is developed to address several problems of traditional algorithms and to improve accuracy, processing time, and the quality of the results. ETMA analyzes bullying text on SNS such as Facebook and Twitter. ETMA is applied to a synthetic dataset, collected from various data sources, that consists of 5k messages belonging to bullying and non-bullying classes. Performance is reported in terms of precision, recall, F1-score, and accuracy.
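The abstract does not specify how ETMA combines its base learners, so the following is only a minimal sketch of the general idea: a majority-vote ensemble over a few rule-based classifiers, scored with the four reported metrics. The toy lexicon, the three base classifiers, the sample messages, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import re

# Hypothetical toy lexicon; the paper's actual features are not given in the abstract.
INSULTS = {"stupid", "loser", "idiot", "ugly"}
SECOND_PERSON = {"you", "your", "you're"}


def tokenize(text):
    """Lowercase the message and keep alphabetic tokens (apostrophes allowed)."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_clf(text):
    """Base classifier 1: flag messages containing a word from the toy lexicon."""
    return int(any(tok in INSULTS for tok in tokenize(text)))


def second_person_clf(text):
    """Base classifier 2: flag messages that address someone directly."""
    return int(any(tok in SECOND_PERSON for tok in tokenize(text)))


def exclamation_clf(text):
    """Base classifier 3: flag messages that end with an exclamation mark."""
    return int(text.strip().endswith("!"))


def majority_vote(text, classifiers):
    """Ensemble prediction: 1 (bullying) if more than half of the classifiers say 1."""
    votes = [clf(text) for clf in classifiers]
    return int(sum(votes) * 2 > len(votes))


def evaluate(y_true, y_pred):
    """Compute precision, recall, F1-score and accuracy for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Illustrative messages with hand-assigned labels (1 = bullying, 0 = non-bullying).
MESSAGES = [
    ("you are such a loser!", 1),
    ("see you at the game tonight", 0),
    ("nobody likes your stupid face", 1),
    ("great job on the exam!", 0),
    ("what an idiot", 1),
    ("thank you so much!", 0),
]

CLASSIFIERS = [lexicon_clf, second_person_clf, exclamation_clf]
y_true = [label for _, label in MESSAGES]
y_pred = [majority_vote(text, CLASSIFIERS) for text, _ in MESSAGES]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

An odd number of binary base classifiers is used so that the vote can never tie. In practice the base learners would be trained models rather than hand-written rules; the voting and scoring steps would stay the same.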
