http://dx.doi.org/10.22937/IJCSNS.2021.21.11.40

Slangs and Short forms of Malay Twitter Sentiment Analysis using Supervised Machine Learning  

Yin, Cheng Jet (Fakulti Teknologi Maklumat dan Komunikasi, Universiti Teknikal Malaysia Melaka (UTeM))
Ayop, Zakiah (Information Security Forensics and Computer Networking (INSFORNET))
Anawar, Syarulnaziah (Information Security Forensics and Computer Networking (INSFORNET))
Othman, Nur Fadzilah (Information Security Forensics and Computer Networking (INSFORNET))
Zainudin, Norulzahrah Mohd (Jabatan Sains Komputer, Fakulti Sains dan Teknologi Pertahanan, Universiti Pertahanan Nasional Malaysia (UPNM))
Publication Information
International Journal of Computer Science & Network Security / v.21, no.11, 2021, pp. 294-300
Abstract
Society now relies on social media every day, which motivates the question of which supervised machine learning algorithms used in sentiment analysis are more accurate at detecting Malay internet slang and short forms that can be offensive to a person. This paper aims to determine which of the chosen supervised machine learning algorithms achieves higher accuracy in detecting internet slang and short forms. To analyze the results of the classifiers, we chose two datasets: one is political topic-based, and the other is the same set mixed with 50 tweets per targeted keyword. The datasets are manually labelled as positive or negative, and the 275 tweets are then separated into training and testing sets. The performance of the Naïve Bayes and Random Forest classifiers is then analyzed and evaluated. Our experimental results show that Random Forest is a better classifier than Naïve Bayes.
Keywords
Naïve Bayes; Random Forest; Supervised Machine Learning; Twitter
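As a minimal sketch of the workflow described in the abstract (manually labelled tweets, a training/testing split, and accuracy-based evaluation), the following Python snippet trains a small multinomial Naïve Bayes classifier with add-one smoothing using only the standard library. The tweets, labels, and function names are hypothetical illustrations and are not the paper's data or implementation. A Random Forest baseline is omitted here because it would normally rely on a machine-learning library.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Whitespace tokenization keeps short forms such as "x" or "mmg" intact.
    return text.lower().split()


def train_nb(samples):
    """Count label and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # tokens unseen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, samples):
    """Fraction of samples whose predicted label matches the manual label."""
    correct = sum(predict_nb(model, text) == label for text, label in samples)
    return correct / len(samples)


# Hypothetical, manually labelled toy tweets (not taken from the paper).
train_set = [
    ("best gila servis ni", "positive"),
    ("mmg suka sgt", "positive"),
    ("terbaik la korang", "positive"),
    ("x suka lgsg", "negative"),
    ("teruk gila servis ni", "negative"),
    ("benci la mcm ni", "negative"),
]
test_set = [
    ("mmg terbaik", "positive"),
    ("teruk la x suka", "negative"),
]

model = train_nb(train_set)
print("Naive Bayes accuracy on the toy test set:", accuracy(model, test_set))
```

With a real dataset, the same `accuracy` function could be applied to the predictions of a second classifier, so that both models are compared on the same held-out tweets.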