An Extended Work Architecture for Online Threat Prediction in Tweeter Dataset

  • Received : 2021.01.05
  • Published : 2021.01.30

Abstract

Social networking platforms have become a popular way for people to interact and meet on the Internet. They provide a way to keep in touch with friends, family, colleagues, business partners, and many others. Among the various social networking sites, Twitter is one of the fastest-growing, where users can read the news, share ideas, and discuss issues. Because of its vast popularity, the accounts of legitimate users are vulnerable to a large number of threats, of which spam and malware are among the most damaging. To provide seamless service, Twitter must therefore be secured against malicious users by detecting them in advance. Many researchers have used Machine Learning (ML) based approaches to detect spammers on Twitter. This research aims to devise a secure system based on a hybrid of the Cosine and Soft Cosine similarity measures, combined with a Genetic Algorithm (GA) and an Artificial Neural Network (ANN), to secure the Twitter network against spammers. The similarity among tweets in the Twitter dataset is determined using Cosine together with Soft Cosine similarity. The GA is used to improve training and minimize training error by selecting the most suitable features according to the designed fitness function. Tweets are classified as spammer or non-spammer by the ANN structure along with a voting rule. True Positive Rate (TPR), False Positive Rate (FPR), and classification accuracy are used as the evaluation parameters to assess the performance of the system designed in this research. The simulation results reveal that the proposed model outperforms existing state-of-the-art approaches.
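The similarity measures and evaluation parameters named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions of cosine similarity, soft cosine similarity, TPR, FPR, and accuracy; it is not the authors' implementation. The function names, the three-term vocabulary, the toy tweet vectors, and the term-similarity matrix `S` are all assumptions made up for illustration.

```python
import math


def _bilinear(a, S, b):
    """Return a^T S b for plain Python lists (S is a square matrix)."""
    return sum(a[i] * S[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))


def soft_cosine(a, b, S):
    """Soft cosine: like cosine, but S[i][j] scores how related terms i and j are."""
    denom = math.sqrt(_bilinear(a, S, a)) * math.sqrt(_bilinear(b, S, b))
    return _bilinear(a, S, b) / denom if denom else 0.0


def cosine(a, b):
    """Ordinary cosine similarity, i.e. soft cosine with the identity matrix."""
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return soft_cosine(a, b, identity)


def evaluation_rates(tp, fp, tn, fn):
    """Return (TPR, FPR, accuracy) from confusion-matrix counts."""
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    return tpr, fpr, accuracy


# Hypothetical vocabulary: ["free", "prize", "gift"].
tweet_a = [1, 1, 0]  # "free prize"
tweet_b = [1, 0, 1]  # "free gift"

# Assumed term-similarity matrix: "prize" and "gift" are treated as 0.8 similar.
S = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.8],
    [0.0, 0.8, 1.0],
]

plain_score = cosine(tweet_a, tweet_b)
soft_score = soft_cosine(tweet_a, tweet_b, S)
```

For these toy vectors the plain cosine score is about 0.5, because only "free" is shared, while the soft cosine score is about 0.9, because the related terms "prize" and "gift" also contribute. This is why a soft measure can help group spam tweets that are reworded rather than copied verbatim.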
