[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.3837/tiis.2015.08.014

Modeling and Evaluating Information Diffusion for Spam Detection in Micro-blogging Networks

Chen, Kan (School of Computer, National University of Defense Technology)
Zhu, Peidong (School of Computer, National University of Defense Technology)
Chen, Liang (School of Computer, National University of Defense Technology)
Xiong, Yueshan (School of Computer, National University of Defense Technology)

Publication Information

KSII Transactions on Internet and Information Systems (TIIS) / v.9, no.8, 2015, pp. 3005-3027

Abstract

Spam has become one of the top threats to micro-blogging networks, taking the form of rumor spreading, advertisement abuse, and malware distribution. As micro-blogging grows more popular, the problem will worsen. Prior detection tools are either designed for specific types of spam or are not robust enough: spammers can easily evade detection by adjusting their behavior. In this paper, we present a novel model to quantitatively evaluate information diffusion in micro-blogging networks. Under this model, we find that spam posts differ widely from non-spam posts. First, the propagation of non-spam posts results mostly from the authors' followers, whereas that of spam posts comes mainly from strangers. Second, non-spam posts last relatively longer than spam posts. Third, non-spam posts always get their first reposts or comments much sooner than spam posts. Using the features defined in our model, we propose an RBF-based approach to detect spam. Unlike previous work, in which features are extracted from individual profiles or content, our diffusion features are determined not by any single user but by the crowd. Our method is therefore more robust, because a change in any single user's behavior does not affect its effectiveness. Moreover, although spam varies in type and form, it is propagated in the same way, so our method is effective for all types of spam. We evaluate the effectiveness of our model on real data crawled from the leading micro-blogging services of China. The experimental results show that our model achieves high precision and recall.
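
The three diffusion signals named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formal feature definitions, so everything below is an assumption made for illustration: the `Interaction` record, the `diffusion_features` function, the choice to measure time in seconds since the post was published, and the definition of "lifetime" as the time of the last repost or comment are all hypothetical, not the authors' model.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Interaction:
    """One repost or comment on a post (hypothetical record layout)."""
    user: str         # account that reposted or commented
    timestamp: float  # seconds elapsed since the post was published (assumed unit)


def diffusion_features(
    interactions: List[Interaction], followers: Set[str]
) -> Tuple[float, float, Optional[float]]:
    """Return (follower_ratio, lifetime, first_response_delay) for one post.

    follower_ratio: share of interactions coming from the author's followers.
    lifetime: time of the last interaction (an assumed proxy for how long
        the post stays active).
    first_response_delay: time of the first interaction, or None if the
        post received no reposts or comments.
    """
    if not interactions:
        return 0.0, 0.0, None

    times = sorted(i.timestamp for i in interactions)
    from_followers = sum(1 for i in interactions if i.user in followers)

    follower_ratio = from_followers / len(interactions)
    lifetime = times[-1]
    first_response_delay = times[0]
    return follower_ratio, lifetime, first_response_delay


if __name__ == "__main__":
    # Toy data: two interactions from followers, two from strangers.
    log = [
        Interaction("alice", 60),
        Interaction("stranger1", 120),
        Interaction("bob", 3600),
        Interaction("stranger2", 7200),
    ]
    print(diffusion_features(log, followers={"alice", "bob"}))
```

Under the findings stated in the abstract, a non-spam post would tend to show a higher follower ratio, a longer lifetime, and a shorter first-response delay than a spam post. A feature vector of this kind could then be passed to a classifier; the abstract names an RBF-based approach but does not specify it further, so no classifier is sketched here.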

Keywords

spam detection; information diffusion; micro-blogging; RBF
