A Study on Clustering of SNS Spam Using a Heuristic Method

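  • 권영만 (Department of Medical IT Marketing, Eulji University) ;
  • 이인락 (Department of Medical IT Marketing, Eulji University) ;
  • 김명관 (Department of Medical IT Marketing, Eulji University)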
  • Received : 2014.09.19
  • Accepted : 2014.12.12
  • Published : 2014.12.31

Abstract

Social networking services (SNS) serve a useful purpose in fostering friendship and maintaining personal connections. However, corporate and individual spammers exploit the following mechanism to post spam tweets, exposing them to many users and causing inconvenience. Earlier studies have examined such spam tweets, but a lack of sophistication and various other factors made accurate classification and detection difficult. In this paper, we describe the characteristics of spammers, the classification criteria, and the classification method. Among these characteristics, we use the link rate and the difference between an account's followers and the accounts it follows to propose classification criteria for spammer accounts. In the experiment, spam accounts and non-spam accounts were selected at random and evaluated according to these criteria. The spam accounts had a link rate of 68% and a followers/following ratio of 27581.5, whereas the non-spam accounts had a link rate of 6.12% and a followers/following ratio of 1.26.
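The two features named in the abstract can be illustrated with a minimal sketch. The abstract does not define the link rate precisely or state any decision thresholds, so the following are assumptions made only for illustration: the link rate is taken to be the fraction of an account's tweets that contain at least one URL, and the `Account` type, the `looks_like_spam` rule, and both threshold values are hypothetical rather than the authors' method.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumption: a tweet "contains a link" if it matches a simple http(s) URL pattern.
URL_PATTERN = re.compile(r"https?://\S+")


@dataclass
class Account:
    """Hypothetical container for the data needed by the two features."""
    followers: int
    following: int
    tweets: List[str]


def link_ratio(tweets: List[str]) -> float:
    """Fraction of tweets that contain at least one URL (0.0 for no tweets)."""
    if not tweets:
        return 0.0
    with_link = sum(1 for text in tweets if URL_PATTERN.search(text))
    return with_link / len(tweets)


def follower_following_ratio(followers: int, following: int) -> float:
    """Followers divided by following; infinite if the account follows nobody."""
    if following == 0:
        return float("inf") if followers > 0 else 0.0
    return followers / following


def looks_like_spam(account: Account,
                    link_threshold: float = 0.5,
                    ratio_threshold: float = 100.0) -> bool:
    """Hypothetical rule: flag accounts that exceed BOTH illustrative thresholds.

    The thresholds are placeholders, not values reported in the paper.
    """
    return (link_ratio(account.tweets) >= link_threshold
            and follower_following_ratio(account.followers,
                                         account.following) >= ratio_threshold)


suspect = Account(
    followers=1000,
    following=2,
    tweets=[
        "buy now http://a.example",
        "hello world",
        "see https://b.example today",
        "big deal http://c.example",
    ],
)
ordinary = Account(
    followers=120,
    following=100,
    tweets=["lunch with friends"] * 9 + ["photo http://d.example"],
)
```

With these made-up accounts, `suspect` has a link rate of 0.75 and a ratio of 500 and is flagged, while `ordinary` has a link rate of 0.1 and a ratio of 1.2 and is not. In practice the thresholds would have to be chosen from labeled data such as the spam and non-spam samples described in the abstract.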
