http://dx.doi.org/10.3745/KIPSTC.2006.13C.6.733

Spam Message Filtering with Bayesian Approach for Internet Communities  

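Kim, Bum-Bae (Department of Computer Engineering, Sungkyunkwan University)
Choi, Hyoung-Kee (School of Information and Communication Engineering, Sungkyunkwan University)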
Abstract
Spam messages have been causing widespread damage on the Internet. One source of the problem is messages posted anonymously to bulletin boards in Internet communities. This type of spam message tries to advertise products, harm others' reputations, deliver religious messages, and so on. In this paper, we present a spam message filter based on the Bayesian approach. To make the filter more useful for bulletin boards in Internet communities, we built it to classify spam messages into six categories, such as advertisement, pornography, abuse, religion, and other. We tested the filter on messages posted on popular web sites.
Keywords
Spam Message; Community Spam; Machine-learning Filtering Technique; Bayesian Approach
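
The Bayesian approach named in the abstract can be illustrated with a small multi-category text classifier. The following is a minimal sketch, not the authors' implementation: it assumes a multinomial naive Bayes model with add-one smoothing and a whitespace tokenizer, and the class name NaiveBayesFilter, the sample messages, and the "normal" category are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesFilter:
    """Multinomial naive Bayes with add-one (Laplace) smoothing (illustrative)."""

    def __init__(self):
        self.doc_counts = Counter()              # messages seen per category
        self.word_counts = defaultdict(Counter)  # word frequencies per category
        self.vocabulary = set()                  # all words seen in training

    @staticmethod
    def tokenize(text):
        # Assumption: lowercase and split on whitespace; a real filter
        # would need language-aware tokenization.
        return text.lower().split()

    def train(self, text, category):
        words = self.tokenize(text)
        self.doc_counts[category] += 1
        self.word_counts[category].update(words)
        self.vocabulary.update(words)

    def log_scores(self, text):
        """Return log P(category) + sum of log P(word | category) per category."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        words = self.tokenize(text)
        scores = {}
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # log prior
            for word in words:
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            scores[category] = score
        return scores

    def classify(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Made-up training messages; the category names are assumptions.
TRAINING = [
    ("buy cheap watches now discount sale", "advertisement"),
    ("limited offer buy now free shipping", "advertisement"),
    ("you are an idiot and a liar", "abuse"),
    ("shut up you stupid idiot", "abuse"),
    ("does anyone know when the meeting starts", "normal"),
    ("thanks for the helpful answer yesterday", "normal"),
]

spam_filter = NaiveBayesFilter()
for text, category in TRAINING:
    spam_filter.train(text, category)

print(spam_filter.classify("cheap discount buy now"))
```

Scores are summed in log space so that long messages do not underflow to zero, and adding further categories only requires training messages labeled with the new category name.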