
Features Reduction using Logistic Regression for Spam Filtering  

Jung, Yong-Gyu (Department of Medical IT Marketing, Eulji University)
Lee, Bum-Joon (Medical Computer Science Major, School of Medical Industry, Eulji University)
Publication Information
The Journal of the Institute of Internet, Broadcasting and Communication / v.10, no.2, 2010, pp. 13-18
Abstract
Today, the large volume of spam occupying mail servers and network storage causes problems such as overload, and users must spend time and resources deleting it. Automatic spam filtering is therefore essential, and many spam filters have been developed to address the problem. Unlike traditional methods such as Naive Bayes, the approach here uses PCA to reduce the high-dimensional spam data set to a few principal axes, which lowers the computational burden, and then applies logistic regression analysis to classify messages and filter the spam. The results were positive in terms of both speed and performance.
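The pipeline described in the abstract (PCA for feature reduction, then logistic regression for classification) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes NumPy, uses synthetic data in place of a real spam corpus, and the feature count, number of retained components, learning rate, and all function names are illustrative assumptions.

```python
import numpy as np

def pca_fit(X, k):
    """Return the column means of X and its top-k principal axes."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def pca_transform(X, mean, axes):
    """Project X onto the retained principal axes."""
    return (X - mean) @ axes.T

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(Z, y, lr=0.1, epochs=500):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    w = np.zeros(Z.shape[1])
    b = 0.0
    for _ in range(epochs):
        err = sigmoid(Z @ w + b) - y      # gradient of log-loss w.r.t. logits
        w -= lr * (Z.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b

def predict(Z, w, b):
    """Label a message as spam (1) when the predicted probability is >= 0.5."""
    return (sigmoid(Z @ w + b) >= 0.5).astype(int)

# Synthetic stand-in for a spam data set (assumption): 200 messages with
# 50 numeric features; "spam" rows are shifted upward in 5 of the features.
rng = np.random.default_rng(0)
n, d = 200, 50
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, d))
X[y == 1, :5] += 3.0

# Shuffle, then hold out 50 messages for evaluation.
order = rng.permutation(n)
train, test = order[:150], order[150:]

# Reduce 50 features to 3 principal components, fitted on training data only.
mean, axes = pca_fit(X[train], k=3)
Z_train = pca_transform(X[train], mean, axes)
Z_test = pca_transform(X[test], mean, axes)

w, b = fit_logistic(Z_train, y[train])
accuracy = (predict(Z_test, w, b) == y[test]).mean()
print(f"held-out accuracy with {Z_train.shape[1]} components: {accuracy:.3f}")
```

The classifier here is trained on 3 inputs instead of 50, which is the source of the reduced computational burden mentioned in the abstract. In practice the number of retained components would be chosen from the explained variance of the real data.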
Keywords
Logistic Regression Analysis; Feature Reduction; Principal Component Analysis; Spam mail;