• Title/Summary/Keyword: spam mail filter

Improved Spam Filter via Handling of Text Embedded Image E-mail

  • Youn, Seongwook;Cho, Hyun-Chong
    • Journal of Electrical Engineering and Technology / v.10 no.1 / pp.401-407 / 2015
  • The rise of image spam, in which the text message is embedded in an attached image to defeat spam filtering, is a major problem for current e-mail systems. For nearly a decade, content-based filtering using text classification or machine learning has been the major trend in anti-spam filtering. Recently, spammers have tried to defeat anti-spam filters with many techniques, and embedding text in an attached image is one of them. We previously proposed an ontology-based spam filter. However, that system handles only text e-mail, while the percentage of e-mails with attached images is increasing sharply. The contribution of this paper is to add image e-mail handling to the anti-spam filtering system while keeping the advantages of the previous text-based filter. The proposed system also gives a low false negative value, which means that a user's valuable e-mail is rarely regarded as spam.

A design of the SMBC Platform using the Fit FA-Finder (Fit-FA Finder를 이용한 SMBC 플랫폼 설계)

  • Park, Nho-Kyung;Han, Sung-Ho;Seo, Sang-Jin;Jin, Hyun-Joon
    • Journal of IKEEE / v.10 no.1 s.18 / pp.49-54 / 2006
  • Recently, e-mail has become an important means of communication in IT societies, but the growth of spam mail creates various social problems. Although many organizations and corporations have been researching spam mail blocking technologies, the variety of such technologies brings considerable cost and system complexity. In this paper, we design the SMBC (Spam Mail Blocking Center) using the Fit-FA (Filtering Algorithm) Finder. The Fit-FA Finder searches for and applies the spam mail filtering algorithm best suited to the type of spam mail. The performance of a spam mail filtering system depends on the order in which its spam filters are applied. With the designed Fit-FA Finder, unnecessary filtering steps, processing time, and load are reduced compared with the fixed-order filter application of existing spam mail blocking systems.

An Automatic Spam e-mail Filter System Using χ2 Statistics and Support Vector Machines (카이 제곱 통계량과 지지벡터기계를 이용한 자동 스팸 메일 분류기)

  • Lee, Songwook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2009.05a / pp.592-595 / 2009
  • We propose an automatic spam mail classifier for e-mail data using Support Vector Machines (SVM). We use the lexical form of a word and its part-of-speech (POS) tag as features. We select useful features with $\chi^2$ statistics and represent each selected feature by its term frequency (TF) and inverse document frequency (IDF) values. After the SVM is trained on these features, it classifies each e-mail as spam or not. In our experiment, we achieved 82.7% accuracy on e-mail data collected from a web mail system.
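
The TF and IDF weighting mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes raw counts for TF and log(N/df) for IDF, and the "word/POS" token format and the function name `tf_idf` are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumes TF = raw count and IDF = log(N / df); other variants exist.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)  # raw term counts in this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights

# Toy tokens in a hypothetical "word/POS" form.
docs = [["free/JJ", "offer/NN", "free/JJ"], ["meeting/NN", "offer/NN"]]
weights = tf_idf(docs)
```

A term that occurs in every document, such as `offer/NN` here, receives weight zero, so it contributes nothing to the feature vector passed to the classifier.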

Studying on Expansion of Realtime Blocking List Conception for Spam E-mail Filtering (스팸 메일 차단을 위한 RBL개념의 확장에 관한 연구)

  • Kim, Jong-Min;Kim, Hion-Gun;Kim, Bong-Gi
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.10 / pp.1808-1814 / 2008
  • To deal with recently widespread types of spam, this paper proposes extending the RBL function commonly used for spam e-mail filtering: URLs contained in the original e-mail are extracted and applied to the RBL. Botnets, which are now widely used to send spam, pose a problem that cannot be solved with sender addresses, because the sending addresses are distributed. Since such spam is generally sent from the infected zombie PCs of individual users, the sender address itself is ineffective and meaningless for use in an RBL. As an effective way to filter spam sent by botnets, this paper analyzes the URLs contained in the original spam e-mails and proposes how to improve the filtering rate based on the distribution data of the URL sites that lure users. The paper describes the mechanism by which botnets send spam e-mails and methods for recognizing such e-mails. To gather spam e-mails for analysis, it also carries out an experiment with a spam trap system. By analyzing the spam e-mails received during the experiment period, the paper shows that the expanded RBL method, which uses the URLs contained in spam e-mails, is an effective way to improve spam e-mail filtering.
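
The URL-based extension of the RBL described in this abstract can be sketched as follows. This is only a sketch under stated assumptions: the blocklist is a plain in-memory set rather than a real DNS-based RBL lookup, and the names `extract_hosts`, `is_listed`, and the example hosts are hypothetical.

```python
import re
from urllib.parse import urlparse

# Simple pattern for http(s) URLs in a plain-text body (an assumption;
# real mail needs MIME decoding and HTML parsing first).
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")

def extract_hosts(body):
    """Collect the lowercase host names of all URLs found in the body."""
    hosts = set()
    for url in URL_PATTERN.findall(body):
        url = url.rstrip(".,;:!?)")  # drop trailing sentence punctuation
        try:
            host = urlparse(url).hostname
        except ValueError:  # malformed URL, e.g. a broken IPv6 literal
            continue
        if host:
            hosts.add(host)
    return hosts

def is_listed(body, url_blocklist):
    """True if any host linked from the body appears in the blocklist."""
    return not extract_hosts(body).isdisjoint(url_blocklist)

blocklist = {"pills.example.com"}  # hypothetical blocklist entries
spam = "Visit http://pills.example.com/offer?id=1 now"
ham = "Agenda is at https://intranet.example.org/agenda."
```

Here `is_listed(spam, blocklist)` is true regardless of which zombie PC sent the message, which is the point of keying the list on the advertised URL instead of the sender address.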

Spam Filter by Using X2 Statistics and Support Vector Machines (카이제곱 통계량과 지지벡터기계를 이용한 스팸메일 필터)

  • Lee, Song-Wook
    • The KIPS Transactions:PartB / v.17B no.3 / pp.249-254 / 2010
  • We propose an automatic spam filter for e-mail data using Support Vector Machines (SVM). We use the lexical form of a word and its part-of-speech (POS) tag as features and select features by chi-square statistics. In our experiments, we represent each feature by TF (term frequency), TF-IDF, and binary weights. After the SVM is trained on the selected features, it classifies each e-mail as spam or not. In our experiments, the selected features improved the performance of the system, and we achieved an overall accuracy of 98.9% on the TREC05-p1 spam corpus.
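
The chi-square feature selection mentioned in this abstract can be sketched with the standard 2x2 contingency-table statistic. The abstract does not give the exact formula used, so this form, the function names, and the toy counts are assumptions for illustration.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for one term from a 2x2 contingency table.

    a: spam mails containing the term    b: legitimate mails containing it
    c: spam mails without the term       d: legitimate mails without it
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:  # term occurs in all mails or in none: no information
        return 0.0
    return n * (a * d - b * c) ** 2 / denom

def select_features(counts, n_spam, n_ham, k):
    """Return the k terms with the highest chi-square score.

    counts maps term -> (spam mails with term, legitimate mails with term).
    """
    scores = {
        term: chi_square(s, h, n_spam - s, n_ham - h)
        for term, (s, h) in counts.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical document counts over 10 spam and 10 legitimate mails.
counts = {"free": (10, 0), "the": (5, 5), "meeting": (1, 8)}
top = select_features(counts, n_spam=10, n_ham=10, k=2)
```

A term spread evenly over both classes, such as `the`, scores zero and is dropped, while terms concentrated in one class are kept.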

Comparing Feature Selection Methods in Spam Mail Filtering

  • Kim, Jong-Wan;Kang, Sin-Jae
    • Proceedings of the Korea Society of Information Technology Applications Conference / 2005.11a / pp.17-20 / 2005
  • In this work, we compare several feature selection methods for spam mail filtering. The proposed fuzzy inference method outperforms the information gain and chi-squared test methods as a feature selection method in terms of error rate. Because the body of a junk mail contains little text, it provides insufficient hints to distinguish spam from legitimate mail. To address this problem, we follow the hyperlinks contained in the e-mail body, fetch the contents of the remote web pages, and extract hints from both the original e-mail body and the fetched pages. A two-phase approach is applied to filter spam mail, in which definite hints are used first and less definite textual information is used second. In our experiment, the proposed two-phase method improved recall by 32.4% on average over using only the first phase or only the second phase.

An Approach to Detect Spam E-mail with Abnormal Character Composition (비정상 문자 조합으로 구성된 스팸 메일의 탐지 방법)

  • Lee, Ho-Sub;Cho, Jae-Ik;Jung, Man-Hyun;Moon, Jong-Sub
    • Journal of the Korea Institute of Information Security & Cryptology / v.18 no.6A / pp.129-137 / 2008
  • As the use of the Internet increases, the distribution of spam mail has also vastly increased. E-mail was originally used mainly for exchanging information, but it is now more frequently used for advertising and malware distribution. This is a serious problem because spam consumes a large amount of limited Internet resources; furthermore, extensive computer, network, and human resources are consumed to prevent it. As a result, much research is being done to prevent and filter spam. Current research addresses spam made of readable sentences that do not use proper grammar. This type of spam cannot be classified by earlier vocabulary analysis or document classification methods. This paper proposes a method to filter spam that uses the subject of the mail and n-grams for indexing, and Bayesian and SVM algorithms for classification.
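
The n-gram indexing of the mail subject mentioned in this abstract can be sketched as follows. The abstract does not state the n-gram unit or length, so character trigrams are an assumption here, the function names are hypothetical, and the Bayesian and SVM classification stages are omitted.

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Return all overlapping character n-grams of the text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def index_subject(subject, n=3):
    """Count the character n-grams of a lowercased mail subject."""
    return Counter(char_ngrams(subject.lower(), n))

grams = char_ngrams("v1agra")
features = index_subject("V1agra V1agra")
```

Character n-grams do not depend on a word list, which is why they can still produce features for obfuscated strings such as `v1agra` that vocabulary-based analysis would miss.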

A Faster Spam Mail Prevention Algorithm Based on userID (userID 기반의 빠른 메일 차단 알고리즘)

  • 심재창;고주영;김현기
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2003.10a / pp.211-214 / 2003
  • The problem of unsolicited e-mail has been growing for years, so many researchers have studied spam filtering and prevention. In this article, we propose a faster spam prevention algorithm based on the userID instead of the full e-mail address. Matching by userID alone, however, produces 2% false negatives; for these cases, we store the corresponding domains in a DB and filter them out. The proposed algorithm requires only a small DB and is 3.7 times faster than the e-mail address comparison algorithm. We implemented this algorithm using SPRSW(Spam Prevention using Replay Secrete Words) to register userIDs automatically in the userID DB.

A Chinese Spam Filter Using Keyword and Text-in-Image Features

  • Chen, Ying-Nong;Wang, Cheng-Tzu;Lo, Chih-Chung;Han, Chin-Chuan;Fana, Kuo-Chin
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2009.01a / pp.32-37 / 2009
  • Electronic mail (e-mail) has recently become the most popular means of communication in our society. In this environment, spam increasingly congests the Internet. In this paper, Chinese spam is detected effectively using text and image features. For the text features, keywords and reference templates in Chinese mails are selected automatically using a genetic algorithm (GA). In addition, spam containing a promotional image is filtered out by detecting the text characters in the image. Experimental results are given to show the effectiveness of the proposed method.

Performance Improvement of Spam Filtering Using User Actions (사용자 행동을 이용한 쓰레기편지 여과의 성능 개선)

  • Kim, Jae-Hoon;Kim, Kang-Min
    • The KIPS Transactions:PartB / v.13B no.2 s.105 / pp.163-170 / 2006
  • With rapidly developing Internet applications, e-mail has come to be regarded as one of the most popular methods for exchanging information. E-mail, however, has a serious problem: users can receive many unwanted e-mails, so-called spam mails, which cause major economic and social problems. To block and filter out spam mails, many researchers and companies have carried out various kinds of research on spam filtering. In general, e-mail users have different criteria for deciding whether an e-mail is spam. Furthermore, in e-mail client systems, users act differently depending on whether a mail is spam. In this paper, we propose a mail filtering system that uses such user actions. The proposed system consists of two steps: an action inference step that infers user actions from an e-mail, and a mail classification step that decides whether the e-mail is spam. Both steps use incremental learning with the IB2 algorithm of TiMBL. To evaluate the proposed system, we collected 12,000 mails from 12 persons. The accuracy ranges from 81% to 93% depending on the person. The proposed system outperforms, by about 14% on average, a system that does not use any information about user actions.