• Title/Summary/Keyword: Spam URL

Search Result 14

A spam mail blocking method using URL frequency analysis (URL 빈도분석을 이용한 스팸메일 차단 방법)

  • Baek Ki-young;Lee Chul-soo;Ryou Jae-cheol
    • Journal of the Korea Institute of Information Security & Cryptology / v.14 no.6 / pp.135-148 / 2004
  • It has recently become difficult to block spam mail, which changes in many ways, using earlier word-based spam identification methods. To solve this problem, this paper proposes a method of generating spam identification rules using URL frequency analysis. The method consists of collecting spam, extracting characteristic URLs from the collected spam mail, normalizing the URLs, generating spam identification rules from URL frequency over time, and blocking mail. It can effectively block various types of spam mail, including forms that keep changing.
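
The pipeline described in this abstract (extract URLs, normalize them, count frequency within a time window, derive blocking rules) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the normalization (lowercase host with a leading `www.` removed), the `min_count` threshold, and all function names are assumptions made for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Simple pattern for http(s) URLs in a mail body (an assumption, not from the paper).
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def normalize_url(url):
    """Reduce a URL to its lowercase host without a leading 'www.' (assumed rule)."""
    try:
        host = (urlsplit(url).hostname or "").rstrip(".")
    except ValueError:
        return ""
    return host[4:] if host.startswith("www.") else host


def build_rules(spam_mails, window_start, window_end, min_count=2):
    """Return hosts that occur in at least min_count spam mails inside the window.

    spam_mails is an iterable of (received_at, body) pairs; received_at can be
    any comparable timestamp.
    """
    counts = Counter()
    for received_at, body in spam_mails:
        if window_start <= received_at < window_end:
            hosts = {normalize_url(u) for u in URL_RE.findall(body)}
            hosts.discard("")
            counts.update(hosts)  # each mail counts a host at most once
    return {host for host, n in counts.items() if n >= min_count}


def is_spam(body, rules):
    """Block a mail if any of its normalized URL hosts matches a rule."""
    return any(normalize_url(u) in rules for u in URL_RE.findall(body))
```

Rules would be regenerated per time window, so hosts that stop appearing in collected spam drop out of the rule set.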

A Spam Filter System Based on Maximum Entropy Model Using Co-training with Spamminess Features and URL Features (스팸성 자질과 URL 자질의 공동 학습을 이용한 최대 엔트로피 기반 스팸메일 필터 시스템)

  • Gong, Mi-Gyoung;Lee, Kyung-Soon
    • The KIPS Transactions:PartB / v.15B no.1 / pp.61-68 / 2008
  • This paper presents a spam filter system based on the maximum entropy model that uses co-training with spamminess features and URL features. Spamminess features are the emphasis patterns or abnormal patterns that spammers use in spam messages to express their intention and to avoid being filtered by spam filter systems. Because spammers use URLs to provide details, and alter URL formats so that they are not caught by blacklists, normal and abnormal URLs can be key features for detecting spam messages. Co-training with spamminess features and URL features uses two feature sets that are independent of each other in training, so the filter system can learn from each independently. Experimental results on the TREC spam test collection show that the proposed approach improves accuracy by 9.1% and 6.9% compared to the base system and the bogo filter system, respectively. Analysis of the results shows that the proposed spamminess features and URL features are helpful. An experiment on co-training further shows that the two feature sets are useful, since the number of training documents is reduced while accuracy remains close to that of batch learning.

Studying on Expansion of Realtime Blocking List Conception for Spam E-mail Filtering (스팸 메일 차단을 위한 RBL개념의 확장에 관한 연구)

  • Kim, Jong-Min;Kim, Hion-Gun;Kim, Bong-Gi
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.10 / pp.1808-1814 / 2008
  • This paper proposes extending the RBL (Realtime Blocking List) function commonly applied to spam e-mail filtering, as an effective way to deal with recently widespread spam types: URLs contained in the original e-mail are extracted and applied to the RBL. Botnets, which are now commonly used to send spam, pose a problem that cannot be solved using the sender addresses of spam e-mails, because those addresses are widely distributed. In general, such spam is sent from the infected zombie PCs of individual users, so the sending address itself is ineffective and meaningless to use in an RBL. As an effective way to filter spam e-mail sent by botnets, this paper analyzes the URLs contained in the original spam e-mail and proposes a way to improve the filtering rate based on distribution data for the URL sites that lure users. The paper describes the mechanism by which botnets send spam e-mail and methods for identifying such spam. To gather spam e-mails for analysis, it also sets up a spam trap system as an experiment. Analysis of the spam e-mails received during the experiment period shows that the expanded RBL method, using URLs contained in spam e-mails, is an effective way to improve spam e-mail filtering.
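
The idea of extending an RBL from sender addresses to URLs found in the mail body might look like the following Python sketch. The in-memory blocklists, the parent-domain matching, and the names `url_hosts`, `listed`, and `filter_mail` are all hypothetical; the paper's actual lookup mechanism is not given in the abstract.

```python
import re
from urllib.parse import urlsplit

# Simple pattern for http(s) URLs in a mail body (an assumption).
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def url_hosts(body):
    """Collect the lowercase hosts of all URLs found in a mail body."""
    hosts = set()
    for url in URL_RE.findall(body):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # skip malformed URLs
        if host:
            hosts.add(host.rstrip("."))
    return hosts


def listed(host, blocklist):
    """True if the host or any of its parent domains is on the blocklist."""
    labels = host.split(".")
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


def filter_mail(sender_ip, body, ip_blocklist, url_blocklist):
    """Classic RBL check on the sender IP, then the URL-based extension."""
    if sender_ip in ip_blocklist:
        return "blocked: sender IP"
    for host in sorted(url_hosts(body)):
        if listed(host, url_blocklist):
            return "blocked: URL " + host
    return "accepted"
```

A mail relayed through a previously unseen zombie PC passes the IP check but is still caught when its body links to a listed site, which is the case the abstract targets.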

A Method to Block Spam Mail Automatically Through the Connection to Link URL (링크 유알엘 접속을 통한 스팸메일 자동 차단 방법에 관한 연구)

  • Jung, Nam-Cheol
    • Journal of Digital Contents Society / v.8 no.4 / pp.451-458 / 2007
  • In this paper, I developed a method whereby spam mail is automatically blocked by connecting to the URLs linked in it. The blocking system works as follows. First, the system extracts the URLs linked in an electronic mail delivered from any server on the internet. Next, the system connects to the web pages at these URLs. Finally, the system blocks the electronic mail if those web pages contain any keyword defined as a clue to spam mail.
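
The three steps above (extract the linked URLs, connect to the pages, block on a keyword match) can be sketched in Python as follows. The keyword list, the size limit on downloaded pages, and the function names are assumptions for illustration; the fetch function is passed in as a parameter so the logic can be exercised without network access.

```python
import re
from urllib.request import urlopen

# Simple pattern for http(s) URLs in a mail body (an assumption).
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)

# Hypothetical keywords; the abstract does not list the ones actually used.
SPAM_KEYWORDS = ("casino", "viagra")


def fetch_page(url, timeout=5):
    """Download up to 200 kB of a page as text; return '' on common network errors."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.read(200_000).decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return ""


def should_block(mail_body, fetch=fetch_page, keywords=SPAM_KEYWORDS):
    """Block the mail if any linked page contains one of the spam keywords."""
    for url in URL_RE.findall(mail_body):
        page = fetch(url).lower()
        if any(keyword in page for keyword in keywords):
            return True
    return False
```

Calling `should_block(body)` uses the real fetcher; passing `fetch=` a stub that returns canned page text allows offline testing.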

The Suggestion of a New Control Method for SPAM Mail Prevention Solution (스팸 메일 차단솔루션의 새로운 제어 방식 제안)

  • 김민홍;두창호
    • Journal of the Korea Computer Industry Society / v.5 no.4 / pp.453-460 / 2004
  • SPAM mail has become a serious social problem all over the world, and products for SPAM prevention are coming to market. This study classifies the existing SPAM prevention solutions according to the patterns they set up and their methods of judging SPAM, and analyzes their merits and demerits. It also identifies problems with the existing SPAM prevention solutions and suggests a new URL Prefetch method, a filtering method that has not been in use. Finally, it describes the synergistic effects of this new method on SPAM prevention and suggests a SPAM prevention solution based on an HTML Pattern method.

Spam-mail Filtering based on Lexical Information and Thesaurus (어휘정보와 시소러스에 기반한 스팸메일 필터링)

  • Kang Shin-Jae;Kim Jong-Wan
    • Journal of Korea Society of Industrial Information Systems / v.11 no.1 / pp.13-20 / 2006
  • In this paper, we constructed a spam-mail filtering system based on lexical and conceptual information. Two kinds of information can distinguish spam mail from legitimate mail. The definite information consists of the mail sender's information, URLs, and a spam keyword list; the less definite information consists of the word lists and concept codes extracted from the mail body. We first classified spam mail using the definite information, and then used the less definite information. We used the lexical information and concept codes contained in the email body for SVM learning. According to our results, spam precision increased when more lexical information was used as features, and spam recall increased when the concept codes were included in the features as well.
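
The order of the two checks described above can be sketched in Python. This is only an outline under stated assumptions: the mail is a plain dict with `sender` and `body` keys, the three lists are simple sets of strings, and `phase2_classifier` is a hypothetical callable standing in for the SVM trained on lexical information and concept codes, which is not reproduced here.

```python
def definite_check(mail, sender_blacklist, url_blacklist, spam_keywords):
    """Phase 1: return True when definite information marks the mail as spam."""
    if mail["sender"] in sender_blacklist:
        return True
    text = mail["body"].lower()
    if any(url in text for url in url_blacklist):
        return True
    return any(keyword in text for keyword in spam_keywords)


def two_phase_filter(mail, phase2_classifier, **lists):
    """Apply the definite check first; fall back to the learned classifier.

    phase2_classifier takes the mail body and returns True for spam. It is a
    placeholder for the SVM over word lists and concept codes.
    """
    if definite_check(mail, **lists):
        return "spam"
    return "spam" if phase2_classifier(mail["body"]) else "ham"
```

Only mail that passes the definite check reaches the classifier, so the more expensive second phase runs on a smaller set of messages.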

Analyzing the correlation of Spam Recall and Thesaurus

  • Kang, Sin-Jae;Kim, Jong-Wan
    • Proceedings of the Korea Society of Information Technology Applications Conference / 2005.11a / pp.21-25 / 2005
  • In this paper, we constructed a two-phase spam-mail filtering system based on lexical and conceptual information. Two kinds of information can distinguish spam mail from legitimate mail. The definite information consists of the mail sender's information, URLs, and a spam list; the less definite information consists of the word list and concept codes extracted from the mail body. We first classified spam mail using the definite information, and then used the less definite information. We used the lexical information and concept codes contained in the email body for SVM learning in the second phase. According to our results, spam precision increased when more lexical information was used as features, and spam recall increased when the concept codes were included in the features as well.

Analyzing the Effect of Lexical and Conceptual Information in Spam-mail Filtering System

  • Kang Sin-Jae;Kim Jong-Wan
    • International Journal of Fuzzy Logic and Intelligent Systems / v.6 no.2 / pp.105-109 / 2006
  • In this paper, we constructed a two-phase spam-mail filtering system based on lexical and conceptual information. Two kinds of information can distinguish spam mail from ham (non-spam) mail. The definite information consists of the mail sender's information, URLs, and a spam keyword list; the less definite information consists of the word list and concept codes extracted from the mail body. We first classified spam mail using the definite information, and then used the less definite information. We used the lexical information and concept codes contained in the email body for SVM learning in the second phase. According to our results, the ham misclassification rate was reduced when more lexical information was used as features, and the spam misclassification rate was reduced when the concept codes were included in the features as well.

A Spam Filter System based on Maximum Entropy Model Using Spamness Features and URL Features (스팸성 자질과 URL 자질을 이용한 최대엔트로피모델 기반 스팸메일 필터 시스템)

  • Gong, Mi-Gyoung;Lee, Kyung-Soon
    • Annual Conference on Human and Language Technology / 2006.10e / pp.213-219 / 2006
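  • This paper proposes a spam filter system based on the maximum entropy model that uses the spamminess features and URL features appearing in spam mail. Spamminess features are the emphasis patterns that spammers deliberately insert into spam mail, or words that have been abnormally altered to get past filter systems. In addition to spamminess features, repeatedly occurring URLs and abnormal links are also used as features. This is because, among the URLs that are hyperlinked or typed directly into a mail to give the recipient additional information, invalid and abnormal URLs intended to evade filter systems can help filter out spam mail. We also combined two classifiers, one applying spamminess features and the other applying URL features. Combining classifiers has the advantage that the features used in each classifier can be used independently. The experimental results confirm that using spamminess features and URLs improves the performance of the spam filter system.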

Detecting Method for URL Redirection Spam (URL 리다이렉션 스팸 탐지 기법)

  • Baek, Jee-Hyun;Kim, Sung-Kwon
    • Proceedings of the Korean Information Science Society Conference / 2007.10d / pp.540-544 / 2007
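  • The rapid growth of the Internet has greatly changed the way people acquire information. Internet users can now easily access far more knowledge than was possible in the past. However, this has also given rise to various problems, and web spam is one of them. Web spam can be understood as a collective term for activities that seek to profit through illegitimate activity on the web. Most web spam is used to raise rankings in search engine result lists, but it is increasingly appearing in forms unrelated to search engine rankings. Web spam comes in many varieties, and no reliable method of preventing all web spam has yet been presented. This paper deals with URL redirection, which belongs to page-hiding spam among the various kinds of web spam. As with other web spam, no method of automatically detecting URL redirection has been presented so far. This paper proposes a technique for detecting URL redirection using rankings in search engine result lists.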
