Proceedings of the Korea Society of Information Technology Applications Conference
- Korea Society of Information Technology Applications, 2005: 6th 2005 International Conference on Computers, Communications and System
- Pages 21-25
- 2005
Analyzing the correlation of Spam Recall and Thesaurus
- Kang, Sin-Jae (School of Computer and Information Technology, Daegu University) ;
- Kim, Jong-Wan (School of Computer and Information Technology, Daegu University)
- Published: 2005.11.25
Abstract
In this paper, we constructed a two-phase spam-mail filtering system based on lexical and conceptual information. Two kinds of information can distinguish spam mail from legitimate mail: definite information, namely the mail sender's information, URLs, and a certain spam list; and less definite information, namely the word list and concept codes extracted from the mail body. We first classified spam mail using the definite information, and then used the less definite information. We used the lexical information and concept codes contained in the email body for SVM learning in the