• Title/Summary/Keyword: 스팸 메일

Search Result 135, Processing Time 0.017 seconds

A study on the Filtering of Spam E-mail using n-Gram indexing and Support Vector Machine (n-Gram 색인화와 Support Vector Machine을 사용한 스팸메일 필터링에 대한 연구)

  • 서정우;손태식;서정택;문종섭
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.14 no.2
    • /
    • pp.23-33
    • /
    • 2004
  • Because of a rapid growth of internet environment, it is also fast increasing to exchange message using e-mail. But, despite the convenience of e-mail, it is rising a currently bi9 issue to waste their time and cost due to the spam mail in an individual or enterprise. Many kinds of solutions have been studied to solve harmful effects of spam mail. Such typical methods are as follows; pattern matching using the keyword with representative method and method using the probability like Naive Bayesian. In this paper, we propose a classification method of spam mails from normal mails using Support Vector Machine, which has excellent performance in pattern classification problems, to compensate for the problems of existing research. Especially, the proposed method practices efficiently a teaming procedure with a word dictionary including a generated index by the n-Gram. In the conclusion, we verified the proposed method through the accuracy comparison of spm mail separation between an existing research and proposed scheme.

A New Bot Disinfection Method Based on DNS Sinkhole (DNS 싱크홀에 기반한 새로운 악성봇 치료 기법)

  • Kim, Young-Baek;Youm, Heung-Youl
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.18 no.6A
    • /
    • pp.107-114
    • /
    • 2008
  • The Bot is a kind of worm/virus that can be used to launch the distributed denial-of-service(DDoS) attacks or send massive amount of spam e-mails, etc. A lot of organizations make an effort to counter the Botnet's attacks. In Korea, we use DNS sinkhole system to protect from the Botnet's attack, while in Japan "so called" CCC(Cyber Clean Center) has been developed to protect from the Botnet's attacks. But in case of DNS sinkhole system, there is a problem since it cannot cure the Bot infected PCs themselves and in case of CCC there is a problem since only 30% of users with the Botnet-infected PCs can cooperate to cure themself. In this paper we propose a new method that prevent the Botnet's attacks and cure the Bot-infected PCs at the same time.

On the Security of Image-based CAPTCHA using Multi-image Composition (복수의 이미지를 합성하여 사용하는 캡차의 안전성 검증)

  • Byun, Je-Sung;Kang, Jeon-Il;Nyang, Dae-Hun;Lee, Kyung-Hee
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.22 no.4
    • /
    • pp.761-770
    • /
    • 2012
  • CAPTCHAs(Completely Automated Public Turing tests to tell Computer and Human Apart) have been widely used for preventing the automated attacks such as spam mails, DDoS attacks, etc.. In the early stages, the text-based CAPTCHAs that were made by distorting random characters were mainly used for frustrating automated-bots. Many researches, however, showed that the text-based CAPTCHAs were breakable via AI or image processing techniques. Due to the reason, the image-based CAPTCHAs, which employ images instead of texts, have been considered and suggested. In many image-based CAPTCHAs, however, the huge number of source images are required to guarantee a fair level of security. In 2008, Kang et al. suggested a new image-based CAPTCHA that uses test images made by composing multiple source images, to reduce the number of source images while it guarantees the security level. In their paper, the authors showed the convenience of their CAPTCHA in use through the use study, but they did not verify its security level. In this paper, we verify the security of the image-based CAPTCHA suggested by Kang et al. by performing several attacks in various scenarios and consider other possible attacks that can happen in the real world.

Breaking character-based CAPTCHA using color information (색상 정보를 이용한 문자 기반 CAPTCHA의 무력화)

  • Kim, Sung-Ho;Nyang, Dae-Hun;Lee, Kyung-Hee
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.19 no.6
    • /
    • pp.105-112
    • /
    • 2009
  • Nowadays, completely automated public turing tests to tell computers and humans apart(CAPTCHAs) are widely used to prevent various attacks by automated software agents such as creating accounts, advertising, sending spam mails, and so on. In early CAPTCHAs, the characters were simply distorted, so that users could easily recognize the characters. From that reason, using various techniques such as image processing, artificial intelligence, etc., one could easily break many CAPTCHAs, either. As an alternative, By adding noise to CAPTCHAs and distorting the characters in CAPTCHAs, it made the attacks to CAPTCHA more difficult. Naturally, it also made users more difficult to read the characters in CAPTCHAs. To improve the readability of CAPTCHAs, some CAPTCHAs used different colors for the characters. However, the usage of the different colors gives advantages to the adversary who wants to break CAPTCHAs. In this paper, we suggest a method of increasing the recognition ratio of CAPTCHAs based on colors.

A Text Mining-based Intrusion Log Recommendation in Digital Forensics (디지털 포렌식에서 텍스트 마이닝 기반 침입 흔적 로그 추천)

  • Ko, Sujeong
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.2 no.6
    • /
    • pp.279-290
    • /
    • 2013
  • In digital forensics log files have been stored as a form of large data for the purpose of tracing users' past behaviors. It is difficult for investigators to manually analysis the large log data without clues. In this paper, we propose a text mining technique for extracting intrusion logs from a large log set to recommend reliable evidences to investigators. In the training stage, the proposed method extracts intrusion association words from a training log set by using Apriori algorithm after preprocessing and the probability of intrusion for association words are computed by combining support and confidence. Robinson's method of computing confidences for filtering spam mails is applied to extracting intrusion logs in the proposed method. As the results, the association word knowledge base is constructed by including the weights of the probability of intrusion for association words to improve the accuracy. In the test stage, the probability of intrusion logs and the probability of normal logs in a test log set are computed by Fisher's inverse chi-square classification algorithm based on the association word knowledge base respectively and intrusion logs are extracted from combining the results. Then, the intrusion logs are recommended to investigators. The proposed method uses a training method of clearly analyzing the meaning of data from an unstructured large log data. As the results, it complements the problem of reduction in accuracy caused by data ambiguity. In addition, the proposed method recommends intrusion logs by using Fisher's inverse chi-square classification algorithm. So, it reduces the rate of false positive(FP) and decreases in laborious effort to extract evidences manually.