• Title/Abstract/Keywords: SPAM

284 search results (processing time: 0.028 s)

The Adaptive SPAM Mail Detection System using Clustering based on Text Mining

  • Hong, Sung-Sam;Kong, Jong-Hwan;Han, Myung-Mook
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 8, No. 6
    • /
    • pp.2186-2196
    • /
    • 2014
  • Spam mail is one of the most common mail dysfunctions and may cause psychological damage to internet users. As internet usage increases, the amount of spam mail has also gradually increased. Indiscriminate sending, in particular, occurs when spam mail is sent using smartphones or tablets connected to wireless networks. Spam mail accounts for approximately 68% of mail traffic; however, the true percentage is believed to be considerably higher. To analyze and detect spam mail, we introduce a technique based on spam mail characteristics and text mining; in particular, spam mail is detected through linguistic analysis and language processing. Existing spam mail is analyzed, and hidden spam signatures are extracted using text clustering. Our proposed method uses a text mining system to improve the detection and error detection rates for existing spam mail and to respond to new spam mail types.

Spam Image Detection Model based on Deep Learning for Improving Spam Filter

  • Seong-Guk Nam;Dong-Gun Lee;Yeong-Seok Seo
    • Journal of Information Processing Systems
    • /
    • Vol. 19, No. 3
    • /
    • pp.289-301
    • /
    • 2023
  • Due to the development and dissemination of modern technology, anyone can easily communicate using services such as social network services (SNS) through a personal computer (PC) or smartphone. The development of these technologies has brought many benefits. At the same time, it has also had harmful effects, one of which is the spam problem. Spam refers to unwanted or rejected information received by unspecified users. Continuous exposure to such information inconveniences users of the service, and if filtering is not performed correctly, the quality of service deteriorates. Recently, spammers have been creating more malicious spam by distorting images of spam text so that optical character recognition (OCR)-based spam filters cannot easily detect it. Fortunately, the level of transformation of image spam circulated on social media is not yet serious. However, in mail systems, spammers (people who send spam) have applied various modifications to spam images to neutralize OCR, and the same situation can therefore arise with spam images on social media. Spammers have been shown to interfere with OCR reading through geometric transformations such as image distortion, noise addition, and blurring. Various techniques have been studied to filter image spam, but at the same time, methods of interfering with image spam identification using obfuscated images are also continuously developing. In this paper, we propose a deep learning-based spam image detection model to improve the existing OCR-based spam image detection performance and compensate for its vulnerabilities. The proposed model extracts text features and image features from the image using four sub-models. First, the OCR-based text model extracts the text-related features, whether the image contains spam words, and the word embedding vector from the input image. Then, the convolutional neural network-based image model extracts image obfuscation and image feature vectors from the input image. The final spam image classifier then determines from the extracted features whether the input is a spam image. In evaluation, the F1-score of the proposed model was about 14 points higher than that of OCR-based spam image detection.

A Study on Improving Spam Management Index

  • 유진호;임종인
    • 정보보호학회논문지
    • /
    • Vol. 19, No. 3
    • /
    • pp.133-142
    • /
    • 2009
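  • Although the measured volume of spam received in Korea is actually decreasing, some users complain that spam is increasing, and others argue that survey results on spam volume differ from the amount they personally perceive. This paper analyzes the causes of these claims, examines the main issues related to spam mail management indices, and proposes a spam mail management index model that can complement the existing quantitative measure of spam received. In particular, we measure the stress that users perceive from spam and use it as a qualitative index alongside the quantitative received-volume index. We also apply the proposed model to actual conditions in Korea, analyze the meaning of the results, and consider their use in anti-spam policy.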

Improved Spam Filter via Handling of Text Embedded Image E-mail

  • Youn, Seongwook;Cho, Hyun-Chong
    • Journal of Electrical Engineering and Technology
    • /
    • Vol. 10, No. 1
    • /
    • pp.401-407
    • /
    • 2015
  • The increase of image spam, a kind of spam in which the text message is embedded into an attached image to defeat spam filtering techniques, is a major problem of the current e-mail system. For nearly a decade, content-based filtering using text classification or machine learning has been a major trend in anti-spam filtering systems. Recently, spammers have tried to defeat anti-spam filters with many techniques; embedding text into an attached image is one of them. We previously proposed an ontology spam filter. However, that system handles only text e-mail, and the percentage of attached images is increasing sharply. The contribution of this paper is that we add image e-mail handling capability to the anti-spam filtering system while keeping the advantages of the previous text-based spam e-mail filtering system. Also, the proposed system gives a low false negative value, which means that a user's valuable e-mail is rarely regarded as spam e-mail.

Analysis of Anti-SPAM Regulations in Korean IT Law

  • 김성준;김범수
    • 한국IT서비스학회지
    • /
    • Vol. 10, No. 1
    • /
    • pp.21-34
    • /
    • 2011
  • Spam refers to any unwanted or unauthorized commercial messages. Spam may violate individuals' privacy or other personal rights. Spam often overloads network traffic, wastes individuals' time, lowers productivity and quality of life, and limits the trustworthiness of Internet businesses. As the use of mobile messaging services and social networking services increases, both on mobile communication networks and on the Internet, newer and more complex types of IT applications and services are often used as new means of spam. In this research, the characteristics and impact of new and future forms of spam, and anti-spam related policies and regulations, are surveyed. To improve the effectiveness of anti-spam policies and regulations in Korea, we suggest adding a definition of spam to the law, changing policies to focus on the 'type of services' rather than on the medium of transmission, and redefining the scope of 'commercial purposes' in Korean law.

A Technique of Statistical Message Filtering for Blocking Spam Message

  • 김성윤;차태수;박제원;최재현;이남용
    • 한국IT서비스학회지
    • /
    • Vol. 13, No. 3
    • /
    • pp.299-308
    • /
    • 2014
  • Spam messages received indiscriminately in the information society harm not only individuals but also the community. Many spam filtering techniques, such as blocking specific characters, are being actively studied, and most of these studies are content-based spam filtering technologies that use machine learning. As spam message transmission techniques develop, spammers send spam messages using term spamming techniques. Spam messages tend to include a number of nouns, repeat words, and insert special characters between the words of a sentence. In this paper, we parameterize these three features using the SPSS statistical program and derive an equation. Based on this equation, we then measure the performance of spam message classification. Compared with previous studies, the proposed technique was confirmed to perform well in terms of FP-rate, further reducing the associated cost.
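
The features named in the abstract above can be illustrated with a minimal sketch. It computes only two of the three features (repeated words and special characters); noun counting is left out because it needs a part-of-speech tagger. The function name `message_features`, the ratio definitions, and the notion of "special character" used here are assumptions made for illustration, and the equation the authors derived with SPSS is not reproduced.

```python
import re
from collections import Counter

WORD_PATTERN = re.compile(r"\w+")
# Assumption: a "special character" is anything that is neither a word
# character nor whitespace. The paper may define it differently.
SPECIAL_PATTERN = re.compile(r"[^\w\s]")


def message_features(text):
    """Return two illustrative features of a message as ratios in [0, 1]."""
    words = [w.lower() for w in WORD_PATTERN.findall(text)]
    counts = Counter(words)
    # Number of word occurrences that belong to a word used more than once.
    repeated = sum(n for n in counts.values() if n > 1)
    non_space = sum(1 for ch in text if not ch.isspace())
    special = len(SPECIAL_PATTERN.findall(text))
    return {
        "repeated_word_ratio": repeated / len(words) if words else 0.0,
        "special_char_ratio": special / non_space if non_space else 0.0,
    }
```

A classifier would then combine such features with fitted weights; those weights would have to come from labeled data, as the abstract describes.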

A spam mail blocking method using URL frequency analysis

  • 백기영;이철수;류재철
    • 정보보호학회논문지
    • /
    • Vol. 14, No. 6
    • /
    • pp.135-148
    • /
    • 2004
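  • Recent spam mail changes in diverse ways and is difficult to block with existing word-based spam identification methods. To solve this problem, we propose a method for generating spam identification rules using URL frequency analysis. The proposed method consists of collecting spam mail, extracting characteristic URLs from the collected spam, normalizing them, generating spam identification rules according to their frequency over time, and blocking spam mail. The method can respond to a wide variety of spam mail and has a structure that can also adapt to changing forms of spam.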
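
The pipeline described in the abstract above (collect spam, extract URLs, normalize them, derive rules from their frequency) can be sketched roughly as follows. This is only an illustration under stated assumptions: the URL pattern, the normalization to a lowercase host name, the `min_count` threshold, and all function names are hypothetical, and the sketch counts frequency within a single collection window rather than modeling the time-based frequency the paper uses.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumption: match http(s) URLs up to the next whitespace, quote, or bracket.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def normalize_url(url):
    """Reduce a URL to its lowercase host name, dropping a leading 'www.'."""
    try:
        host = urlsplit(url).hostname or ""
    except ValueError:  # malformed URL, e.g. an unbalanced IPv6 bracket
        return ""
    return host[4:] if host.startswith("www.") else host


def extract_hosts(message):
    """Return the set of normalized hosts referenced in one message."""
    hosts = (normalize_url(u) for u in URL_PATTERN.findall(message))
    return {h for h in hosts if h}


def build_rules(spam_messages, min_count=2):
    """Keep hosts that appear in at least `min_count` collected spam messages."""
    counts = Counter()
    for message in spam_messages:
        counts.update(extract_hosts(message))
    return {host for host, n in counts.items() if n >= min_count}


def is_spam(message, rules):
    """Flag a message that references any host in the rule set."""
    return not rules.isdisjoint(extract_hosts(message))
```

Counting each host once per message keeps a single message that repeats the same link many times from dominating the rule set.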

Detection of Zombie PCs Based on Email Spam Analysis

  • Jeong, Hyun-Cheol;Kim, Huy-Kang;Lee, Sang-Jin;Kim, Eun-Jin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 6, No. 5
    • /
    • pp.1445-1462
    • /
    • 2012
  • While botnets are used for various malicious activities, it is well known that they are widely used for email spam. Though the spam filtering systems currently in use block IPs that send email spam, simply blocking the IPs of zombie PCs participating in a botnet is not enough to prevent the spamming activities of the botnet because these IPs can easily be changed or manipulated. This IP blocking is also insufficient to prevent crimes other than spamming, as the botnet can be simultaneously used for multiple purposes. For this reason, we propose a system that detects botnets and zombie PCs based on email spam analysis. This study introduces the concept of "group pollution level" - the degree to which a certain spam group is suspected of being a botnet - and "IP pollution level" - the degree to which a certain IP in the spam group is suspected of being a zombie PC. These concepts are applied in our system, which detects botnets and zombie PCs by grouping spam mails based on the URL links or attachments they contain, and by assessing the pollution level of each group and each IP address. For empirical testing, we used email spam data collected in an "email spam trap system" - Korea's national spam collection system. Our proposed system detected 203 botnets and 18,283 zombie PCs in a day, and these zombie PCs sent about 70% of all the spam messages in our analysis. This shows the effectiveness of detecting zombie PCs by email spam analysis, and the possibility of a dramatic reduction in email spam by taking countermeasures against these botnets and zombie PCs.

Analyzing the correlation of Spam Recall and Thesaurus

  • Kang, Sin-Jae;Kim, Jong-Wan
    • 한국정보기술응용학회: Conference Proceedings
    • /
    • 한국정보기술응용학회 2005, 6th 2005 International Conference on Computers, Communications and System
    • /
    • pp.21-25
    • /
    • 2005
  • In this paper, we constructed a two-phase spam mail filtering system based on lexical and conceptual information. Two kinds of information can distinguish spam mail from legitimate mail. The definite information is the mail sender's information, URLs, and a certain spam list, and the less definite information is the word list and concept codes extracted from the mail body. We first classified spam mail using the definite information, and then used the less definite information. We used the lexical information and concept codes contained in the email body for SVM learning in the second phase. According to our results, spam precision increased when more lexical information was used as features, and spam recall increased when the concept codes were included in the features as well.
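
The two-phase control flow described in the abstract above can be sketched as follows. This is a simplified illustration, not the authors' system: the second phase in the paper uses an SVM trained on lexical information and concept codes, whereas this sketch substitutes a plain keyword-count classifier as a stand-in. All names (`two_phase_filter`, `keyword_classifier`, the parameters, and the threshold) are hypothetical.

```python
def keyword_classifier(spam_words, threshold=2):
    """Stand-in for the paper's SVM: flag bodies with enough spam words."""
    def classify(body):
        tokens = body.lower().split()
        return sum(token in spam_words for token in tokens) >= threshold
    return classify


def two_phase_filter(sender, urls, body, blocked_senders, blocked_urls,
                     classify_body):
    """Classify one message as 'spam' or 'legitimate' in two phases."""
    # Phase 1: definite information (sender and URL block lists).
    if sender in blocked_senders or any(u in blocked_urls for u in urls):
        return "spam"
    # Phase 2: less definite information taken from the mail body.
    return "spam" if classify_body(body) else "legitimate"
```

Passing the body classifier in as a callable keeps the first phase independent of whatever model is used in the second.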


An Architecture for Certificate and Agent Based E-mailing to Block Spam Mail

  • Nam, Sang-Zo
    • 지능정보연구
    • /
    • Vol. 9, No. 2
    • /
    • pp.39-50
    • /
    • 2003
  • Deleting unsolicited email, popularly known as spam mail, is an annoying task for Internet users. Moreover, spam mail causes a variety of social problems. At present, legal restrictions cannot eradicate spam senders. As a result, many technical methods to eliminate spam mail such as spam filtering and online stamps have been introduced. However, the process of blocking spam mail can inadvertently result in suspension of indispensable or beneficial communication. In this paper, we propose a certificate and agent based emailing architecture that can block spam mail, while at the same time approve certified mail. This architecture can be accelerated by synergistic utilization of digital signature and electronic document interchange.
