• Title/Summary/Keyword: Malicious URLs

Search Result 21, Processing Time 0.027 seconds

Design and Implementation of Malicious URL Prediction System based on Multiple Machine Learning Algorithms (다중 머신러닝 알고리즘을 이용한 악성 URL 예측 시스템 설계 및 구현)

  • Kang, Hong Koo;Shin, Sam Shin;Kim, Dae Yeob;Park, Soon Tai
    • Journal of Korea Multimedia Society
    • /
    • v.23 no.11
    • /
    • pp.1396-1405
    • /
    • 2020
  • Cyber threats such as forced personal information collection and distribution of malicious codes using malicious URLs continue to occur. In order to cope with such cyber threats, a security technologies that quickly detects malicious URLs and prevents damage are required. In a web environment, malicious URLs have various forms and are created and deleted from time to time, so there is a limit to the response as a method of detecting or filtering by signature matching. Recently, researches on detecting and predicting malicious URLs using machine learning techniques have been actively conducted. Existing studies have proposed various features and machine learning algorithms for predicting malicious URLs, but most of them are only suggesting specialized algorithms by supplementing features and preprocessing, so it is difficult to sufficiently reflect the strengths of various machine learning algorithms. In this paper, a system for predicting malicious URLs using multiple machine learning algorithms was proposed, and an experiment was performed to combine the prediction results of multiple machine learning models to increase the accuracy of predicting malicious URLs. Through experiments, it was proved that the combination of multiple models is useful in improving the prediction performance compared to a single model.

A Discovery System of Malicious Javascript URLs hidden in Web Source Code Files

  • Park, Hweerang;Cho, Sang-Il;Park, Jungkyu;Cho, Youngho
    • Journal of the Korea Society of Computer and Information
    • /
    • v.24 no.5
    • /
    • pp.27-33
    • /
    • 2019
  • One of serious security threats is a botnet-based attack. A botnet in general consists of numerous bots, which are computing devices with networking function, such as personal computers, smartphones, or tiny IoT sensor devices compromised by malicious codes or attackers. Such botnets can launch various serious cyber-attacks like DDoS attacks, propagating mal-wares, and spreading spam e-mails over the network. To establish a botnet, attackers usually inject malicious URLs into web source codes stealthily by using data hiding methods like Javascript obfuscation techniques to avoid being discovered by traditional security systems such as Firewall, IPS(Intrusion Prevention System) or IDS(Intrusion Detection System). Meanwhile, it is non-trivial work in practice for software developers to manually find such malicious URLs which are hidden in numerous web source codes stored in web servers. In this paper, we propose a security defense system to discover such suspicious, malicious URLs hidden in web source codes, and present experiment results that show its discovery performance. In particular, based on our experiment results, our proposed system discovered 100% of URLs hidden by Javascript encoding obfuscation within sample web source files.

Malicious URL Detection by Visual Characteristics with Machine Learning: Roles of HTTPS (시각적 특징과 머신 러닝으로 악성 URL 구분: HTTPS의 역할)

  • Sung-Won HONG;Min-Soo KANG
    • Journal of Korea Artificial Intelligence Association
    • /
    • v.1 no.2
    • /
    • pp.1-6
    • /
    • 2023
  • In this paper, we present a new method for classifying malicious URLs to reduce cases of learning difficulties due to unfamiliar and difficult terms related to information protection. This study plans to extract only visually distinguishable features within the URL structure and compare them through map learning algorithms, and to compare the contribution values of the best map learning algorithm methods to extract features that have the most impact on classifying malicious URLs. As research data, Kaggle used data that classified 7,046 malicious URLs and 7.046 normal URLs. As a result of the study, among the three supervised learning algorithms used (Decision Tree, Support Vector Machine, and Logistic Regression), the Decision Tree algorithm showed the best performance with 83% accuracy, 83.1% F1-score and 83.6% Recall values. It was confirmed that the contribution value of https is the highest among whether to use https, sub domain, and prefix and suffix, which can be visually distinguished through the feature contribution of Decision Tree. Although it has been difficult to learn unfamiliar and difficult terms so far, this study will be able to provide an intuitive judgment method without explanation of the terms and prove its usefulness in the field of malicious URL detection.

Development of Rule-Based Malicious URL Detection Library Considering User Experiences (사용자 경험을 고려한 규칙기반 악성 URL 탐지 라이브러리 개발)

  • Kim, Bo-Min;Han, Ye-Won;Kim, Ga-Young;Kim, Ye-Bun;Kim, Hyung-Jong
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.30 no.3
    • /
    • pp.481-491
    • /
    • 2020
  • The malicious URLs which can be used for sending malicious codes and illegally acquiring private information is one of the biggest threat of information security field. Particularly, recent prevalence of smart-phone increases the possibility of the user's exposing to malicious URLs. Since the way of hiding the URL from the user is getting more sophisticated, it is getting harder to detect it. In this paper, after conducting a survey of the user experiences related to malicious URLs, we are proposing the rule-based malicious URL detection method. In addition, we have developed java library which can be applied to any other applications which need to handle the malicious URL. Each class of the library is implementation of a rule for detecting a characteristics of a malicious URL and the library itself is the set of rule which can have the chain of rule for deteciing more complicated situation and enhancing the accuracy. This kinds of rule based approach can enhance the extensibility considering the diversity of malicious URLs.

URL Filtering by Using Machine Learning

  • Saqib, Malik Najmus
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.8
    • /
    • pp.275-279
    • /
    • 2022
  • The growth of technology nowadays has made many things easy for humans. These things are from everyday small task to more complex tasks. Such growth also comes with the illegal activities that are perform by using technology. These illegal activities can simple as displaying annoying message to big frauds. The easiest way for the attacker to perform such activities is to convenience user to click on the malicious link. It has been a great concern since a decay to classify URLs as malicious or benign. The blacklist has been used initially for that purpose and is it being used nowadays. It is efficient but has a drawback to update blacklist automatically. So, this method is replace by classification of URLs based on machine learning algorithms. In this paper we have use four machine learning classification algorithms to classify URLs as malicious or benign. These algorithms are support vector machine, random forest, n-nearest neighbor, and decision tree. The dataset that is used in this research has 36694 instances. A comparison of precision accuracy and recall values are shown for dataset with and without preprocessing.

Machine Learning-Based Malicious URL Detection Technique (머신러닝 기반 악성 URL 탐지 기법)

  • Han, Chae-rim;Yun, Su-hyun;Han, Myeong-jin;Lee, Il-Gu
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.32 no.3
    • /
    • pp.555-564
    • /
    • 2022
  • Recently, cyberattacks are using hacking techniques utilizing intelligent and advanced malicious codes for non-face-to-face environments such as telecommuting, telemedicine, and automatic industrial facilities, and the damage is increasing. Traditional information protection systems, such as anti-virus, are a method of detecting known malicious URLs based on signature patterns, so unknown malicious URLs cannot be detected. In addition, the conventional static analysis-based malicious URL detection method is vulnerable to dynamic loading and cryptographic attacks. This study proposes a technique for efficiently detecting malicious URLs by dynamically learning malicious URL data. In the proposed detection technique, malicious codes are classified using machine learning-based feature selection algorithms, and the accuracy is improved by removing obfuscation elements after preprocessing using Weighted Euclidean Distance(WED). According to the experimental results, the proposed machine learning-based malicious URL detection technique shows an accuracy of 89.17%, which is improved by 2.82% compared to the conventional method.

An Implementation of System for Detecting and Filtering Malicious URLs (악성 URL 탐지 및 필터링 시스템 구현)

  • Chang, Hye-Young;Kim, Min-Jae;Kim, Dong-Jin;Lee, Jin-Young;Kim, Hong-Kun;Cho, Seong-Je
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.16 no.4
    • /
    • pp.405-414
    • /
    • 2010
  • According to the statistics of SecurityFocus in 2008, client-side attacks through the Microsoft Internet Explorer have increased by more than 50%. In this paper, we have implemented a behavior-based malicious web page detection system and a blacklist-based malicious web page filtering system. To do this, we first efficiently collected the target URLs by constructing a crawling system. The malicious URL detection system, run on a specific server, visits and renders actively the collected web pages under virtual machine environment. To detect whether each web page is malicious or not, the system state changes of the virtual machine are checked after rendering the page. If abnormal state changes are detected, we conclude the rendered web page is malicious, and insert it into the blacklist of malicious web pages. The malicious URL filtering system, run on the web client machine, filters malicious web pages based on the blacklist when a user visits web sites. We have enhanced system performance by automatically handling message boxes at the time of ULR analysis on the detection system. Experimental results show that the game sites contain up to three times more malicious pages than the other sites, and many attacks incur a file creation and a registry key modification.

A Study on Collection and Analysis Method of Malicious URLs Based on Darknet Traffic for Advanced Security Monitoring and Response (효율적인 보안관제 수행을 위한 다크넷 트래픽 기반 악성 URL 수집 및 분석방법 연구)

  • Kim, Kyu-Il;Choi, Sang-So;Park, Hark-Soo;Ko, Sang-Jun;Song, Jung-Suk
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.24 no.6
    • /
    • pp.1185-1195
    • /
    • 2014
  • Domestic and international CERTs are carrying out security monitoring and response services based on security devices for intrusion incident prevention and damage minimization of the organizations. However, the security monitoring and response service has a fatal limitation in that it is unable to detect unknown attacks that are not matched to the predefined signatures. In recent, many approaches have adopted the darknet technique in order to overcome the limitation. Since the darknet means a set of unused IP addresses, no real systems connected to the darknet. Thus, all the incoming traffic to the darknet can be regarded as attack activities. In this paper, we present a collection and analysis method of malicious URLs based on darkent traffic for advanced security monitoring and response service. The proposed method prepared 8,192 darknet space and extracted all of URLs from the darknet traffic, and carried out in-depth analysis for the extracted URLs. The analysis results can contribute to the emergence response of large-scale cyber threats and it is able to improve the performance of the security monitoring and response if we apply the malicious URLs into the security devices, DNS sinkhole service, etc.

MALICIOUS URL RECOGNITION AND DETECTION USING ATTENTION-BASED CNN-LSTM

  • Peng, Yongfang;Tian, Shengwei;Yu, Long;Lv, Yalong;Wang, Ruijin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.11
    • /
    • pp.5580-5593
    • /
    • 2019
  • A malicious Uniform Resource Locator (URL) recognition and detection method based on the combination of Attention mechanism with Convolutional Neural Network and Long Short-Term Memory Network (Attention-Based CNN-LSTM), is proposed. Firstly, the WHOIS check method is used to extract and filter features, including the URL texture information, the URL string statistical information of attributes and the WHOIS information, and the features are subsequently encoded and pre-processed followed by inputting them to the constructed Convolutional Neural Network (CNN) convolution layer to extract local features. Secondly, in accordance with the weights from the Attention mechanism, the generated local features are input into the Long-Short Term Memory (LSTM) model, and subsequently pooled to calculate the global features of the URLs. Finally, the URLs are detected and classified by the SoftMax function using global features. The results demonstrate that compared with the existing methods, the Attention-based CNN-LSTM mechanism has higher accuracy for malicious URL detection.

JsSandbox: A Framework for Analyzing the Behavior of Malicious JavaScript Code using Internal Function Hooking

  • Kim, Hyoung-Chun;Choi, Young-Han;Lee, Dong-Hoon
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.6 no.2
    • /
    • pp.766-783
    • /
    • 2012
  • Recently, many malicious users have attacked web browsers using JavaScript code that can execute dynamic actions within the browsers. By forcing the browser to execute malicious JavaScript code, the attackers can steal personal information stored in the system, allow malware program downloads in the client's system, and so on. In order to reduce damage, malicious web pages must be located prior to general users accessing the infected pages. In this paper, a novel framework (JsSandbox) that can monitor and analyze the behavior of malicious JavaScript code using internal function hooking (IFH) is proposed. IFH is defined as the hooking of all functions in the modules using the debug information and extracting the parameter values. The use of IFH enables the monitoring of functions that API hooking cannot. JsSandbox was implemented based on a debugger engine, and some features were applied to detect and analyze malicious JavaScript code: detection of obfuscation, deobfuscation of the obfuscated string, detection of URLs related to redirection, and detection of exploit codes. Then, the proposed framework was analyzed for specific features, and the results demonstrate that JsSandbox can be applied to the analysis of the behavior of malicious web pages.