• Title/Summary/Keyword: Malicious URL Detection

DL-ML Fusion Hybrid Model for Malicious Web Site URL Detection Based on URL Lexical Features (악성 URL 탐지를 위한 URL Lexical Feature 기반의 DL-ML Fusion Hybrid 모델)

  • Dae-yeob Kim
    • Journal of the Korea Institute of Information Security & Cryptology / v.33 no.6 / pp.881-891 / 2023
  • Recently, various studies on malicious URL detection using artificial intelligence have been conducted, and most of them have shown strong detection performance. However, classical machine learning not only requires a feature analysis process, but the detection performance of the trained model also depends on the data analyst's ability. In this paper, we propose a DL-ML Fusion Hybrid Model for malicious web site URL detection based on URL lexical features. The proposed model combines the automatic feature extraction layer of deep learning with classical machine learning to address the feature engineering issue. For the experiment, 60,000 malicious and normal URLs were collected, and the results showed a performance improvement of up to 23.98%p. In addition, automating feature engineering made it possible to train the model efficiently.
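
To make "URL lexical features" concrete, the following is a minimal sketch of hand-crafted lexical features computed from the URL string alone. The function name `lexical_features` and the particular feature set are illustrative assumptions, not the features used in the paper, whose model is described as learning features automatically.

```python
from urllib.parse import urlsplit


def lexical_features(url: str) -> dict:
    """Compute a few string-level features of a URL (illustrative set only)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "path_length": len(parts.path),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        # True when the host consists only of digits and dots, e.g. an IPv4 literal
        "has_numeric_host": host.replace(".", "").isdigit(),
    }


# Hypothetical example URL, not taken from any dataset
features = lexical_features("http://192.168.0.1/login.php?id=1")
```

Features like these are what a classical pipeline would hand to a classifier; the abstract's point is that choosing them by hand depends on the analyst.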

Machine Learning-Based Malicious URL Detection Technique (머신러닝 기반 악성 URL 탐지 기법)

  • Han, Chae-rim;Yun, Su-hyun;Han, Myeong-jin;Lee, Il-Gu
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.3 / pp.555-564 / 2022
  • Recently, cyberattacks have been using hacking techniques based on intelligent, advanced malicious code against non-face-to-face environments such as telecommuting, telemedicine, and automated industrial facilities, and the damage is increasing. Traditional information protection systems, such as anti-virus, detect known malicious URLs based on signature patterns, so they cannot detect unknown malicious URLs. In addition, the conventional static analysis-based malicious URL detection method is vulnerable to dynamic loading and cryptographic attacks. This study proposes a technique for efficiently detecting malicious URLs by dynamically learning malicious URL data. In the proposed detection technique, malicious codes are classified using machine learning-based feature selection algorithms, and the accuracy is improved by removing obfuscation elements after preprocessing using Weighted Euclidean Distance (WED). According to the experimental results, the proposed machine learning-based malicious URL detection technique shows an accuracy of 89.17%, an improvement of 2.82% over the conventional method.
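
For reference, a weighted Euclidean distance between two feature vectors is commonly defined as the square root of the weighted sum of squared differences. The sketch below implements that standard definition. How the paper chooses the weights and applies WED in preprocessing is not stated in the abstract, so the function name and the example weights are assumptions.

```python
import math
from typing import Sequence


def weighted_euclidean_distance(
    x: Sequence[float], y: Sequence[float], weights: Sequence[float]
) -> float:
    """Return sqrt(sum_i w_i * (x_i - y_i)^2) for equal-length vectors."""
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


# With unit weights this reduces to the ordinary Euclidean distance.
d_plain = weighted_euclidean_distance([0, 0], [3, 4], [1, 1])
# Illustrative weights: the first feature counts four times, the second is ignored.
d_weighted = weighted_euclidean_distance([0, 0], [3, 4], [4, 0])
```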

A Study proposal for URL anomaly detection model based on classification algorithm (분류 알고리즘 기반 URL 이상 탐지 모델 연구 제안)

  • Hyeon Wuu Kim;Hong-Ki Kim;DongHwi Lee
    • Convergence Security Journal / v.23 no.5 / pp.101-106 / 2023
  • Recently, social engineering attacks that use intelligent, persistent phishing sites and hacking techniques that use malicious code have been increasing. As personal security becomes more important, there is a need for a method and a solution for determining whether a malicious URL exists using a web application. In this paper, we compare highly accurate techniques for detecting malicious URLs to identify the features and limitations of each. Compared to classification algorithm models using features such as web flat panel DB and based URL detection sites, we propose an efficient URL anomaly detection technique.

Development of Rule-Based Malicious URL Detection Library Considering User Experiences (사용자 경험을 고려한 규칙기반 악성 URL 탐지 라이브러리 개발)

  • Kim, Bo-Min;Han, Ye-Won;Kim, Ga-Young;Kim, Ye-Bun;Kim, Hyung-Jong
    • Journal of the Korea Institute of Information Security & Cryptology / v.30 no.3 / pp.481-491 / 2020
  • Malicious URLs, which can be used to deliver malicious code and illegally acquire private information, are one of the biggest threats in the information security field. In particular, the recent prevalence of smartphones increases the likelihood that users are exposed to malicious URLs. As the ways of hiding a URL from the user become more sophisticated, such URLs are getting harder to detect. In this paper, after conducting a survey of user experiences related to malicious URLs, we propose a rule-based malicious URL detection method. In addition, we have developed a Java library that can be applied to any other application that needs to handle malicious URLs. Each class of the library implements a rule for detecting a characteristic of a malicious URL, and the library itself is a set of rules that can be chained to detect more complicated situations and enhance accuracy. This kind of rule-based approach can enhance extensibility in view of the diversity of malicious URLs.
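
The idea of one rule per class, with rules combined into a chain, can be sketched as follows. This is written in Python rather than the paper's Java, and the class names (`Rule`, `RuleChain`) and the two example rules are hypothetical illustrations, not the library's actual API or rule set.

```python
import ipaddress
from dataclasses import dataclass
from typing import Callable, List
from urllib.parse import urlsplit


@dataclass
class Rule:
    """A named predicate that flags one characteristic of a URL."""
    name: str
    check: Callable[[str], bool]


def host_is_ip_literal(url: str) -> bool:
    host = urlsplit(url).hostname or ""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def has_userinfo(url: str) -> bool:
    # An "@" in the authority part can hide the real host from the user.
    return "@" in urlsplit(url).netloc


class RuleChain:
    """Applies every rule in order and reports which ones matched."""

    def __init__(self, rules: List[Rule]):
        self.rules = list(rules)

    def matched(self, url: str) -> List[str]:
        return [rule.name for rule in self.rules if rule.check(url)]

    def is_suspicious(self, url: str) -> bool:
        return bool(self.matched(url))


chain = RuleChain([
    Rule("ip-literal-host", host_is_ip_literal),
    Rule("userinfo-in-authority", has_userinfo),
])
```

Adding a new detection characteristic then means adding one more `Rule` to the chain, which is the extensibility argument the abstract makes.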

An Implementation of System for Detecting and Filtering Malicious URLs (악성 URL 탐지 및 필터링 시스템 구현)

  • Chang, Hye-Young;Kim, Min-Jae;Kim, Dong-Jin;Lee, Jin-Young;Kim, Hong-Kun;Cho, Seong-Je
    • Journal of KIISE:Computing Practices and Letters / v.16 no.4 / pp.405-414 / 2010
  • According to SecurityFocus statistics from 2008, client-side attacks through Microsoft Internet Explorer increased by more than 50%. In this paper, we have implemented a behavior-based malicious web page detection system and a blacklist-based malicious web page filtering system. To do this, we first efficiently collected the target URLs by constructing a crawling system. The malicious URL detection system, which runs on a dedicated server, actively visits and renders the collected web pages in a virtual machine environment. To determine whether each web page is malicious, the system state changes of the virtual machine are checked after the page is rendered. If abnormal state changes are detected, we conclude that the rendered web page is malicious and insert it into the blacklist of malicious web pages. The malicious URL filtering system, which runs on the web client machine, filters malicious web pages based on the blacklist when a user visits web sites. We have enhanced system performance by automatically handling message boxes during URL analysis on the detection system. Experimental results show that game sites contain up to three times more malicious pages than other sites, and that many attacks cause file creation and registry key modification.
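
The detect-then-filter flow described above can be sketched as a comparison of system state before and after rendering, followed by a blacklist lookup on the client. The snapshot format (a dict with `files` and `registry_keys` lists) and all function names are assumptions made for illustration. The sketch also treats any new file or registry key as abnormal, whereas the paper's criteria for abnormal state changes are not given in the abstract.

```python
from typing import Dict, Iterable, List, Set

Snapshot = Dict[str, Iterable[str]]


def state_changes(before: Snapshot, after: Snapshot) -> Dict[str, List[str]]:
    """List items present after rendering that were absent before."""
    return {
        kind: sorted(set(after.get(kind, ())) - set(before.get(kind, ())))
        for kind in ("files", "registry_keys")
    }


def record_if_malicious(
    blacklist: Set[str], url: str, before: Snapshot, after: Snapshot
) -> Dict[str, List[str]]:
    """Add the URL to the blacklist when rendering changed the system state."""
    changes = state_changes(before, after)
    if any(changes.values()):
        blacklist.add(url)
    return changes


def is_blocked(blacklist: Set[str], url: str) -> bool:
    """Client-side filter: block URLs that appear in the blacklist."""
    return url in blacklist


# Hypothetical snapshots taken around rendering one page
blacklist: Set[str] = set()
before = {"files": ["a.txt"], "registry_keys": []}
after = {"files": ["a.txt", "dropper.exe"], "registry_keys": ["HKCU\\Run\\x"]}
changes = record_if_malicious(blacklist, "http://game.example/page", before, after)
```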

MALICIOUS URL RECOGNITION AND DETECTION USING ATTENTION-BASED CNN-LSTM

  • Peng, Yongfang;Tian, Shengwei;Yu, Long;Lv, Yalong;Wang, Ruijin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.11 / pp.5580-5593 / 2019
  • A malicious Uniform Resource Locator (URL) recognition and detection method based on the combination of the Attention mechanism with a Convolutional Neural Network and a Long Short-Term Memory network (Attention-Based CNN-LSTM) is proposed. Firstly, the WHOIS check method is used to extract and filter features, including the URL texture information, the URL string statistical information of attributes, and the WHOIS information. The features are then encoded, pre-processed, and input to the convolution layer of the constructed Convolutional Neural Network (CNN) to extract local features. Secondly, in accordance with the weights from the Attention mechanism, the generated local features are input into the Long Short-Term Memory (LSTM) model and subsequently pooled to calculate the global features of the URLs. Finally, the URLs are detected and classified by the SoftMax function using the global features. The results demonstrate that, compared with existing methods, the Attention-based CNN-LSTM mechanism has higher accuracy for malicious URL detection.

Design of detection method for malicious URL based on Deep Neural Network (뉴럴네트워크 기반에 악성 URL 탐지방법 설계)

  • Kwon, Hyun;Park, Sangjun;Kim, Yongchul
    • Journal of Convergence for Information Technology / v.11 no.5 / pp.30-37 / 2021
  • Various devices are connected to the Internet, and attacks using the Internet are occurring. Among such attacks are those that use malicious URLs to lead users to phishing sites or to distribute malicious viruses. Therefore, how to detect such malicious URL attacks is one of the important security issues. Among recent deep learning technologies, neural networks are showing good performance in image recognition, speech recognition, and pattern recognition. Neural networks can also be applied to research that analyzes and detects patterns in malicious URL characteristics. In this paper, we analyzed the performance of a neural network-based malicious URL detection method under various parameters, changing the activation function, learning rate, and neural network structure. The experimental data was built by crawling the Alexa top 1 million and Whois, and TensorFlow was used as the machine learning library. In the experiment, with 4 layers, a learning rate of 0.005, and 100 nodes in each layer, an accuracy of 97.8% and an F1 score of 92.94% were obtained.

Development of a Malicious URL Machine Learning Detection Model Reflecting the Main Feature of URLs (URL 주요특징을 고려한 악성URL 머신러닝 탐지모델 개발)

  • Kim, Youngjun;Lee, Jaewoo
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.12 / pp.1786-1793 / 2022
  • Cyber-attacks such as smishing and hacking emails that exploit COVID-19 and political and social issues have continued recently. Research on machine learning and deep learning technology is being conducted to prevent damage from cyber-attacks that lure users to malicious links to breach personal data. In previous studies, the features of the data sets were excessively simple, so there was insufficient basis for judging attacks to be malicious. In this paper, nine main features of three types, "URL Days", "URL Word", and "URL Abnormal", are proposed in addition to the lexical features of URLs used in previous research. F1-score and accuracy were measured for four different types of machine learning algorithms. Compared with an existing study, the results showed an improvement of 0.9% and a highest value of 98.5% in F1-score and accuracy. These outcomes proved that the main features contribute to improving both accuracy and performance.
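
Since several of these abstracts report accuracy, recall, and F1-score, a short sketch of how the metrics follow from confusion-matrix counts may help when comparing them. The function name and the counts in the example are made up for illustration and are not results from any of the papers listed here.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Illustrative counts only: 8 true positives, 2 false positives,
# 8 true negatives, 2 false negatives.
metrics = classification_metrics(tp=8, fp=2, tn=8, fn=2)
```

Accuracy and F1 can diverge sharply on imbalanced data, which is one reason the abstracts that report both are easier to compare than those that report accuracy alone.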

Malicious URL Detection by Visual Characteristics with Machine Learning: Roles of HTTPS (시각적 특징과 머신 러닝으로 악성 URL 구분: HTTPS의 역할)

  • Sung-Won HONG;Min-Soo KANG
    • Journal of Korea Artificial Intelligence Association / v.1 no.2 / pp.1-6 / 2023
  • In this paper, we present a new method for classifying malicious URLs to reduce cases of learning difficulties due to unfamiliar and difficult terms related to information protection. This study aims to extract only visually distinguishable features within the URL structure and compare them using supervised learning algorithms, and then to compare the feature contribution values of the best-performing supervised learning algorithm to identify the features that have the most impact on classifying malicious URLs. The research data was a Kaggle dataset of 7,046 malicious URLs and 7,046 normal URLs. Among the three supervised learning algorithms used (Decision Tree, Support Vector Machine, and Logistic Regression), the Decision Tree algorithm showed the best performance, with 83% accuracy, 83.1% F1-score, and 83.6% recall. The feature contributions of the Decision Tree confirmed that, among the visually distinguishable features (use of https, subdomain, and prefix and suffix), https has the highest contribution value. Although it has been difficult to learn unfamiliar and difficult terms so far, this study will be able to provide an intuitive judgment method without explanation of the terms and prove its usefulness in the field of malicious URL detection.
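
As a rough illustration of the three visually distinguishable features named above, the sketch below derives them from a URL string. The exact definitions used in the paper are not given in the abstract. Two assumptions are made here: "subdomain" means more than two host labels after dropping `www`, and "prefix and suffix" means a hyphen in the host name. The subdomain check is naive and will misjudge multi-part suffixes such as `co.kr`.

```python
from urllib.parse import urlsplit


def visual_features(url: str) -> dict:
    """Derive three features a user could spot by looking at the URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = [label for label in host.split(".") if label and label != "www"]
    return {
        "uses_https": parts.scheme == "https",
        # Assumption: more than two labels means a subdomain is present.
        "has_subdomain": len(labels) > 2,
        # Assumption: a hyphen in the host stands in for "prefix and suffix".
        "has_hyphen_in_host": "-" in host,
    }


# Hypothetical example URLs
f1 = visual_features("https://login.example.com/")
f2 = visual_features("http://www.my-bank.com/")
```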

A Study on SMiShing Detection Technique using TaintDroid (테인트드로이드를 이용한 스미싱 탐지 기법 연구)

  • Cho, Jiho;Shin, Jiyong;Lee, Geuk
    • Convergence Security Journal / v.15 no.1 / pp.3-9 / 2015
  • In this paper, a smishing detection technique using TaintDroid is proposed. When a smartphone user receives a text message containing a URL suspected of smishing, the proposed system detects malicious behavior by transmitting the URL to the TaintDroid server and installing the relevant application on a virtual device of the TaintDroid server. In this way, an application that cannot be installed on an actual smartphone because of suspected smishing can be tested on the virtual device of the system to determine whether it is malicious. The proposed technique can detect new forms of smishing delivered by text message and identify which application is involved through analysis of the results from a user.
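
The first step of this flow, pulling the suspicious URL out of a received text message before it is sent for analysis, can be sketched as below. The regular expression and the function name are assumptions for illustration. The TaintDroid server side (installing and monitoring the application on a virtual device) is not represented here.

```python
import re
from typing import List

# Simple pattern: http or https followed by non-whitespace characters.
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def extract_urls(message: str) -> List[str]:
    """Return URLs found in a text message, trimming trailing punctuation."""
    return [match.rstrip(".,;:!?)") for match in URL_PATTERN.findall(message)]


# Hypothetical message text
urls = extract_urls("Delivery failed. Check http://example.com/track?id=7, now")
```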