Malicious URL Detection by Visual Characteristics with Machine Learning: Roles of HTTPS

Sung-Won HONG; Min-Soo KANG

doi:10.24225/jkaia.2023.1.2.1

Journal of Korean Artificial Intelligence Association

Volume 1, Issue 2 / Pages 1-6 / 2023 / 3022-5388 (eISSN)

Korea Distribution Science Association

Sung-Won HONG (Medical IT, Eulji University);
Min-Soo KANG (Medical IT, Eulji University)

Received: 2023.10.26
Reviewed: 2023.12.30
Published: 2023.12.31

https://doi.org/10.24225/jkaia.2023.1.2.1

Abstract

This paper presents a method for classifying malicious URLs that is intended to lower the learning barrier created by unfamiliar and difficult information-security terminology. Only features that can be distinguished visually within the URL structure are extracted, and supervised learning algorithms trained on them are compared; the feature contributions of the best-performing algorithm are then examined to identify the features with the greatest impact on classifying malicious URLs. The research data, obtained from Kaggle, consist of 7,046 malicious URLs and 7,046 normal URLs. Among the three supervised learning algorithms used (Decision Tree, Support Vector Machine, and Logistic Regression), the Decision Tree showed the best performance, with 83% accuracy, an F1-score of 83.1%, and a recall of 83.6%. The feature contributions of the Decision Tree confirmed that, among the three visually distinguishable features (use of https, subdomain, and prefix and suffix), the use of https contributes the most. Because unfamiliar and difficult terms have so far made this topic hard to learn, the study is expected to provide an intuitive way to judge URLs without requiring those terms to be explained, and to prove useful in the field of malicious URL detection.
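
The three visually distinguishable features named in the abstract could be read off a URL roughly as follows. This is a minimal sketch rather than the authors' implementation: the function name `visual_features` and the feature keys are hypothetical, "prefix and suffix" is assumed to mean a hyphen in the host name (a convention used in some phishing feature sets), and the subdomain rule is a naive label count that does not consult a public-suffix list.

```python
from urllib.parse import urlsplit


def visual_features(url: str) -> dict:
    """Return three binary features readable directly from the URL text."""
    # urlsplit only fills in the host when a scheme separator is present,
    # so scheme-less input is treated as plain http (an assumption).
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    labels = [label for label in host.split(".") if label]
    return {
        "uses_https": int(parts.scheme.lower() == "https"),
        # Naive rule: more than two labels means a subdomain is present.
        # Suffixes such as "co.kr" are misread by this rule.
        "has_subdomain": int(len(labels) > 2),
        # Assumed reading of "prefix and suffix": a hyphen in the host name.
        "has_hyphen": int("-" in host),
    }


if __name__ == "__main__":
    for example in (
        "https://example.com/login",
        "http://secure-login.account.example.com/verify",
    ):
        print(example, visual_features(example))
```

Vectors of this kind are what a supervised classifier such as a decision tree would be trained on; the per-feature contribution reported in the abstract would then be computed from the fitted model, which this sketch does not attempt.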
