Search | Korea Science

Developing a Framework for Detecting Phishing URLs Using Machine Learning

Nguyen Tung Lam
- International Journal of Computer Science & Network Security / v.23 no.10 / pp.157-163 / 2023
Phishing URLs that target end-users are a highly dangerous attack technique today. With this technique, attackers can steal user data or take control of the system. Early detection of phishing URLs is therefore essential. In this paper, we propose a method for detecting phishing URLs based on supervised learning algorithms and abnormal behaviors extracted from URLs. Based on the research results, we then build a framework for detecting phishing URLs through end-users. The novelty and advantage of the proposed method is that abnormal behaviors are extracted from URLs that are monitored and collected directly from attack campaigns, rather than from outdated datasets.
https://doi.org/10.22937/IJCSNS.2023.23.10.19

A Unknown Phishing Site Detection Method in the Interior Network Environment (내부 네트워크에서 알려지지 않은 피싱사이트 탐지방안)

Park, Jeonguk;Cho, Gihwan
- Journal of the Korea Institute of Information Security & Cryptology / v.25 no.2 / pp.313-320 / 2015
While phishing attacks continue to increase, responses to them still come only after an attack has been identified. To detect a phishing site ahead of an attack, a method that uses the Referer header field of HTTP has been suggested. However, it is limited by the need to implement a traffic-gathering system for each prospective target host. This paper presents a method for detecting unknown phishing sites in an interior network environment. Whenever a user tries to connect to a phishing site, the traffic is pre-processed with consideration of the characteristics of the HTTP protocol and of phishing sites. The detection phase then identifies a suspected phishing site by analysing HTTP content. To validate the proposed method, evaluations were conducted with 100 phishing URLs and 100 normal URLs. The experimental results show that the method achieves a higher phishing site detection rate than existing detection methods: a 66% detection rate for the phishing URLs and a 0% false negative rate for the normal URLs.
https://doi.org/10.13089/JKIISC.2015.25.2.313

URL Phishing Detection System Utilizing Catboost Machine Learning Approach

Fang, Lim Chian;Ayop, Zakiah;Anawar, Syarulnaziah;Othman, Nur Fadzilah;Harum, Norharyati;Abdullah, Raihana Syahirah
- International Journal of Computer Science & Network Security / v.21 no.9 / pp.297-302 / 2021
The spread of phishing websites enables hackers to access confidential personal or financial data, decreasing trust in e-business. This paper compares detection techniques that use URL-based features. To analyze and compare the performance of supervised machine learning classifiers, the classifiers were trained on more than 11,005 phishing and legitimate URLs. Thirty features were extracted from the URLs to classify each URL as phishing or legitimate. Logistic Regression, Random Forest, and CatBoost classifiers were then analyzed and their performance evaluated. The results show that CatBoost was a much better classifier than Random Forest and Logistic Regression, with up to 96% detection accuracy.
https://doi.org/10.22937/IJCSNS.2021.21.9.39
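
The abstract above does not list its 30 URL features, so the following is only a minimal sketch of what URL-based (lexical) feature extraction can look like. The function name `url_features` and the six features are illustrative assumptions, not the paper's feature set.

```python
import re
from urllib.parse import urlparse, parse_qs


def url_features(url: str) -> dict:
    """Return a few illustrative lexical features for one URL.

    Assumption: these features are hypothetical examples, not the
    30 features used in the paper.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "uses_https": int(parsed.scheme == "https"),
        "num_dots_in_host": host.count("."),
        "has_at_symbol": int("@" in url),
        # A dotted-quad host instead of a domain name is a common warning sign.
        "host_is_ipv4": int(
            re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", host) is not None
        ),
        "num_query_params": len(parse_qs(parsed.query)),
    }
```

A dictionary like this, computed per URL, could be turned into a feature matrix and passed to any of the classifiers the abstract compares.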

Hybrid phishing site detection system with GRU-based shortened URL determination technique (GRU 기반 단축 URL 판별 기법을 적용한 하이브리드 피싱 사이트 탐지 시스템)

Hae-Soo Kim;Mi-Hui Kim
- Journal of IKEEE / v.27 no.3 / pp.213-219 / 2023
According to statistics from the National Police Agency, smishing crimes using text messages or messengers have increased dramatically since COVID-19. In addition, most cases of impersonation of public institutions reported to the agency were related to vaccination and rewards, and many methods were used to trick people into clicking on fake URLs (Uniform Resource Locators). URL-based detection methods cannot detect such URLs properly if the information in the URL is hidden, and content-based detection methods are slow and use a lot of resources. In this paper, we propose a system that first determines whether a URL is shortened using a GRU (Gated Recurrent Unit), then applies URL-based detection with a transformer to regular URLs and content-based detection with XGBoost to shortened URLs. The F1-Score of the proposed detection system was 94.86, and its average processing time was 5.4 seconds.
https://doi.org/10.7471/ikeee.2023.27.3.213
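
The routing idea in this abstract (shortened URLs go to the slower content-based detector, regular URLs go to the faster URL-based detector) can be sketched as below. This is not the authors' system: a small hard-coded host list stands in for the GRU-based shortened-URL classifier, and the two detectors are passed in as plain callables. All names here are hypothetical.

```python
from typing import Callable, Tuple
from urllib.parse import urlparse

# Assumption: a tiny host list stands in for the paper's GRU classifier.
KNOWN_SHORTENERS = {"bit.ly", "t.co", "tinyurl.com"}


def looks_shortened(url: str) -> bool:
    """Placeholder check for whether a URL comes from a shortening service."""
    host = (urlparse(url).hostname or "").lower()
    return host in KNOWN_SHORTENERS


def classify(
    url: str,
    url_model: Callable[[str], bool],
    content_model: Callable[[str], bool],
    is_shortened: Callable[[str], bool] = looks_shortened,
) -> Tuple[str, bool]:
    """Route a URL to one detector and return (route, is_phishing)."""
    if is_shortened(url):
        # The shortened URL hides its target, so inspect the page content.
        return "content", content_model(url)
    # A regular URL carries enough information for the URL-based detector.
    return "url", url_model(url)
```

Only shortened URLs pay the cost of content-based detection, which is the trade-off the abstract describes.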

Robust URL Phishing Detection Based on Deep Learning

Al-Alyan, Abdullah;Al-Ahmadi, Saad
- KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.7 / pp.2752-2768 / 2020
Phishing websites can have devastating effects on governmental, financial, and social services, as well as on individual privacy. Currently, many phishing detection solutions are evaluated using small datasets and, thus, are prone to sampling issues, such as representing legitimate websites by only high-ranking websites, which could make their evaluation less relevant in practice. Phishing detection solutions which depend only on the URL are attractive, as they can be used in limited systems, such as with firewalls. In this paper, we present a URL-only phishing detection solution based on a convolutional neural network (CNN) model. The proposed CNN takes the URL as the input, rather than using predetermined features such as URL length. For training and evaluation, we have collected over two million URLs in a massive URL phishing detection (MUPD) dataset. We split MUPD into training, validation and testing datasets. The proposed CNN achieves approximately 96% accuracy on the testing dataset; this accuracy is achieved with URL schemes (such as HTTP and HTTPS) removed from the URL. Our proposed solution achieved better accuracy compared to an existing state-of-the-art URL-only model on a published dataset. Finally, the results of our experiment suggest keeping the CNN up-to-date for better results in practice.
https://doi.org/10.3837/tiis.2020.07.001
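
The abstract says the CNN takes the URL itself as input, with the scheme removed, rather than hand-picked features. A minimal sketch of that kind of input preparation follows. The alphabet, the padding and unknown ids, and the `max_len` value are assumptions made for illustration; the abstract does not specify the paper's encoding.

```python
import re
import string

# Assumption: printable ASCII letters, digits and punctuation as the alphabet.
ALPHABET = string.ascii_letters + string.digits + string.punctuation
# Id 0 is reserved for padding and id 1 for characters outside the alphabet.
CHAR_TO_ID = {ch: i + 2 for i, ch in enumerate(ALPHABET)}


def strip_scheme(url: str) -> str:
    """Remove a leading scheme such as 'http://' or 'https://'."""
    return re.sub(r"^[a-zA-Z][a-zA-Z0-9+.\-]*://", "", url)


def encode_url(url: str, max_len: int = 200) -> list:
    """Map a URL to a fixed-length list of integer character ids."""
    text = strip_scheme(url)[:max_len]
    ids = [CHAR_TO_ID.get(ch, 1) for ch in text]
    # Right-pad with zeros so every URL has the same length.
    return ids + [0] * (max_len - len(ids))
```

A fixed-length id sequence like this is the usual input to an embedding layer followed by convolutional layers.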

Real-time Phishing Site Detection Method (피싱사이트 실시간 탐지 기법)

Sa, Joon-Ho;Lee, Sang-Jin
- Journal of the Korea Institute of Information Security & Cryptology / v.22 no.4 / pp.819-825 / 2012
Nowadays many phishing sites contain HTTP links to the victim website's content, such as images and bulletin boards, to make the phishing site look more real and similar to the victim website. We introduce a real-time phishing site detection system that exploits the fact that a phishing site's URL flows into the victim website via the HTTP Referer header field when the phishing site is visited. The detection system adopts an out-of-path network configuration to minimize its effect on the running system, and a phishing site source code analysis technique to alert administrators in real time when a phishing site is detected. The detection system was installed on a company's website that had been targeted for phishing. As a result, it detected 40 phishing sites during a 6-day test period.
https://doi.org/10.13089/JKIISC.2012.22.4.819
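
The Referer-based idea described above can be sketched as follows: when a phishing page embeds the victim site's images, the victim site receives requests whose Referer header names the phishing page. The record layout (dicts with `path` and `referer` keys) and the function name are hypothetical. The sketch covers only the first step of collecting candidate hosts, not the source code analysis the abstract mentions.

```python
from urllib.parse import urlparse


def suspicious_referers(requests, own_hosts):
    """Return external Referer hosts seen on requests to our own content.

    Assumption: `requests` is an iterable of dicts with a 'referer' key;
    real logs would need parsing first.
    """
    flagged = set()
    for req in requests:
        referer = req.get("referer")
        if not referer:
            continue  # Direct visits carry no Referer header.
        host = (urlparse(referer).hostname or "").lower()
        if host and host not in own_hosts:
            # Some external page links to our content: a candidate to inspect.
            flagged.add(host)
    return flagged
```

In practice many external referers are benign (search engines, partner sites), so flagged hosts would still need further analysis before raising an alert.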

AutoML Machine Learning-Based for Detecting Qshing Attacks Malicious URL Classification Technology Research and Service Implementation (큐싱 공격 탐지를 위한 AutoML 머신러닝 기반 악성 URL 분류 기술 연구 및 서비스 구현)

Dong-Young Kim;Gi-Seong Hwang
- Smart Media Journal / v.13 no.6 / pp.9-15 / 2024
In recent trends, there has been an increase in 'Qshing' attacks, a hybrid form of phishing that exploits fake QR (Quick Response) codes impersonating government agencies to steal personal and financial information. Particularly, this attack method is characterized by its stealthiness, as victims can be redirected to phishing pages or led to download malicious software simply by scanning a QR code, making it difficult for them to realize they have been targeted. In this paper, we have developed a classification technique utilizing machine learning algorithms to identify the maliciousness of URLs embedded in QR codes, and we have explored ways to integrate this with existing QR code readers. To this end, we constructed a dataset from 128,587 malicious URLs and 428,102 benign URLs, extracting 35 different features such as protocol and parameters, and used AutoML to identify the optimal algorithm and hyperparameters, achieving an accuracy of approximately 87.37%. Following this, we designed the integration of the trained classification model with existing QR code readers to implement a service capable of countering Qshing attacks. In conclusion, our findings confirm that deriving an optimized algorithm for classifying malicious URLs in QR codes and integrating it with existing QR code readers presents a viable solution to combat Qshing attacks.
https://doi.org/10.30693/SMJ.2024.13.6.9

Short URLs Verification Approach for Phishing Site Detection Improvement (피싱 사이트 탐지 성능 향상을 위한 단축 URL 검증 기법)

Kim, Yun-Gi;Kim, Hae-Soo;Kim, Mi-Hui
- Proceedings of the Korea Information Processing Society Conference / 2022.11a / pp.80-81 / 2022
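With the recent growth of social media services and easier access to them, automatic classification of phishing URLs has become necessary. However, as URL-shortening services have become popular, phishing URLs also use them, so it is no longer possible to tell whether a shortened URL leads to a phishing site or a legitimate one. In such cases the target can be checked with content-based detection, but this is slower and more resource-intensive than URL-based methods. This paper therefore proposes a technique that determines whether a URL is shortened and detects phishing sites more efficiently.
https://doi.org/10.3745/PKIPS.y2022m11a.80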

A Study proposal for URL anomaly detection model based on classification algorithm (분류 알고리즘 기반 URL 이상 탐지 모델 연구 제안)

Hyeon Wuu Kim;Hong-Ki Kim;DongHwi Lee
- Convergence Security Journal / v.23 no.5 / pp.101-106 / 2023
Recently, cyberattacks have been increasing in the form of social engineering attacks that use intelligent, persistent phishing sites and hacking techniques that use malicious code. As personal security becomes more important, there is a need for a method and a solution for determining, through a web application, whether a URL is malicious. In this paper, we compare highly accurate techniques for detecting malicious URLs to identify the features and limitations of each. Based on a comparison with classification algorithm models using features such as web flat panel DB and based URL detection sites, we propose an efficient URL anomaly detection technique.
https://doi.org/10.33778/kcsa.2023.23.5.101

SHRT : New Method of URL Shortening including Relative Word of Target URL (SHRT : 유사 단어를 활용한 URL 단축 기법)

Yoon, Soojin;Park, Jeongeun;Choi, Changkuk;Kim, Seungjoo
- The Journal of Korean Institute of Communications and Information Sciences / v.38B no.6 / pp.473-484 / 2013
A URL-shortening service replaces a long URL with a short one and redirects the short URL to the long URL. As the number of microblog users grew rapidly, and because shortened URLs are convenient to create and use, shortened URLs became common under the length limits of microblog posts. E-mail, SMS, and books also use shortened URLs because of their simplicity. However, most shortened URLs bear no relation to their target URLs, so users cannot anticipate the target URL. To address this problem, there have been attempts such as changing the name of the shortening service, inserting website information into the shortened URL, and using short codes of physical addresses. Each has its limits, namely difficulty of automation, relatively long addresses, and a narrow range of applicable targets. SHRT complements these attempts, taking its idea from the Arabic writing system. Although the Arabic writing system has no vowel letters, Arabic readers have no difficulty understanding it. This paper proposes SHRT, a new method of URL shortening. SHRT lets the user guess the target URL by using a related word for the lowest domain of the target URL, with its vowels removed.
https://doi.org/10.7840/kics.2013.38B.6.473
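
The vowel-removal step behind SHRT can be illustrated with a few lines of Python. This sketch covers only that step, under the assumption that every Latin vowel is dropped. How the paper selects the related word, and how it handles collisions between different targets, is not described in the abstract and is not modeled here. The function name `drop_vowels` is hypothetical.

```python
VOWELS = set("aeiouAEIOU")


def drop_vowels(word: str) -> str:
    """Return the word with Latin vowels removed, keeping character order."""
    return "".join(ch for ch in word if ch not in VOWELS)


print(drop_vowels("short"))   # prints "shrt"
print(drop_vowels("google"))  # prints "ggl"
```

The name SHRT itself follows this pattern: it is "short" without its vowel.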

