[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.22156/CS4SMB.2021.11.05.030

Design of detection method for malicious URL based on Deep Neural Network

Kwon, Hyun (Department of Electrical Engineering, Korea Military Academy)
Park, Sangjun (Department of Electrical Engineering, Korea Military Academy)
Kim, Yongchul (Department of Electrical Engineering, Korea Military Academy)

Publication Information

Journal of Convergence for Information Technology / v.11, no.5, 2021, pp. 30-37

Abstract

With various devices connected to the Internet, attacks carried out over the Internet are occurring. Some of these attacks use malicious URLs to lure users to phishing sites or to distribute viruses, so detecting malicious URLs is an important security problem. Among recent deep learning technologies, neural networks perform well in image recognition, speech recognition, and pattern recognition, and they can be applied to analyzing and detecting the characteristic patterns of malicious URLs. This paper analyzes the performance of a neural-network-based malicious URL detection method while varying the activation function, the learning rate, and the network structure. The experimental dataset was built by crawling data from the Alexa top 1 million list and Whois, and TensorFlow was used as the machine learning library. In the experiments, a network with 4 layers, 100 nodes per layer, and a learning rate of 0.005 achieved an accuracy of 97.8% and an F1 score of 92.94%.
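The abstract reports accuracy and F1 score as separate figures, and the two can differ noticeably when malicious URLs are a minority of the data. The following is a minimal sketch of how both metrics are derived from confusion-matrix counts. The class name `ConfusionCounts`, the helper functions, and the counts themselves are hypothetical and chosen only for illustration; they are not the paper's data or results, and the sketch assumes "malicious" is treated as the positive class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts with 'malicious' as the positive class."""
    tp: int  # malicious URLs correctly flagged as malicious
    fp: int  # benign URLs wrongly flagged as malicious
    fn: int  # malicious URLs that were missed
    tn: int  # benign URLs correctly passed as benign


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all URLs that were classified correctly."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


def f1_score(c: ConfusionCounts) -> float:
    """Harmonic mean of precision and recall: 2*TP / (2*TP + FP + FN)."""
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


# Hypothetical counts (illustration only, not taken from the paper):
# 1000 URLs, of which 170 are malicious.
counts = ConfusionCounts(tp=150, fp=10, fn=20, tn=820)
acc = accuracy(counts)
f1 = f1_score(counts)
print(f"accuracy={acc:.4f} f1={f1:.4f}")
```

With these illustrative counts, the many correctly classified benign URLs lift accuracy, while F1 ignores true negatives and so reflects only how well the malicious class is handled. This is one plausible reason an F1 score can sit below the accuracy on an imbalanced URL dataset.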

Keywords

Malicious URL; Machine learning; Detection method; Neural network; Pattern recognition

