http://dx.doi.org/10.22156/CS4SMB.2021.11.05.030

Design of detection method for malicious URL based on Deep Neural Network  

Kwon, Hyun (Department of Electrical Engineering, Korea Military Academy)
Park, Sangjun (Department of Electrical Engineering, Korea Military Academy)
Kim, Yongchul (Department of Electrical Engineering, Korea Military Academy)
Publication Information
Journal of Convergence for Information Technology / v.11, no.5, 2021, pp. 30-37
Abstract
As more devices are connected to the Internet, Internet-based attacks are increasing. Some of these attacks use malicious URLs to lure users to phishing sites or to distribute malware. Detecting malicious URLs is therefore an important security problem. Among recent deep learning technologies, neural networks have shown good performance in image recognition, speech recognition, and pattern recognition, and they can also be applied to analyzing and detecting the characteristic patterns of malicious URLs. In this paper, we analyze the performance of a neural-network-based malicious URL detection method under various parameter settings, varying the activation function, the learning rate, and the network structure. The experimental dataset was built by crawling the Alexa top 1 million list and Whois records, and TensorFlow was used as the machine learning library. In the experiments, a network with 4 layers, a learning rate of 0.005, and 100 nodes in each layer achieved an accuracy of 97.8% and an F1 score of 92.94%.
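The abstract reports both accuracy and F1 score, and the two can differ noticeably when malicious URLs are a minority of the data. The following minimal Python sketch shows how both metrics are computed from a binary confusion matrix, treating "malicious" as the positive class. The function name `accuracy_and_f1` and the counts used below are hypothetical and chosen only for illustration; they are not taken from the paper's experiments.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, f1) with 'malicious' as the positive class.

    tp: malicious URLs flagged as malicious
    fp: benign URLs flagged as malicious
    fn: malicious URLs missed by the detector
    tn: benign URLs correctly passed
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1


# Hypothetical counts (NOT from the paper): 1000 URLs, 150 of them malicious.
acc, f1 = accuracy_and_f1(tp=135, fp=5, fn=15, tn=845)
print(f"accuracy={acc:.3f}, f1={f1:.3f}")
```

With these illustrative counts the F1 score comes out lower than the accuracy, because the large number of correctly classified benign URLs raises accuracy but does not enter the F1 computation.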
Keywords
Malicious URL; Machine learning; Detection method; Neural network; Pattern recognition;