http://dx.doi.org/10.7472/jksii.2022.23.1.87

Bidirectional LSTM based light-weighted malware detection model using Windows PE format binary data  

PARK, Kwang-Yun (Cyber Warfare (Integrated course), Korea National Defense Univ.)
LEE, Soo-Jin (Dept. of National Defense Science, Korea National Defense Univ.)
Publication Information
Journal of Internet Computing and Services / v.23, no.1, 2022, pp. 87-93
Abstract
Since 99% of PCs operating in the defense domain use the Windows operating system, detecting and responding to Windows-based malware is essential to keeping defense cyberspace safe. This paper proposes a model that detects malware in Windows PE (Portable Executable) format files. The model was designed to prioritize rapid retraining over detection accuracy, so that it can keep pace with the rapidly growing volume of malware. To improve training speed, it is built on a Bidirectional LSTM (Long Short-Term Memory) network that can detect malware from minimal sequence data without complicated pre-processing. Experiments were conducted using the EMBER2018 dataset. Training the model on a feature set consisting of three types of sequence data (Byte-Entropy Histogram, Byte Histogram, and String Distribution) achieved an accuracy of 90.79%. Training time was reduced to one quarter of that of the existing detection model, enabling rapid updates of the detection model in response to surging new types of malware.
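As an illustration of two of the sequence features named in the abstract, the following minimal sketch computes a byte histogram and a simplified byte-entropy histogram from raw file bytes. It is not the authors' code or the EMBER feature extractor. The function names, the 2048-byte window, the 1024-byte step, and the 16x16 binning are assumptions chosen for illustration. Each function returns a fixed-length vector of 256 values, which is the kind of short numeric sequence a Bidirectional LSTM could consume without further pre-processing.

```python
import math


def byte_histogram(data: bytes) -> list[float]:
    """Normalized frequency of each byte value (256 bins)."""
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    total = len(data) or 1  # avoid division by zero on empty input
    return [c / total for c in counts]


def byte_entropy_histogram(data: bytes, window: int = 2048,
                           step: int = 1024) -> list[float]:
    """Simplified 16x16 joint histogram of (window entropy bin, high nibble).

    Hypothetical parameters: window and step sizes are illustrative
    assumptions, not values taken from the paper.
    """
    grid = [[0] * 16 for _ in range(16)]
    last_start = max(len(data) - window, 0)
    for start in range(0, last_start + 1, step):
        block = data[start:start + window]
        if not block:
            continue
        # Count the high 4 bits of every byte in the window (16 bins).
        nibble_counts = [0] * 16
        for b in block:
            nibble_counts[b >> 4] += 1
        # Shannon entropy over the 16 nibble values lies in [0, 4] bits.
        entropy = -sum((c / len(block)) * math.log2(c / len(block))
                       for c in nibble_counts if c)
        row = min(int(entropy * 4), 15)  # map [0, 4] bits onto 16 rows
        for col, c in enumerate(nibble_counts):
            grid[row][col] += c
    flat = [v for r in grid for v in r]
    total = sum(flat) or 1
    return [v / total for v in flat]
```

A file's bytes can be read with `open(path, "rb").read()` and passed to either function. Both outputs are normalized to sum to 1, so files of different sizes yield comparable vectors.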
Keywords
Bidirectional LSTM; Windows PE malware; Detection; EMBER2018;
Citations & Related Records
  • Reference
1 R. Kath, "The Portable Executable File Format from Top to Bottom", MSDN Library, Microsoft Corporation, 1993. http://www.csn.ul.ie/~caolan/pub/winresdump/winresdump/doc/pefile2.html
2 J. Saxe, H. Sanders, "Malware Data Science", Youngjin, 2020.
3 J. Saxe, K. Berlin, "Deep neural network based malware detection using two dimensional binary program features", 2015 10th International Conference on Malicious and Unwanted Software (MALWARE), pp. 11-20, 2015. http://dx.doi.org/10.1109/MALWARE.2015.7413680
4 H. S. Anderson, P. Roth, "EMBER: An Open Dataset for Training Static PE Malware Machine Learning Models", arXiv preprint arXiv:1804.04637, 2018. https://arxiv.org/abs/1804.04637v2
5 Shim Seung-bae, "Military introduction direction and tasks of Open O.S", 33rd SPRi Forum, 2017. https://spri.kr/download/21770
6 Clarence Chio, David Freeman, "Machine Learning and Security", p. 175, O'Reilly Media, Inc., 2018.
7 Y. Oyama, T. Miyashita, H. Kokubo, "Identifying Useful Features for Malware Detection in the Ember Dataset", 2019 7th International Symposium on Computing and Networking Workshops (CANDARW), IEEE, pp. 360-366, 2019. http://dx.doi.org/10.1109/CANDARW.2019.00069
8 S. Parmanik, H. Teja, "EMBER - Analysis of Malware Dataset Using Convolutional Neural Networks", 2019 3rd International Conference on Inventive Systems and Control(ICISC), pp. 286-291, 2019. http://dx.doi.org/9/ICISC44355.2019.9036424
9 KISA, "2021 First Half KISA Cyber Security Issue Report", 2021. https://www.krcert.or.kr/filedownload.do?attach_file_seq=3431&attach_file_id=EpF3431.pdf
10 McAfee, "McAfee ATR Threats Report 4.21", 2021. https://www.mcafee.com/enterprise/en-us/lp/threats-reports/apr-2021.html
11 S. Hochreiter, J. Schmidhuber, "Long short-term memory", Neural Computation, Vol. 9, No. 8, pp. 1735-1780, 1997. http://dx.doi.org/10.1162/neco.1997.9.8.1735
12 Yukinaga Azuma, "Introduction to core deep learning", onlybook, 2020.
13 M. Ahmadi, D. Ulyanov, S. Semenov, M. Trofimov, G. Giacinto, "Novel Feature Extraction, Selection and Fusion for Effective Malware Family Classification", In Proceedings of the 6th ACM CODASPY '16, pp. 183-194, 2016. http://dx.doi.org/10.1145/2857705.2857713
14 Mike Schuster, Kuldip K. Paliwal, "Bidirectional Recurrent Neural Networks", IEEE Transactions on Signal Processing, Vol. 45, No. 11, 1997. http://dx.doi.org/10.1109/78.650093
15 Yunseok Rhee, "Malicious Code Detection Method Using LSTM Learning on the File Access Behavior", The Journal of Korean Institute of Information Technology, Vol. 18, No. 2, pp. 25-32, 2020. http://dx.doi.org/10.14801/jkiit.2020.18.2.25