[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.7472/jksii.2022.23.1.87

Bidirectional LSTM based light-weighted malware detection model using Windows PE format binary data

PARK, Kwang-Yun (Cyber Warfare(Integrated course), Korea National Defense Univ.)
LEE, Soo-Jin (Dept. of National Defense Science, Korea National Defense Univ.)

Publication Information

Journal of Internet Computing and Services / v.23, no.1, 2022, pp. 87-93

Abstract

Since 99% of PCs operating in the defense domain use the Windows operating system, detecting and responding to Windows-based malware is essential to keeping defense cyberspace safe. This paper proposes a model that detects malware in the Windows PE (Portable Executable) format. The model was designed to prioritize rapid retraining over detection accuracy, so that it can keep pace with rapidly increasing malware. To improve training speed, it is based on a Bidirectional LSTM (Long Short-Term Memory) network that can detect malware from minimal sequence data without complicated pre-processing. Experiments were conducted on the EMBER2018 dataset. Training the model on a feature set consisting of three types of sequence data (Byte-Entropy Histogram, Byte Histogram, and String Distribution) achieved an accuracy of 90.79%. Training time was also reduced to one quarter of that of the existing detection model, enabling rapid updates of the detection model in response to the surge of new malware types.
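To make two of the three sequence features named in the abstract concrete, the following is a minimal Python sketch of how a byte histogram and a byte-entropy histogram might be computed from raw PE file bytes. It is an illustration under stated assumptions, not the authors' pipeline or the EMBER feature extractor: the function names, the 2048-byte window, the 1024-byte step, and the 16x16 binning are all assumptions chosen for the example, and the String Distribution feature is omitted.

```python
import math
from collections import Counter


def byte_histogram(data: bytes) -> list:
    """Normalized frequency of each of the 256 possible byte values."""
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    total = len(data)
    return [c / total for c in counts] if total else [0.0] * 256


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, between 0.0 and 8.0."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(data).values())


def byte_entropy_histogram(data: bytes, window: int = 2048,
                           step: int = 1024) -> list:
    """16x16 joint histogram: rows are entropy bins of a sliding window,
    columns are the high nibble (upper 4 bits) of each byte in it.
    Window and step sizes are assumed values for this sketch."""
    hist = [[0] * 16 for _ in range(16)]
    if len(data) < window:
        # Short input: treat the whole buffer as a single block.
        blocks = [data] if data else []
    else:
        blocks = [data[i:i + window]
                  for i in range(0, len(data) - window + 1, step)]
    for block in blocks:
        # Map entropy in [0, 8] onto 16 bins; clamp 8.0 into the last bin.
        row = min(int(shannon_entropy(block) * 2), 15)
        for b in block:
            hist[row][b >> 4] += 1
    return hist
```

Flattening the 16x16 matrix row by row gives a 256-step sequence of the same length as the byte histogram, which is one plausible way such features could be presented to a sequence model.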

Keywords

Bidirectional LSTM; Windows PE malware; Detection; EMBER2018

