http://dx.doi.org/10.13089/JKIISC.2019.29.4.775

A Study on Malware Identification System Using Static Analysis Based Machine Learning Technique  

Kim, Su-jeong (Hoseo University)
Ha, Ji-hee (Hoseo University)
Oh, Soo-hyun (Hoseo University)
Lee, Tae-jin (Hoseo University)
Abstract
Malware attacks continue to increase across environments such as mobile, IoT, Windows, and macOS as new and variant malware emerges, and signature-based countermeasures are limited in their ability to detect it. In addition, analysis performance is degraded by obfuscation, packing, and anti-VM techniques. In this paper, we propose a machine learning-based malware detection system that first classifies files according to whether they are packed and then applies similarity hashing-based pattern detection together with static analysis. The system detects more efficiently because it combines pattern-based detection, which works well for known malware, with machine learning-based detection, which is better suited to new and variant malware. On the benign and malware sample files provided by the AI-based malware detection track of the Information Security R&D Data Challenge 2018 competition, the system achieved a detection accuracy of 95.79% or higher. In future work, we expect that detection performance can be improved further by applying feature vectors and detection methods tailored to the characteristics of packed files.
Keywords
Malware; Machine learning; Feature statistics; Packer; Similarity hashing;
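The staged flow described in the abstract (packing classification, similarity-based pattern matching, then a machine learning verdict) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the entropy threshold of 7.0, the 0.9 match score, the order of the stages, and the names `looks_packed`, `toy_similarity`, and `detect` are all assumptions. `difflib.SequenceMatcher` stands in for a real similarity hash such as ssdeep or TLSH, and the classifier is a caller-supplied placeholder.

```python
import math
from collections import Counter
from difflib import SequenceMatcher
from typing import Callable, Dict, Iterable


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def looks_packed(data: bytes, threshold: float = 7.0) -> bool:
    """Heuristic packing check: high entropy hints at packing or encryption.

    The 7.0 threshold is an assumed value, not one taken from the paper.
    """
    return shannon_entropy(data) >= threshold


def toy_similarity(a: bytes, b: bytes) -> float:
    """Stand-in for a similarity hash comparison; returns a ratio in [0, 1]."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def detect(
    data: bytes,
    known_malware: Iterable[bytes],
    ml_is_malware: Callable[[bytes, bool], bool],
    similarity: Callable[[bytes, bytes], float] = toy_similarity,
    match_score: float = 0.9,
) -> Dict[str, object]:
    """Hypothetical pipeline: packing check, pattern match, then ML fallback."""
    packed = looks_packed(data)
    # Pattern stage: a close match to a known sample is reported immediately.
    for sample in known_malware:
        if similarity(data, sample) >= match_score:
            return {"packed": packed, "verdict": "malware", "stage": "pattern"}
    # ML stage: the classifier may use the packing flag to choose its features.
    verdict = "malware" if ml_is_malware(data, packed) else "benign"
    return {"packed": packed, "verdict": verdict, "stage": "ml"}
```

In this sketch, files that closely resemble a known sample never reach the classifier, which mirrors the abstract's point that pattern matching handles known malware while machine learning covers new and variant samples. A real system would replace `toy_similarity` with an actual similarity hash and `ml_is_malware` with a model trained on static features.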