http://dx.doi.org/10.13089/JKIISC.2017.27.1.91

A Study on Malware Clustering Technique Using API Call Sequence and Locality Sensitive Hashing  

Goh, Dong Woo (Korea University)
Kim, Huy Kang (Korea University)
Abstract
API call sequence analysis examines a target program using the API call information extracted from it. Compared with other techniques, it has the advantage of characterizing the behavior of the target. However, existing API call sequence analysis tends to identify functions that share the same characteristics as different functions. To resolve this identification issue and improve analysis performance, this study adds an API abstraction technique to the existing analysis. Locality-sensitive hashing (LSH) is then applied to the abstracted API call sequence of each analyzed target to compute the similarity between target programs and cluster them into similar types. The resulting information on the types of malware identified can thus contribute to improving the accuracy of malware analysis.
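The pipeline outlined in the abstract (abstract the API calls, hash the abstracted sequence, cluster by similarity) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not state the abstraction rules or which LSH scheme is used, so the `ABSTRACTION` table, the 3-call shingles, the SimHash fingerprint, and the `max_distance` threshold are all assumptions chosen for the example.

```python
import hashlib
from itertools import combinations

# Hypothetical abstraction table (an assumption, not from the paper):
# API names with the same behavior are mapped to one abstract label.
ABSTRACTION = {
    "CreateFileA": "FILE_OPEN",
    "CreateFileW": "FILE_OPEN",
    "NtCreateFile": "FILE_OPEN",
    "WriteFile": "FILE_WRITE",
    "NtWriteFile": "FILE_WRITE",
    "RegSetValueExA": "REG_WRITE",
    "RegSetValueExW": "REG_WRITE",
}


def abstract_calls(calls):
    """Replace each API name by its abstract label; unknown names pass through."""
    return [ABSTRACTION.get(name, name) for name in calls]


def shingles(tokens, n=3):
    """Return the set of overlapping n-call windows of the sequence."""
    if len(tokens) < n:
        return {tuple(tokens)}
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def simhash(features, bits=64):
    """SimHash fingerprint: similar feature sets yield fingerprints that differ in few bits."""
    counts = [0] * bits
    for feature in features:
        digest = hashlib.md5(" ".join(feature).encode("utf-8")).digest()
        value = int.from_bytes(digest[:bits // 8], "big")
        for i in range(bits):
            counts[i] += 1 if (value >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if counts[i] > 0)


def hamming(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def cluster(fingerprints, max_distance=8):
    """Group samples whose fingerprints lie within max_distance bits (single linkage)."""
    parent = {name: name for name in fingerprints}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(fingerprints, 2):
        if hamming(fingerprints[a], fingerprints[b]) <= max_distance:
            parent[find(a)] = find(b)

    groups = {}
    for name in fingerprints:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())
```

With this table, the sequences `["CreateFileA", "WriteFile", "RegSetValueExA"]` and `["CreateFileW", "NtWriteFile", "RegSetValueExW"]` abstract to the same label sequence and therefore receive the same fingerprint. The pairwise comparison in `cluster` is quadratic in the number of samples; a larger corpus would normally bucket fingerprints by bit bands first so that only likely neighbors are compared.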
Keywords
API call sequence; malware analysis; clustering; dynamic analysis