http://dx.doi.org/10.13089/JKIISC.2022.32.2.171

An LLVM-Based Implementation of Static Analysis for Detecting Self-Modifying Code and Its Evaluation  

Yu, Jae-IL (Chonnam National University)
Choi, Kwang-hoon (Chonnam National University)
Abstract
Self-modifying code is code that changes itself at execution time. Malicious code frequently abuses this technique to evade static analysis, so identifying self-modifying code is important for detecting such malware effectively. Until now, self-modifying code has been analyzed with dynamic analysis methods, which are time-consuming and costly. If static analysis could detect self-modifying code, it would be of great help to malware analysis. In this paper, we propose a static analysis method that detects self-modifying code in binary executable programs converted to LLVM IR, and we evaluate the method on a self-modifying code benchmark that we built. In our experiments, the method was effective on the well-structured LLVM IR obtained by compiling the benchmark programs. However, it had difficulty detecting self-modifying code in the unstructured LLVM IR obtained by lifting binaries. Overcoming this limitation requires an effective way to lift binary code.
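To make the idea of statically detecting self-modifying code in LLVM IR concrete, the following is a minimal sketch and not the method proposed in the paper, whose algorithm the abstract does not describe. It assumes textual LLVM IR with typed pointers and flags any `store` whose destination is derived from a function address through a small, assumed set of instructions. The sample IR, the function name `find_code_stores`, and the propagation set are all hypothetical choices made for illustration.

```python
import re

# Hypothetical toy IR: main() writes one byte into the body of @patch_me.
SAMPLE_IR = """
define i32 @patch_me() {
  ret i32 1
}

define i32 @main() {
  %p = bitcast i32 ()* @patch_me to i8*
  %q = getelementptr i8, i8* %p, i64 1
  store i8 42, i8* %q
  %r = call i32 @patch_me()
  ret i32 %r
}
"""

DEFINE_RE = re.compile(r"^define\b.*?@([\w.$]+)\s*\(")
ASSIGN_RE = re.compile(r"^(%[\w.$]+)\s*=\s*(\w+)\s+(.*)$")
STORE_RE = re.compile(r"^store\b.*,\s*\S+\s+([%@][\w.$]+)")
# Assumption: a code address only propagates through these opcodes.
PROPAGATING = {"bitcast", "getelementptr", "ptrtoint", "inttoptr", "add", "sub"}


def find_code_stores(ir_text):
    """Return store instructions whose target is derived from a function address."""
    lines = [ln.strip() for ln in ir_text.splitlines()]
    functions = {"@" + m.group(1) for ln in lines if (m := DEFINE_RE.match(ln))}
    tainted = set(functions)
    findings = []
    for ln in lines:
        if DEFINE_RE.match(ln):
            # SSA register names are local to a function, so start over.
            tainted = set(functions)
        elif m := ASSIGN_RE.match(ln):
            dest, opcode, rest = m.groups()
            operands = set(re.findall(r"[%@][\w.$]+", rest))
            if opcode in PROPAGATING and operands & tainted:
                tainted.add(dest)
        elif m := STORE_RE.match(ln):
            if m.group(1) in tainted:
                findings.append(ln)
    return findings


if __name__ == "__main__":
    for hit in find_code_stores(SAMPLE_IR):
        print("possible write into code:", hit)
```

This single-pass, syntactic check only works when the address computation is visible as a short chain of SSA instructions, as in compiler-generated IR. It does not follow addresses through memory, `phi` nodes, or calls, which is one plausible reason lifted binaries are harder to handle: a realistic analysis would need a pointer analysis to decide whether a store can target code.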
Keywords
Self-Modifying-Code; Static Analysis; Benchmarking