Control Flow Reconstruction from Virtualization-Obfuscated Binaries

  • 황준형 (Department of Computer Science, KAIST)
  • 한태숙 (Department of Computer Science, KAIST)
  • Received : 2014.07.28
  • Accepted : 2014.11.07
  • Published : 2015.01.15

Abstract

Control flow information captures the structure of a program's execution, so it serves as a basis for analyzing software and is useful for comparing programs. Virtualization-obfuscation hides the control structures of the original program by translating machine instructions into bytecode for a virtual machine whose architecture is randomly generated and hidden. Direct examination of the resulting binary reveals only the structure of the interpreter that executes the bytecode, and recovering the original instructions requires knowledge of the virtual machine architecture. In this paper, we propose a method that reconstructs the original control flow using only traces, the recorded sequences of instructions executed when the obfuscated binary runs. We treat each trace as a string of machine instructions, find an automaton that accepts all of the generated traces, and build the corresponding control flow graph. The execution of a machine instruction corresponds to a state transition in the automaton, which in turn corresponds to an edge of the control flow graph and thus to a control transfer in the original program. We applied the method to binaries obfuscated with commercial virtualization tools and confirmed that the generated control flow graphs are similar to those of the original binaries.
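The idea of treating traces as strings and finding an automaton that accepts them can be illustrated with a deliberately simplified sketch. This is not the inference algorithm of the paper: it assumes each trace is already a list of abstract instruction labels, uses one state per distinct label, and records only the transitions observed between consecutive labels. The names `build_automaton`, `accepts`, `branch_edges`, and the markers `START` and `END` are hypothetical.

```python
from collections import defaultdict

START, END = "START", "END"  # hypothetical entry and exit markers


def build_automaton(traces):
    """Map each state to the set of states observed directly after it.

    Assumption: one state per distinct instruction label, so two
    occurrences of the same label are always merged into one state.
    """
    succ = defaultdict(set)
    for trace in traces:
        prev = START
        for label in trace:
            succ[prev].add(label)
            prev = label
        succ[prev].add(END)
    return dict(succ)


def accepts(succ, trace):
    """Check whether the automaton accepts the given trace."""
    state = START
    for label in trace:
        if label not in succ.get(state, ()):
            return False
        state = label
    return END in succ.get(state, ())


def branch_edges(succ):
    """Edges leaving states with several successors (candidate branches)."""
    return sorted(
        (src, dst)
        for src, nxt in succ.items()
        if len(nxt) > 1
        for dst in nxt
    )


if __name__ == "__main__":
    # Two made-up traces that diverge after label "a" and rejoin at "d".
    traces = [["a", "b", "d"], ["a", "c", "d"]]
    automaton = build_automaton(traces)
    print(all(accepts(automaton, t) for t in traces))
    print(branch_edges(automaton))
```

In this sketch every input trace is accepted by construction, and the edges returned by `branch_edges` are the places where the traces diverge, which is where a reconstructed control flow graph would show a branch. A practical approach would need a more careful choice of states than one per label.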

Acknowledgement

Supported by: National Research Foundation of Korea
