Extraction Scheme of Function Information in Stripped Binaries using LSTM

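  • 장두혁 (Department of Computer Engineering, Hansung University);
  • 김선민 (Division of Computer Engineering, Hansung University);
  • 허준영 (Division of Computer Engineering, Hansung University)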
  • Received : 2021.10.21
  • Accepted : 2021.12.20
  • Published : 2021.12.31

Abstract

Reverse engineering is used to analyze and defend against malware, in part by identifying information such as function locations. In a stripped binary, however, the function symbol information has been removed, so function locations and similar information are hard to recover. Binary analysis tools such as BAP, BitBlaze, and IDA Pro address this problem, but because they rely on heuristics, they do not perform well in general. In this paper, we propose a technique that extracts function information with LSTM-based models. Its input is produced by an N-byte method, which takes as data the binary bytes corresponding to the instructions disassembled by recursive descent. Experiments show that the proposed technique outperforms existing techniques in both execution time and accuracy.
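The abstract does not spell out the N-byte method, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that "N-byte" means keeping the first `n` bytes of each disassembled instruction (zero-padding shorter ones) and labeling each instruction by whether its address is a function start. The names `n_byte_features`, `Instruction`, and `function_starts` are hypothetical, and the instruction list is hand-written rather than produced by a real recursive-descent disassembler.

```python
from typing import List, Sequence, Set, Tuple

# Hypothetical input type: (address, raw instruction bytes), in the form a
# recursive-descent disassembler might report them.
Instruction = Tuple[int, bytes]


def n_byte_features(
    instructions: Sequence[Instruction],
    function_starts: Set[int],
    n: int = 4,
    pad: int = 0,
) -> Tuple[List[List[int]], List[int]]:
    """Build one fixed-length byte vector and one 0/1 label per instruction.

    Assumption: "N-byte" is read here as "keep the first n bytes of each
    instruction and pad shorter ones"; the paper may define it differently.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    features: List[List[int]] = []
    labels: List[int] = []
    for address, raw in instructions:
        window = list(raw[:n])                # truncate to at most n bytes
        window += [pad] * (n - len(window))   # pad short instructions
        features.append(window)
        labels.append(1 if address in function_starts else 0)
    return features, labels


# Toy x86-64 byte sequences, written by hand for illustration only.
toy = [
    (0x1000, bytes.fromhex("55")),      # push rbp (start of first function)
    (0x1001, bytes.fromhex("4889e5")),  # mov rbp, rsp
    (0x1004, bytes.fromhex("5d")),      # pop rbp
    (0x1005, bytes.fromhex("c3")),      # ret
    (0x1006, bytes.fromhex("55")),      # push rbp (start of second function)
]
X, y = n_byte_features(toy, function_starts={0x1000, 0x1006}, n=4)
```

A sequence of such vectors with per-step labels is the kind of input a sequence model such as an LSTM could be trained on. Note that padding with 0 is indistinguishable from a real 0x00 byte, so a practical encoding might reserve a separate padding token.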

Acknowledgement

This research was supported by an intramural academic research grant from Hansung University.
