http://dx.doi.org/10.13089/JKIISC.2021.31.2.233

Quantitative Measures for Code Obfuscation Coverage by the Natural Language Processing  

Kim, Byeong Yeon (Korea University School of Cybersecurity)
Kim, Huy Kang (Korea University School of Cybersecurity)
Abstract
Obfuscation has been widely applied to both malware and benign Android applications in recent years, because it hides an app's semantics from analysts by increasing the cost of reverse engineering and decompilation. Consequently, it is important for both attackers and security teams to measure quantitatively how much of an app is obfuscated before analysis. However, current research and solutions are surprisingly poor at detecting obfuscation. First, when only a small amount of obfuscation is found, they tend to judge the whole code as obfuscated. Second, they cannot detect obfuscation techniques that induce misunderstanding. Finally, they do not necessarily remain effective over time, as novel obfuscation techniques are proposed. In this work, we propose AndrObfusec, a natural language processing and heuristic-based system for detecting the obfuscation technique known as identifier renaming in Android applications. The system examines a different aspect of the problem: it measures not only readability but also understandability, providing a quantitative measure of code obfuscation coverage. In particular, AndrObfusec achieves high accuracy for identifier renaming detection.
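To make the idea of "obfuscation coverage" concrete, the following is a minimal sketch of one way such a ratio could be computed for identifier renaming. It is not the AndrObfusec method, which the abstract does not specify in detail. The vocabulary, the function names (`split_identifier`, `looks_renamed`, `obfuscation_coverage`), and the rule "an identifier is readable if at least one of its word tokens is in the vocabulary" are all assumptions chosen for illustration; a real system would replace the word-list check with a language model and additional heuristics.

```python
import re

# Hypothetical tiny vocabulary, for illustration only. A real system would
# use a large word list or a trained language model instead.
VOCABULARY = {"get", "set", "user", "name", "on", "create", "load", "data"}


def split_identifier(identifier):
    """Split a camelCase or snake_case identifier into lowercase tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return [part.lower() for part in parts]


def looks_renamed(identifier):
    """Heuristic (assumption): an identifier looks renamed when none of its
    alphabetic tokens is a known vocabulary word."""
    words = [tok for tok in split_identifier(identifier) if tok.isalpha()]
    return not any(word in VOCABULARY for word in words)


def obfuscation_coverage(identifiers):
    """Return the fraction of identifiers that look renamed (0.0 to 1.0)."""
    if not identifiers:
        return 0.0
    renamed = sum(1 for ident in identifiers if looks_renamed(ident))
    return renamed / len(identifiers)


if __name__ == "__main__":
    sample = ["getUserName", "onCreate", "a", "b", "aa", "loadData"]
    print(obfuscation_coverage(sample))  # 3 of 6 identifiers look renamed: 0.5
```

Reporting a ratio rather than a yes/no verdict addresses the first weakness named in the abstract: an app with a few renamed identifiers gets a low coverage score instead of being labeled obfuscated outright.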
Keywords
Code Obfuscation; Obfuscation Coverage; Natural Language Processing