[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.13089/JKIISC.2018.28.6.1463

A Method for Field Based Grey Box Fuzzing with Variational Autoencoder

Lee, Su-rim (Graduate School of Information Security, Korea University)
Moon, Jong-sub (Graduate School of Information Security, Korea University)

Publication Information

Journal of the Korea Institute of Information Security & Cryptology / v.28, no.6, 2018, pp. 1463-1474

Abstract

Fuzzing is a software testing technique that finds security flaws by feeding invalid or arbitrary input values to a program, and various methods have been proposed to make it more efficient. In this paper, observing that some input fields are highly relevant to coverage and to software crashes, we propose a new method that fuzzes those fields intensively during field-based fuzzing. We use a deep learning model, the Variational Autoencoder (VAE), to learn the statistical characteristics of input values that yielded high coverage, and show that the coverage of the regenerated files is uniformly higher than that of files produced by simple mutation. We also show that new crashes can be found by learning the statistical characteristics of files that caused crashes and applying dropout during regeneration. In our experiments, coverage was about 10% higher than that of the files in the queue of the AFL fuzzing tool, and in the Hwpviewer binary we found two new crashes starting from two crashes found in the initial fuzzing phase.
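
The regeneration step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the decoder here is an untrained toy linear layer with made-up weights, the field offset and size are hypothetical, and applying dropout to the latent units is an assumption about where dropout is placed. It only shows the general shape of the idea: sample a latent vector with the VAE reparameterization, decode it into bytes for one field (optionally with dropout), and splice the result into the seed file while leaving every other byte untouched.

```python
import math
import random


def sample_latent(mu, log_var, rng):
    """VAE reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def decode_field(z, weights, bias, rng, dropout=0.0):
    """Toy linear decoder with a sigmoid output scaled to byte values.

    Assumption: dropout is applied to the latent units (inverted-dropout
    scaling); each unit is zeroed with probability `dropout`.
    """
    if not 0.0 <= dropout < 1.0:
        raise ValueError("dropout must be in [0, 1)")
    keep = 1.0 - dropout
    z_kept = [zi / keep if rng.random() < keep else 0.0 for zi in z]
    out = bytearray()
    for row, b in zip(weights, bias):
        act = sum(w * zi for w, zi in zip(row, z_kept)) + b
        out.append(round(255 / (1.0 + math.exp(-act))))
    return bytes(out)


def splice_field(data, offset, new_field):
    """Overwrite only the target field; the rest of the file is unchanged."""
    return data[:offset] + new_field + data[offset + len(new_field):]


# Hypothetical usage: a 16-byte seed file whose 4-byte field starts at offset 4.
rng = random.Random(0)
seed = bytes(range(16))
mu, log_var = [0.0, 0.0], [0.0, 0.0]          # stand-ins for encoder outputs
weights = [[0.5, -0.5], [1.0, 1.0], [-1.0, 0.25], [0.1, 0.9]]  # made-up decoder
bias = [0.0, 0.0, 0.0, 0.0]

z = sample_latent(mu, log_var, rng)
field = decode_field(z, weights, bias, rng, dropout=0.5)
mutated = splice_field(seed, 4, field)
```

In a real setting the encoder outputs and decoder weights would come from a VAE trained on field values taken from high-coverage or crashing inputs, and each regenerated file would be passed to the target under a coverage-guided fuzzer such as AFL.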

Keywords

software testing; fuzzing; vulnerability; deep learning; VAE (Variational Autoencoder)

