http://dx.doi.org/10.3837/tiis.2019.07.010

ER-Fuzz: Conditional Code Removed Fuzzing

Song, Xiaobin (China National Digital Switching System Engineering and Technological Research Center)
Wu, Zehui (China National Digital Switching System Engineering and Technological Research Center)
Cao, Yan (China National Digital Switching System Engineering and Technological Research Center)
Wei, Qiang (China National Digital Switching System Engineering and Technological Research Center)

Publication Information

KSII Transactions on Internet and Information Systems (TIIS) / v.13, no.7, 2019, pp. 3511-3532

Abstract

Coverage-guided fuzzing is an efficient technique that is widely used in software testing. Coverage information guides the fuzzer: seeds that exercise new paths are retained so that coverage keeps increasing. However, we observed that most samples follow the same few high-frequency paths. Seeds that exercise a high-frequency path are kept for subsequent mutation until the user terminates the test, which directly affects the efficiency with which low-frequency paths are tested. In this paper, we propose a fuzzing solution, ER-Fuzz, that truncates the recording of a high-frequency path to influence coverage. It uses a deep learning-based classifier to locate the transfer points between high- and low-frequency paths; it then instruments the program at those transfer points to raise the probability of low-frequency transfer paths while eliminating subsequent variations of the high-frequency path seeds. We implemented a prototype of ER-Fuzz on top of the popular fuzzer AFL and evaluated it on several applications. The experimental results show that ER-Fuzz improves on the coverage of the original AFL to varying degrees. In the best case, ER-Fuzz found 115% more unique crashes than AFL. In total, seven new bugs were found and new CVEs were assigned.
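The truncation idea in the abstract can be illustrated with a toy coverage-guided loop. This is only a minimal sketch under stated assumptions, not the authors' implementation: ER-Fuzz locates transfer points with a deep learning classifier and instruments real programs through AFL, whereas here the truncation points are picked by hand and the program under test is a made-up function. All names (`toy_target`, `record_coverage`, `mutate`, `fuzz`, `truncate_at`) are hypothetical.

```python
import random


def toy_target(data: bytes) -> list:
    """Hypothetical program under test: returns the branch labels it executed."""
    trace = ["entry"]
    if not data or data[0] != ord("M"):
        # High-frequency path: almost every random input fails the magic check.
        trace.append("reject")
        trace.append("err_%d" % (data[0] % 4 if data else 0))
        trace.append("exit")
        return trace
    # Low-frequency path: only reached when the first byte is 'M'.
    trace.append("magic_ok")
    if len(data) > 1 and data[1] > 0x7F:
        trace.append("rare_branch")
    trace.append("exit")
    return trace


def record_coverage(trace, truncate_at=frozenset()):
    """Record executed edges, stopping as soon as a truncation point is reached.

    truncate_at stands in for the transfer points that the paper finds with a
    classifier; here it is chosen by hand (an assumption of this sketch).
    """
    edges = set()
    for prev, cur in zip(trace, trace[1:]):
        if cur in truncate_at:
            break
        edges.add((prev, cur))
    return frozenset(edges)


def mutate(data: bytes, rng) -> bytes:
    """Overwrite one randomly chosen byte with a random value."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(initial_seeds, iterations, truncate_at=frozenset(), rng_seed=0):
    """Keep a mutated input only if it contributes an edge not seen before."""
    rng = random.Random(rng_seed)
    queue = list(initial_seeds)
    seen = set()
    for seed in queue:
        seen |= record_coverage(toy_target(seed), truncate_at)
    for _ in range(iterations):
        child = mutate(rng.choice(queue), rng)
        cov = record_coverage(toy_target(child), truncate_at)
        if not cov <= seen:
            seen |= cov
            queue.append(child)
    return queue


if __name__ == "__main__":
    baseline = fuzz([b"AAAA"], 5000)
    truncated = fuzz([b"AAAA"], 5000, truncate_at=frozenset({"reject"}))
    print("seeds kept without truncation:", len(baseline))
    print("seeds kept with truncation:   ", len(truncated))
```

With `truncate_at={"reject"}`, an input that takes the rejecting path records no edges at all, so its variants are never added to the queue and mutation effort is spent only on seeds that pass the check. Without truncation, each distinct `err_*` variant of the rejecting path counts as new coverage and is retained.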

Keywords

Fuzzing; Deep learning; Instrumentation; Conditional code
