Browse > Article
http://dx.doi.org/10.13089/JKIISC.2018.28.3.635

Software Vulnerability Prediction System Using Machine Learning Algorithm  

Choi, Minjun (Sejong University)
Kim, Juhwan (Sejong University)
Yun, Joobeom (Sejong University)
Abstract
In the Era of the Fourth Industrial Revolution, we live in huge amounts of software. However, as software increases, software vulnerabilities are also increasing. Therefore, it is important to detect and remove software vulnerabilities. Currently, many researches have been studied to predict and detect software security problems, but it takes a long time to detect and does not have high prediction accuracy. Therefore, in this paper, we describe a method for efficiently predicting software vulnerabilities using machine learning algorithms. In addition, various machine learning algorithms are compared through experiments. Experimental results show that the k-nearest neighbors prediction model has the highest prediction rate.
Keywords
Machine Learning; Fuzzing; Prediction; Vulnerability; Confusion Matrix;
Citations & Related Records
Times Cited By KSCI : 2  (Citation Analysis)
연도 인용수 순위
1 Sunnyeo Park, Wonchan Oh, Hyeonkee Cho and Eunsun Cho, "Automated Vulnerability Analysis Tool for Binary Codes based on Symbolic Execution," Journal of Scientific Conference of Korean Institute of Information Scientists and Engineers, pp. 1917-1919, Jun. 2017.
2 Miller, Barton P., Louis Fredriksen, and Bryan So, "An empirical study of the reliability of unix utilities," Communications of the ACM, vol. 33, no. 12, pp. 32-44, Dec. 1990.   DOI
3 MinSik Shin, JungBeen Yu, and Taekyoung Kwon, "A Study of File Format-Aware Fuzzing against Smartphone Media Server Daemons," Journal of the Korea Institute of Information Security, 27(3), pp. 541-548, Jun. 2017.
4 Jegyeong Jo, and Jaecheol Ryou, "Method of Fuzzing Document Application Based on Android Devices," Journal of the Korea Institute of Information Security, 25(1), pp. 31-37, Feb. 2015.   DOI
5 Nguyen, Viet Hung, and Le Minh Sang Tran, "Predicting vulnerable software components with dependency graphs," Proceedings of the 6th International Workshop on Security Measurements and Metrics, pp. 1-8, Sep. 2010.
6 Yonghee Shin, and Laurie Williams, "An empirical model to predict security vulnerabilities using code complexity metrics," Proceedings of the Second ACM-IEEE international symposium on Empirical software engineering and measurement, pp. 315-317, Oct. 2008.
7 Yonghee Shin, and Laurie Williams, "Is complexity really the enemy of software security?," Proceedings of the 4th ACM workshop on Quality of protection, pp. 47-50, Oct. 2008.
8 Young Jun Lee, Sang-Hoon Choi, Chulwoo Kim and Ki-Woong Park, "Learning binary code with deep learning to detect software weakness," KSII The 9th International Conference on Internet 2017 Symposium, pp. 245-249, Dec. 2017.
9 V. Stinner, python-ptrace, http://python-ptrace.readthedocs.org/
10 Grieco, G., Grinblat, G.L., Uzal, L., Rawat, S., Feist, J. and Mounier, L., "Toward large-scale vulnerability discovery using Machine Learning," Proceedings of the Sixth ACM Conference on Data and Application Security and Privacy, pp. 85-96, Mar. 2016.
11 Hocevar, S. zzuf-multi-purposefuzzer, http://caca.zoy.org/wiki/zzuf/
12 Domingos, Pedro, "A few useful things to know about machine learning," Communications of the ACM, vol. 55, no. 10, pp. 78-87, Oct. 2012.   DOI
13 Witten, I.H., Frank, E., Hall, M.A. and Pal, C.J., Data Mining: Practical machine learning tools and techniques, Morgan Kaufmann, Oct. 2016.
14 Muller, Andreas C. and Sarah Guido, Introduction to machine learning with Python: a guide for data scientists, O'Reilly Media, Inc., Oct. 2016.
15 Pedregosa, Fabian and et al., "Scikitlearn: machine learning in python," Journal of Machine Learning Research, pp. 2825-2830, Oct. 2011.
16 Batista, Gustavo EAPA, Ronaldo C. Prati, and Maria Carolina Monard, "A study of the behavior of several methods for balancing machine learning training data," ACM Sigkdd Explorations Newsletter, vol. 6, no. 1 pp. 20-29, Jun. 2004.   DOI
17 Hsu, Chih-Wei, Chih-Chung Chang, and Chih-Jen Lin, "A practical guide to support vector classification," Department of Computer Science, National Taiwan University, Jul. 2003.
18 Sangsu Kim, and Dongsu Kang, "Software Vulnerability Analysis using File Fuzzing," Conference of The Korean Society Of Computer And Information, 25(2), pp. 29-32, Jul. 2017.
19 S. Neuhaus, T. Zimmermann, C. Holler, and A. Zeller, "Predicting vulnerable software components," Proceedings of the 14th ACM conference on Computer and communications security, pp. 529-540, Oct. 2007.
20 Catal, Cagatay, and Banu Diri, "A systematic review of software fault prediction studies," Expert systems with applications, vol. 36, no. 4, pp. 7346-7354, May. 2009.   DOI
21 King, James C., "Symbolic execution and program testing," Communications of the ACM, vol. 19, no. 7, pp. 385-394, Jul. 1976.   DOI