http://dx.doi.org/10.9708/jksci.2021.26.10.019

CNN-based Android Malware Detection Using Reduced Feature Set  

Kim, Dong-Min (Dept. of Computer Science and Engineering, Korea National Defense University)
Lee, Soo-jin (Dept. of Computer Science and Engineering, Korea National Defense University)
Abstract
The performance of deep learning-based malware detection and classification models depends largely on how the feature set used for training is constructed. In this paper, we propose an approach for selecting the optimal feature set to maximize detection performance in CNN-based Android malware detection. The features included in the feature set were selected with the Chi-Square test, which is widely used for feature selection in machine learning and deep learning. To validate the proposed approach, a CNN model was trained on 36 features selected from the CICANDMAL2017 dataset, and its malware detection performance was then measured. The model achieved an accuracy of 99.99% in binary classification and 98.55% in multiclass classification.
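The Chi-Square selection step described above can be illustrated with a minimal sketch. This is not the authors' code: the data are synthetic, the function names `chi2_scores` and `select_top_k` are hypothetical, and the example keeps 2 of 10 features rather than the 36 features the paper selects from CICANDMAL2017. It assumes non-negative feature values and scores each feature by comparing its per-class sum with the sum expected if the feature were independent of the class.

```python
import random


def chi2_scores(X, y):
    """Chi-square score of each non-negative feature against the class labels."""
    classes = sorted(set(y))
    n_samples = len(X)
    n_features = len(X[0])
    # Fraction of samples that belong to each class.
    class_prob = {c: sum(1 for label in y if label == c) / n_samples for c in classes}
    scores = []
    for j in range(n_features):
        total = sum(row[j] for row in X)
        score = 0.0
        for c in classes:
            # Observed: sum of feature j over the samples of class c.
            observed = sum(row[j] for row, label in zip(X, y) if label == c)
            # Expected under independence: class share of the overall sum.
            expected = class_prob[c] * total
            if expected > 0:
                score += (observed - expected) ** 2 / expected
        scores.append(score)
    return scores


def select_top_k(X, y, k):
    """Return the (sorted) indices of the k features with the highest scores."""
    scores = chi2_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Synthetic data (an assumption for illustration): 200 samples, 10 features.
# Features 0 and 1 depend on the label; the other eight are pure noise.
random.seed(0)
y = [i % 2 for i in range(200)]  # 0 = benign, 1 = malware
X = [
    [label * 5 + random.random() if j < 2 else random.random() for j in range(10)]
    for label in y
]

scores = chi2_scores(X, y)
selected = select_top_k(X, y, k=2)
print("selected feature indices:", selected)
```

In a pipeline like the one the abstract outlines, the columns picked this way would form the reduced feature set passed to the CNN; the value of `k` is a tuning choice.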
Keywords
CNN; Android; Malware; Feature selection; Binary classification; Multiclass classification;