[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.9708/jksci.2021.26.02.061

RFA: Recursive Feature Addition Algorithm for Machine Learning-Based Malware Classification

Byeon, Ji-Yun (Dept. of Cyber Security, Yeungnam University College)
Kim, Dae-Ho (Dept. of Cyber Security, Yeungnam University College)
Kim, Hee-Chul (Dept. of Cyber Security, Yeungnam University College)
Choi, Sang-Yong (Dept. of Cyber Security, Yeungnam University College)

Publication Information

Journal of the Korea Society of Computer and Information, v.26, no.2, 2021, pp. 61-68

Abstract

Recently, various techniques that use machine learning to classify malicious code have been studied. To make machine learning effective, it is most important to extract features that distinguish malicious code from normal binaries. In this paper, we propose a recursive feature extraction method for use in machine learning. The proposed method evaluates features individually and applies a recursive procedure to select the final features so as to maximize machine learning performance. In detail, at each stage we extract the best-performing features among the individual features and then combine the extracted features. We extract features with the proposed method and apply them to machine learning algorithms such as Decision Tree, SVM, Random Forest, and KNN to validate that machine learning performance improves as the stages proceed.
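The stage-wise procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, the leave-one-out 1-nearest-neighbour scorer (a stand-in for the Decision Tree, SVM, Random Forest, and KNN classifiers named above), and the rule of stopping when no candidate improves the score are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Rows = Sequence[Sequence[float]]


def loo_1nn_accuracy(X: Rows, y: Sequence[int], cols: List[int]) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on `cols`.

    Hypothetical stand-in scorer; any classifier/metric could be plugged in.
    """
    correct = 0
    for i in range(len(X)):
        best_j, best_d = -1, float("inf")
        for j in range(len(X)):
            if j == i:
                continue
            # Squared Euclidean distance restricted to the chosen columns.
            d = sum((X[i][c] - X[j][c]) ** 2 for c in cols)
            if d < best_d:
                best_j, best_d = j, d
        if y[best_j] == y[i]:
            correct += 1
    return correct / len(X)


def recursive_feature_addition(
    X: Rows,
    y: Sequence[int],
    score: Callable[[Rows, Sequence[int], List[int]], float] = loo_1nn_accuracy,
) -> Tuple[List[int], List[Tuple[int, float]]]:
    """Greedily add the feature that gives the best score at each stage."""
    selected: List[int] = []
    history: List[Tuple[int, float]] = []
    remaining = list(range(len(X[0])))
    best_score = float("-inf")
    while remaining:
        # Score every remaining feature combined with those already selected.
        scored = [(score(X, y, selected + [f]), f) for f in remaining]
        top_score, top_f = max(scored, key=lambda t: t[0])
        # Assumed stopping rule: stop once no candidate improves the score.
        if top_score <= best_score:
            break
        selected.append(top_f)
        remaining.remove(top_f)
        best_score = top_score
        history.append((top_f, top_score))
    return selected, history


# Toy data (hypothetical): column 0 is noise; columns 1 and 2 together
# separate the three classes, but neither does so alone.
X = [
    [5.0, 0.0, 0.0], [1.0, 1.0, 1.0],     # class 0
    [4.8, 10.0, 0.5], [1.2, 11.0, 1.5],   # class 1
    [5.1, 0.5, 10.0], [0.9, 1.5, 11.0],   # class 2
]
y = [0, 0, 1, 1, 2, 2]

selected, history = recursive_feature_addition(X, y)
for stage, (feature, acc) in enumerate(history, start=1):
    print(f"stage {stage}: added feature {feature}, accuracy {acc:.3f}")
```

On this toy data the sketch adds column 1 first and column 2 second, with the score rising from 1/3 to 1.0, and then stops because the noise column brings no further gain. Passing a different `score` function would let the same loop be driven by another classifier or metric.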

Keywords

machine learning; feature; recursive; malware; classification
