• Title/Summary/Keyword: Malware Family Classification


A Study on Classification of Variant Malware Family Based on ResNet-Variational AutoEncoder (ResNet-Variational AutoEncoder기반 변종 악성코드 패밀리 분류 연구)

  • Lee, Young-jeon;Han, Myung-Mook
    • Journal of Internet Computing and Services / v.22 no.2 / pp.1-9 / 2021
  • Traditionally, most malicious code has been analyzed using feature information extracted by domain experts. However, this feature-based analysis depends on the analyst's skill and has limited ability to detect variants that modify existing malicious code. In this study, we propose a ResNet-Variational AutoEncoder-based method that classifies variant malware into families without domain-expert intervention. A Variational AutoEncoder generates new data from a normal distribution and, while learning from the training data provided as input, captures the characteristics of that data well. We extract important features of malicious code by taking the latent variables learned by the Variational AutoEncoder. In addition, we apply transfer learning to learn the characteristics of the training data better and to make training more efficient: the parameters of a ResNet-152 model pre-trained on the ImageNet dataset are transferred to the Encoder Network. The resulting ResNet-Variational AutoEncoder outperformed the existing Variational AutoEncoder and trained more efficiently. For classifying variant malicious code, we used an ensemble model, a Stacking Classifier. Training the Stacking Classifier on the features of variant malware extracted by the Encoder Network of the ResNet-VAE model yielded an accuracy of 98.66% and an F1-Score of 98.68.
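
The abstract above describes features being taken from the latent variables of a Variational AutoEncoder. As a rough illustration only, the sketch below shows how a VAE encoder's outputs (a mean and a log-variance per latent dimension) are commonly turned into a latent sample via the reparameterization trick, along with the KL term that pulls the latent distribution toward a standard normal. The function names and the toy values are hypothetical and are not taken from the paper; in the paper's setting the mean and log-variance would come from the ResNet-152-based Encoder Network rather than being hard-coded.

```python
import math
import random
from typing import List, Sequence


def sample_latent(mu: Sequence[float], log_var: Sequence[float],
                  rng: random.Random) -> List[float]:
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu: Sequence[float],
                          log_var: Sequence[float]) -> float:
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


# Toy encoder outputs for a 2-dimensional latent space (illustrative values).
rng = random.Random(0)
mu = [0.5, -1.0]
log_var = [0.0, 0.0]

z = sample_latent(mu, log_var, rng)       # one stochastic latent sample
kl = kl_to_standard_normal(mu, log_var)   # regularization term of the loss
```

For downstream classification, a common choice is to use the mean vector `mu` itself as the feature vector, since it is deterministic; whether the paper does this is not stated in the abstract.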

Design and Implementation of a Pre-processing Method for Image-based Deep Learning of Malware (악성코드의 이미지 기반 딥러닝을 위한 전처리 방법 설계 및 개발)

  • Park, Jihyeon;Kim, Taeok;Shin, Yulim;Kim, Jiyeon;Choi, Eunjung
    • Journal of Korea Multimedia Society / v.23 no.5 / pp.650-657 / 2020
  • The rapid growth in internet users and faster network speeds are driving new ICT services. ICT has improved the way we think and live, but it has also created security problems such as malware and ransomware. Research is therefore needed to counter the growth of malware and the emergence of new malicious code, which requires detecting and classifying malware families accurately and quickly. In this paper, we analyze and categorize visualization techniques, a preprocessing step used in deep learning-based malware classification. The first method converts each byte into one pixel to produce a grayscale image. The second method converts every 2 bytes of the binary into a pair of coordinates. The third method uses LSH. We propose an improvement over visualizing the entire malicious code file: extract only the areas where important information is expected to exist, and then visualize those. Our experiments show that selecting and visualizing important information before classification produces better learning results than including all the information in the malicious code.
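
The first two visualization methods named in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the fixed image width, the zero-padding of the last row, and the handling of a trailing odd byte are all choices made here for the example. The abstract also does not say how the coordinate pairs are rendered into an image, so the second function stops at producing the pairs.

```python
from typing import List, Tuple


def bytes_to_grayscale(data: bytes, width: int = 16) -> List[List[int]]:
    """Method 1: each byte (0-255) becomes one grayscale pixel, row by row."""
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row += [0] * (width - len(row))  # zero-pad the final row (assumption)
        rows.append(row)
    return rows


def byte_pairs_to_coordinates(data: bytes) -> List[Tuple[int, int]]:
    """Method 2: read 2 bytes at a time; each pair becomes an (x, y) point."""
    usable = len(data) - (len(data) % 2)  # drop a trailing odd byte (assumption)
    return [(data[i], data[i + 1]) for i in range(0, usable, 2)]


# A few illustrative bytes standing in for the start of a binary file.
sample = bytes([0x4D, 0x5A, 0x90, 0x00, 0x03])

image = bytes_to_grayscale(sample, width=4)
# image == [[77, 90, 144, 0], [3, 0, 0, 0]]

points = byte_pairs_to_coordinates(sample)
# points == [(77, 90), (144, 0)]
```

To follow the paper's proposed improvement, the same functions could be applied to a selected slice of the file rather than to the whole file; which regions to select is not specified in the abstract.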

A Study on the Image-Based Malware Classification System that Combines Image Preprocessing and Ensemble Techniques for High Accuracy (높은 정확도를 위한 이미지 전처리와 앙상블 기법을 결합한 이미지 기반 악성코드 분류 시스템에 관한 연구)

  • Kim, Hae Soo;Kim, Mi Hui
    • KIPS Transactions on Computer and Communication Systems / v.11 no.7 / pp.225-232 / 2022
  • Recent developments in information and communication technology have benefited many people, but malicious attacks that exploit vulnerabilities in new programs are increasing at the same time. Among these attacks, malware operates in various ways and is distributed in new ways each time, so it must be analyzed quickly and defenses must be provided. If new malware can be classified into the same family as known malware, the two share similar behavioral characteristics, so defenses for the new malware can be built from malware that has already been analyzed. Malware therefore needs to be classified accurately and quickly, yet the amount of data may not be uniform across the families of analyzed malware, and a solution to this imbalance is needed. This paper proposes a system that combines image preprocessing and ensemble techniques to increase accuracy on imbalanced data.

Cyberattack Goal Classification Based on MITRE ATT&CK: CIA Labeling (MITRE ATT&CK 기반 사이버 공격 목표 분류 : CIA 라벨링)

  • Shin, Chan Ho;Choi, Chang-hee
    • Journal of Internet Computing and Services / v.23 no.6 / pp.15-26 / 2022
  • Various actors carry out cyberattacks using a variety of tactics and techniques. Cyberattacks for political and economic purposes are also carried out by state-sponsored groups. To deal with cyberattacks, researchers have classified malware families and attackers based on malware signatures. Unfortunately, attackers can easily masquerade as another group. Moreover, because attacks vary by actor, technique, and purpose, it is more effective for defenders to identify the attacker's purpose and goal in order to respond appropriately. The essential goal of a cyberattack is to threaten the information security of the target assets, and information security is achieved by preserving the confidentiality, integrity, and availability of those assets. In this paper, we relabel the attacker's goal based on MITRE ATT&CK® from the perspective of the CIA triad, and we classify cybersecurity reports to verify the labeling method. Experimental results show that the model classified the proposed CIA label with at most 80% probability.