• Title/Summary/Keyword: 악성코드 패밀리 분류

Search Result 19, Processing Time 0.02 seconds

Image-based malware classification system using image preprocessing and ensemble techniques (이미지 전처리와 앙상블 기법을 이용한 이미지 기반 악성코드 분류 시스템)

  • Kim, Hae-Soo;Kim, Mi-hui
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2021.11a
    • /
    • pp.715-718
    • /
    • 2021
  • 정보통신 기술이 발전함에 따라 악의적인 공격을 통해 보안문제를 발생시키고 있다. 또한 새로운 악성코드가 유포되어 기존의 시그니처 비교방식은 새롭게 발생하는 악성코드를 빠르게 분석 할 수 없다. 새로운 악성코드를 빠르게 분석하고 방어기법을 제안하기 위해 악성코드의 패밀리를 분류할 필요가 있다. 본 논문에서는 악성코드의 바이너리 파일을 이용해 시각화하고 CNN모델을 통해 분류한다. 또한 정확도를 높이기 위해 LBP, HOG를 통해 악성코드 이미지에서 중요한 특성을 찾고 데이터 클래스 불균형에서 오는 문제를 앙상블 모델을 통해 해결하는 시스템을 제안한다.

Android Malware Detection Using Permission-Based Machine Learning Approach (머신러닝을 이용한 권한 기반 안드로이드 악성코드 탐지)

  • Kang, Seongeun;Long, Nguyen Vu;Jung, Souhwan
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.28 no.3
    • /
    • pp.617-623
    • /
    • 2018
  • This study focuses on detection of malicious code through AndroidManifest permissoion feature extracted based on Android static analysis. Features are built on the permissions of AndroidManifest, which can save resources and time for analysis. Malicious app detection model consisted of SVM (support vector machine), NB (Naive Bayes), Gradient Boosting Classifier (GBC) and Logistic Regression model which learned 1,500 normal apps and 500 malicious apps and 98% detection rate. In addition, malicious app family identification is implemented by multi-classifiers model using algorithm SVM, GPC (Gaussian Process Classifier) and GBC (Gradient Boosting Classifier). The learned family identification machine learning model identified 92% of malicious app families.

A Study on Classification of Variant Malware Family Based on ResNet-Variational AutoEncoder (ResNet-Variational AutoEncoder기반 변종 악성코드 패밀리 분류 연구)

  • Lee, Young-jeon;Han, Myung-Mook
    • Journal of Internet Computing and Services
    • /
    • v.22 no.2
    • /
    • pp.1-9
    • /
    • 2021
  • Traditionally, most malicious codes have been analyzed using feature information extracted by domain experts. However, this feature-based analysis method depends on the analyst's capabilities and has limitations in detecting variant malicious codes that have modified existing malicious codes. In this study, we propose a ResNet-Variational AutoEncder-based variant malware classification method that can classify a family of variant malware without domain expert intervention. The Variational AutoEncoder network has the characteristics of creating new data within a normal distribution and understanding the characteristics of the data well in the learning process of training data provided as input values. In this study, important features of malicious code could be extracted by extracting latent variables in the learning process of Variational AutoEncoder. In addition, transfer learning was performed to better learn the characteristics of the training data and increase the efficiency of learning. The learning parameters of the ResNet-152 model pre-trained with the ImageNet Dataset were transferred to the learning parameters of the Encoder Network. The ResNet-Variational AutoEncoder that performed transfer learning showed higher performance than the existing Variational AutoEncoder and provided learning efficiency. Meanwhile, an ensemble model, Stacking Classifier, was used as a method for classifying variant malicious codes. As a result of learning the Stacking Classifier based on the characteristic data of the variant malware extracted by the Encoder Network of the ResNet-VAE model, an accuracy of 98.66% and an F1-Score of 98.68 were obtained.

A Study on the Image-Based Malware Classification System that Combines Image Preprocessing and Ensemble Techniques for High Accuracy (높은 정확도를 위한 이미지 전처리와 앙상블 기법을 결합한 이미지 기반 악성코드 분류 시스템에 관한 연구)

  • Kim, Hae Soo;Kim, Mi Hui
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.11 no.7
    • /
    • pp.225-232
    • /
    • 2022
  • Recent development in information and communication technology has been beneficial to many, but at the same time, malicious attack attempts are also increasing through vulnerabilities in new programs. Among malicious attacks, malware operate in various ways and is distributed to people in new ways every time, and to solve this malware, it is necessary to quickly analyze and provide defense techniques. If new malware can be classified into the same type of malware, malware has similar behavioral characteristics, so they can provide defense techniques for new malware using analyzed malware. Therefore, there is a need for a solution to this because the method of accurately and quickly classifying malware and the number of data may not be uniform for each family of analyzed malware. This paper proposes a system that combines image preprocessing and ensemble techniques to increase accuracy in imbalanced data.

Multi-Modal Based Malware Similarity Estimation Method (멀티모달 기반 악성코드 유사도 계산 기법)

  • Yoo, Jeong Do;Kim, Taekyu;Kim, In-sung;Kim, Huy Kang
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.29 no.2
    • /
    • pp.347-363
    • /
    • 2019
  • Malware has its own unique behavior characteristics, like DNA for living things. To respond APT (Advanced Persistent Threat) attacks in advance, it needs to extract behavioral characteristics from malware. To this end, it needs to do classification for each malware based on its behavioral similarity. In this paper, various similarity of Windows malware is estimated; and based on these similarity values, malware's family is predicted. The similarity measures used in this paper are as follows: 'TF-IDF cosine similarity', 'Nilsimsa similarity', 'malware function cosine similarity' and 'Jaccard similarity'. As a result, we find the prediction rate for each similarity measure is widely different. Although, there is no similarity measure which can be applied to malware classification with high accuracy, this result can be helpful to select a similarity measure to classify specific malware family.

A Study on Classification of CNN-based Linux Malware using Image Processing Techniques (영상처리기법을 이용한 CNN 기반 리눅스 악성코드 분류 연구)

  • Kim, Se-Jin;Kim, Do-Yeon;Lee, Hoo-Ki;Lee, Tae-Jin
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.21 no.9
    • /
    • pp.634-642
    • /
    • 2020
  • With the proliferation of Internet of Things (IoT) devices, using the Linux operating system in various architectures has increased. Also, security threats against Linux-based IoT devices are increasing, and malware variants based on existing malware are constantly appearing. In this paper, we propose a system where the binary data of a visualized Executable and Linkable Format (ELF) file is applied to Local Binary Pattern (LBP) image processing techniques and a median filter to classify malware in a Convolutional Neural Network (CNN). As a result, the original image showed the highest accuracy and F1-score at 98.77%, and reproducibility also showed the highest score at 98.55%. For the median filter, the highest precision was 99.19%, and the lowest false positive rate was 0.008%. Using the LBP technique confirmed that the overall result was lower than putting the original ELF file through the median filter. When the results of putting the original file through image processing techniques were classified by majority, it was confirmed that the accuracy, precision, F1-score, and false positive rate were better than putting the original file through the median filter. In the future, the proposed system will be used to classify malware families or add other image processing techniques to improve the accuracy of majority vote classification. Or maybe we mean "the use of Linux O/S distributions for various architectures has increased" instead? If not, please rephrase as intended.

API Grouping Based Flow Analysis and Frequency Analysis Technique for Android Malware Classification (안드로이드 악성코드 분류를 위한 Flow Analysis 기반의 API 그룹화 및 빈도 분석 기법)

  • Shim, Hyunseok;Park, Jungsoo;Doan, Thien-Phuc;Jung, Souhwan
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.29 no.6
    • /
    • pp.1235-1242
    • /
    • 2019
  • While several machine learning technique has been implemented for Android malware categorization, there is still difficulty in analyzing due to overfitting problem and including of un-executable code, etc. In this paper, we introduce our implemented tool to address these problems. Tool is consists of approximately 1,500 lines of Java code, and perform Flow analysis on set of APIs, or on control flow graph. Our tool groups all the API by its relationship and only perform analysis on actually executing code. Using our tool, we grouped 39032 APIs into 4972 groups, and 12123 groups with result of including class names. We collected 7,000 APKs from 7 families and evaluated our feature reduction technique, and we also reduced features again with selecting APIs that have frequency more than 20%. We finally reduced features to 263-numbers of feature for our collected APKs.

Study on DNN Based Android Malware Detection Method for Mobile Environmentt (모바일 환경에 적합한 DNN 기반의 악성 앱 탐지 방법에 관한 연구)

  • Yu, Jinhyun;Seo, In Hyuk;Kim, Seungjoo
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.6 no.3
    • /
    • pp.159-168
    • /
    • 2017
  • Smartphone malware has increased because Smartphone users has increased and smartphones are widely used in everyday life. Since 2012, Android has been the most mobile operating system. Owing to the open nature of Android, countless malware are in Android markets that seriously threaten Android security. Most of Android malware detection program does not detect malware to which bypass techniques apply and also does not detect unknown malware. In this paper, we propose lightweight method for detection of Android malware using static analysis and deep learning techniques. For experiments we crawl 7,000 apps from the Google Play Store and collect 6,120 malwares. The result show that proposed method can achieve 98.05% detection accuracy. Also, proposed method can detect about unknown malware families with good performance. On smartphones, the method requires 10 seconds for an analysis on average.

Cyberattack Goal Classification Based on MITRE ATT&CK: CIA Labeling (MITRE ATT&CK 기반 사이버 공격 목표 분류 : CIA 라벨링)

  • Shin, Chan Ho;Choi, Chang-hee
    • Journal of Internet Computing and Services
    • /
    • v.23 no.6
    • /
    • pp.15-26
    • /
    • 2022
  • Various subjects are carrying out cyberattacks using a variety of tactics and techniques. Additionally, cyberattacks for political and economic purposes are also being carried out by groups which is sponsored by its nation. To deal with cyberattacks, researchers used to classify the malware family and the subjects of the attack based on malware signature. Unfortunately, attackers can easily masquerade as other group. Also, as the attack varies with subject, techniques, and purpose, it is more effective for defenders to identify the attacker's purpose and goal to respond appropriately. The essential goal of cyberattacks is to threaten the information security of the target assets. Information security is achieved by preserving the confidentiality, integrity, and availability of the assets. In this paper, we relabel the attacker's goal based on MITRE ATT&CK® in the point of CIA triad as well as classifying cyber security reports to verify the labeling method. Experimental results show that the model classified the proposed CIA label with at most 80% probability.