• Title/Summary/Keyword: Malware classification

Search Result 103, Processing Time 0.025 seconds

A Study on Malware Identification System Using Static Analysis Based Machine Learning Technique (정적 분석 기반 기계학습 기법을 활용한 악성코드 식별 시스템 연구)

  • Kim, Su-jeong;Ha, Ji-hee;Oh, Soo-hyun;Lee, Tae-jin
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.29 no.4
    • /
    • pp.775-784
    • /
    • 2019
  • Malware infringement attacks are continuously increasing in various environments such as mobile, IOT, windows and mac due to the emergence of new and variant malware, and signature-based countermeasures have limitations in detection of malware. In addition, analytical performance is deteriorating due to obfuscation, packing, and anti-VM technique. In this paper, we propose a system that can detect malware based on machine learning by using similarity hashing-based pattern detection technique and static analysis after file classification according to packing. This enables more efficient detection because it utilizes both pattern-based detection, which is well-known malware detection, and machine learning-based detection technology, which is advantageous for detecting new and variant malware. The results of this study were obtained by detecting accuracy of 95.79% or more for benign sample files and malware sample files provided by the AI-based malware detection track of the Information Security R&D Data Challenge 2018 competition. In the future, it is expected that it will be possible to build a system that improves detection performance by applying a feature vector and a detection method to the characteristics of a packed file.

Automated Link Tracing for Classification of Malicious Websites in Malware Distribution Networks

  • Choi, Sang-Yong;Lim, Chang Gyoon;Kim, Yong-Min
    • Journal of Information Processing Systems
    • /
    • v.15 no.1
    • /
    • pp.100-115
    • /
    • 2019
  • Malicious code distribution on the Internet is one of the most critical Internet-based threats and distribution technology has evolved to bypass detection systems. As a new defense against the detection bypass technology of malicious attackers, this study proposes the automated tracing of malicious websites in a malware distribution network (MDN). The proposed technology extracts automated links and classifies websites into malicious and normal websites based on link structure. Even if attackers use a new distribution technology, website classification is possible as long as the connections are established through automated links. The use of a real web-browser and proxy server enables an adequate response to attackers' perception of analysis environments and evasion technology and prevents analysis environments from being infected by malicious code. The validity and accuracy of the proposed method for classification are verified using 20,000 links, 10,000 each from normal and malicious websites.

Novel Optimizer AdamW+ implementation in LSTM Model for DGA Detection

  • Awais Javed;Adnan Rashdi;Imran Rashid;Faisal Amir
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.11
    • /
    • pp.133-141
    • /
    • 2023
  • This work take deeper analysis of Adaptive Moment Estimation (Adam) and Adam with Weight Decay (AdamW) implementation in real world text classification problem (DGA Malware Detection). AdamW is introduced by decoupling weight decay from L2 regularization and implemented as improved optimizer. This work introduces a novel implementation of AdamW variant as AdamW+ by further simplifying weight decay implementation in AdamW. DGA malware detection LSTM models results for Adam, AdamW and AdamW+ are evaluated on various DGA families/ groups as multiclass text classification. Proposed AdamW+ optimizer results has shown improvement in all standard performance metrics over Adam and AdamW. Analysis of outcome has shown that novel optimizer has outperformed both Adam and AdamW text classification based problems.

Andro-profiler: Anti-malware system based on behavior profiling of mobile malware (행위기반의 프로파일링 기법을 활용한 모바일 악성코드 분류 기법)

  • Yun, Jae-Sung;Jang, Jae-Wook;Kim, Huy Kang
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.24 no.1
    • /
    • pp.145-154
    • /
    • 2014
  • In this paper, we propose a novel anti-malware system based on behavior profiling, called Andro-profiler. Andro-profiler consists of mobile devices and a remote server, and is implemented in Droidbox. Our aim is to detect and classify malware using an automatic classifier based on behavior profiling. First, we propose the representative behavior profiling for each malware family represented by system calls coupled with Droidbox system logs. This is done by executing the malicious application on an emulator and extracting integrated system logs. By comparing the behavior profiling of malicious applications with representative behavior profiling for each malware family, we can detect and classify them into malware families. Andro-profiler shows over 99% of classification accuracy in classifying malware families.

Framework Design for Malware Dataset Extraction Using Code Patches in a Hybrid Analysis Environment (코드패치 및 하이브리드 분석 환경을 활용한 악성코드 데이터셋 추출 프레임워크 설계)

  • Ki-Sang Choi;Sang-Hoon Choi;Ki-Woong Park
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.34 no.3
    • /
    • pp.403-416
    • /
    • 2024
  • Malware is being commercialized and sold on the black market, primarily driven by financial incentives. With the increasing demand driven by these sales, the scope of attacks via malware has expanded. In response, there has been a surge in research efforts leveraging artificial intelligence for detection and classification. However, adversaries are integrating various anti-analysis techniques into their malware to thwart analytical efforts. In this study, we introduce the "Malware Analysis with Dynamic Extraction (MADE)" framework, a hybrid binary analysis tool devised to procure datasets from advanced malware incorporating Anti-Analysis techniques. The MADE framework has the proficiency to autonomously execute dynamic analysis on binaries, encompassing those laden with Anti-VM and Anti-Debugging defenses. Experimental results substantiate that the MADE framework can effectively circumvent over 90% of diverse malware implementations using Anti-Analysis techniques and can adeptly extract relevant datasets.

Probabilistic K-nearest neighbor classifier for detection of malware in android mobile (안드로이드 모바일 악성 앱 탐지를 위한 확률적 K-인접 이웃 분류기)

  • Kang, Seungjun;Yoon, Ji Won
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.25 no.4
    • /
    • pp.817-827
    • /
    • 2015
  • In this modern society, people are having a close relationship with smartphone. This makes easier for hackers to gain the user's information by installing the malware in the user's smartphone without the user's authority. This kind of action are threats to the user's privacy. The malware characteristics are different to the general applications. It requires the user's authority. In this paper, we proposed a new classification method of user requirements method by each application using the Principle Component Analysis(PCA) and Probabilistic K-Nearest Neighbor(PKNN) methods. The combination of those method outputs the improved result to classify between malware and general applications. By using the K-fold Cross Validation, the measurement precision of PKNN is improved compare to the previous K-Nearest Neighbor(KNN). The classification which difficult to solve by KNN also can be solve by PKNN with optimizing the discovering the parameter k and ${\beta}$. Also the sample that has being use in this experiment is based on the Contagio.

A Study on Classification of Variant Malware Family Based on ResNet-Variational AutoEncoder (ResNet-Variational AutoEncoder기반 변종 악성코드 패밀리 분류 연구)

  • Lee, Young-jeon;Han, Myung-Mook
    • Journal of Internet Computing and Services
    • /
    • v.22 no.2
    • /
    • pp.1-9
    • /
    • 2021
  • Traditionally, most malicious codes have been analyzed using feature information extracted by domain experts. However, this feature-based analysis method depends on the analyst's capabilities and has limitations in detecting variant malicious codes that have modified existing malicious codes. In this study, we propose a ResNet-Variational AutoEncder-based variant malware classification method that can classify a family of variant malware without domain expert intervention. The Variational AutoEncoder network has the characteristics of creating new data within a normal distribution and understanding the characteristics of the data well in the learning process of training data provided as input values. In this study, important features of malicious code could be extracted by extracting latent variables in the learning process of Variational AutoEncoder. In addition, transfer learning was performed to better learn the characteristics of the training data and increase the efficiency of learning. The learning parameters of the ResNet-152 model pre-trained with the ImageNet Dataset were transferred to the learning parameters of the Encoder Network. The ResNet-Variational AutoEncoder that performed transfer learning showed higher performance than the existing Variational AutoEncoder and provided learning efficiency. Meanwhile, an ensemble model, Stacking Classifier, was used as a method for classifying variant malicious codes. As a result of learning the Stacking Classifier based on the characteristic data of the variant malware extracted by the Encoder Network of the ResNet-VAE model, an accuracy of 98.66% and an F1-Score of 98.68 were obtained.

Visualization of Malwares for Classification Through Deep Learning (딥러닝 기술을 활용한 멀웨어 분류를 위한 이미지화 기법)

  • Kim, Hyeonggyeom;Han, Seokmin;Lee, Suchul;Lee, Jun-Rak
    • Journal of Internet Computing and Services
    • /
    • v.19 no.5
    • /
    • pp.67-75
    • /
    • 2018
  • According to Symantec's Internet Security Threat Report(2018), Internet security threats such as Cryptojackings, Ransomwares, and Mobile malwares are rapidly increasing and diversifying. It means that detection of malwares requires not only the detection accuracy but also versatility. In the past, malware detection technology focused on qualitative performance due to the problems such as encryption and obfuscation. However, nowadays, considering the diversity of malware, versatility is required in detecting various malwares. Additionally the optimization is required in terms of computing power for detecting malware. In this paper, we present Stream Order(SO)-CNN and Incremental Coordinate(IC)-CNN, which are malware detection schemes using CNN(Convolutional Neural Network) that effectively detect intelligent and diversified malwares. The proposed methods visualize each malware binary file onto a fixed sized image. The visualized malware binaries are learned through GoogLeNet to form a deep learning model. Our model detects and classifies malwares. The proposed method reveals better performance than the conventional method.

Research on text mining based malware analysis technology using string information (문자열 정보를 활용한 텍스트 마이닝 기반 악성코드 분석 기술 연구)

  • Ha, Ji-hee;Lee, Tae-jin
    • Journal of Internet Computing and Services
    • /
    • v.21 no.1
    • /
    • pp.45-55
    • /
    • 2020
  • Due to the development of information and communication technology, the number of new / variant malicious codes is increasing rapidly every year, and various types of malicious codes are spreading due to the development of Internet of things and cloud computing technology. In this paper, we propose a malware analysis method based on string information that can be used regardless of operating system environment and represents library call information related to malicious behavior. Attackers can easily create malware using existing code or by using automated authoring tools, and the generated malware operates in a similar way to existing malware. Since most of the strings that can be extracted from malicious code are composed of information closely related to malicious behavior, it is processed by weighting data features using text mining based method to extract them as effective features for malware analysis. Based on the processed data, a model is constructed using various machine learning algorithms to perform experiments on detection of malicious status and classification of malicious groups. Data has been compared and verified against all files used on Windows and Linux operating systems. The accuracy of malicious detection is about 93.5%, the accuracy of group classification is about 90%. The proposed technique has a wide range of applications because it is relatively simple, fast, and operating system independent as a single model because it is not necessary to build a model for each group when classifying malicious groups. In addition, since the string information is extracted through static analysis, it can be processed faster than the analysis method that directly executes the code.

The attacker group feature extraction framework : Authorship Clustering based on Genetic Algorithm for Malware Authorship Group Identification (공격자 그룹 특징 추출 프레임워크 : 악성코드 저자 그룹 식별을 위한 유전 알고리즘 기반 저자 클러스터링)

  • Shin, Gun-Yoon;Kim, Dong-Wook;Han, Myung-Mook
    • Journal of Internet Computing and Services
    • /
    • v.21 no.2
    • /
    • pp.1-8
    • /
    • 2020
  • Recently, the number of APT(Advanced Persistent Threats) attack using malware has been increasing, and research is underway to prevent and detect them. While it is important to detect and block attacks before they occur, it is also important to make an effective response through an accurate analysis for attack case and attack type, these respond which can be determined by analyzing the attack group of such attacks. Therefore, this paper propose a framework based on genetic algorithm for analyzing malware and understanding attacker group's features. The framework uses decompiler and disassembler to extract related code in collected malware, and analyzes information related to author through code analysis. Malware has unique characteristics that only it has, which can be said to be features that can identify the author or attacker groups of that malware. So, we select specific features only having attack group among the various features extracted from binary and source code through the authorship clustering method, and apply genetic algorithm to accurate clustering to infer specific features. Also, we find features which based on characteristics each group of malware authors has that can express each group, and create profiles to verify that the group of authors is correctly clustered. In this paper, we do experiment about author classification using genetic algorithm and finding specific features to express author characteristic. In experiment result, we identified an author classification accuracy of 86% and selected features to be used for authorship analysis among the information extracted through genetic algorithm.