• Title/Summary/Keyword: Malware

Search Result 543, Processing Time 0.021 seconds

Deep Learning in Drebin: Android malware Image Texture Median Filter Analysis and Detection

  • Luo, Shi-qi;Ni, Bo;Jiang, Ping;Tian, Sheng-wei;Yu, Long;Wang, Rui-jin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.7
    • /
    • pp.3654-3670
    • /
    • 2019
  • This paper proposes an Image Texture Median Filter (ITMF) to analyze and detect Android malware on Drebin datasets. We design a model of "ITMF" combined with Image Processing of Median Filter (MF) to reflect the similarity of the malware binary file block. At the same time, using the MAEVS (Malware Activity Embedding in Vector Space) to reflect the potential dynamic activity of malware. In order to ensure the improvement of the classification accuracy, the above-mentioned features(ITMF feature and MAEVS feature)are studied to train Restricted Boltzmann Machine (RBM) and Back Propagation (BP). The experimental results show that the model has an average accuracy rate of 95.43% with few false alarms. to Android malicious code, which is significantly higher than 95.2% of without ITMF, 93.8% of shallow machine learning model SVM, 94.8% of KNN, 94.6% of ANN.

Malware Classification using Dynamic Analysis with Deep Learning

  • Asad Amin;Muhammad Nauman Durrani;Nadeem Kafi;Fahad Samad;Abdul Aziz
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.8
    • /
    • pp.49-62
    • /
    • 2023
  • There has been a rapid increase in the creation and alteration of new malware samples which is a huge financial risk for many organizations. There is a huge demand for improvement in classification and detection mechanisms available today, as some of the old strategies like classification using mac learning algorithms were proved to be useful but cannot perform well in the scalable auto feature extraction scenario. To overcome this there must be a mechanism to automatically analyze malware based on the automatic feature extraction process. For this purpose, the dynamic analysis of real malware executable files has been done to extract useful features like API call sequence and opcode sequence. The use of different hashing techniques has been analyzed to further generate images and convert them into image representable form which will allow us to use more advanced classification approaches to classify huge amounts of images using deep learning approaches. The use of deep learning algorithms like convolutional neural networks enables the classification of malware by converting it into images. These images when fed into the CNN after being converted into the grayscale image will perform comparatively well in case of dynamic changes in malware code as image samples will be changed by few pixels when classified based on a greyscale image. In this work, we used VGG-16 architecture of CNN for experimentation.

MalDC: Malicious Software Detection and Classification using Machine Learning

  • Moon, Jaewoong;Kim, Subin;Park, Jangyong;Lee, Jieun;Kim, Kyungshin;Song, Jaeseung
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.5
    • /
    • pp.1466-1488
    • /
    • 2022
  • Recently, the importance and necessity of artificial intelligence (AI), especially machine learning, has been emphasized. In fact, studies are actively underway to solve complex and challenging problems through the use of AI systems, such as intelligent CCTVs, intelligent AI security systems, and AI surgical robots. Information security that involves analysis and response to security vulnerabilities of software is no exception to this and is recognized as one of the fields wherein significant results are expected when AI is applied. This is because the frequency of malware incidents is gradually increasing, and the available security technologies are limited with regard to the use of software security experts or source code analysis tools. We conducted a study on MalDC, a technique that converts malware into images using machine learning, MalDC showed good performance and was able to analyze and classify different types of malware. MalDC applies a preprocessing step to minimize the noise generated in the image conversion process and employs an image augmentation technique to reinforce the insufficient dataset, thus improving the accuracy of the malware classification. To verify the feasibility of our method, we tested the malware classification technique used by MalDC on a dataset provided by Microsoft and malware data collected by the Korea Internet & Security Agency (KISA). Consequently, an accuracy of 97% was achieved.

A Method to Find the Core Node Engaged in Malware Propagation in the Malware Distribution Network Hidden in the Web (웹에 숨겨진 악성코드 배포 네트워크에서 악성코드 전파 핵심노드를 찾는 방안)

  • Kim Sung Jin
    • Convergence Security Journal
    • /
    • v.23 no.2
    • /
    • pp.3-10
    • /
    • 2023
  • In the malware distribution network existing on the web, there is a central node that plays a key role in distributing malware. If you find and block this node, you can effectively block the propagation of malware. In this study, a centrality search method applied with risk analysis in a complex network is proposed, and a method for finding a core node in a malware distribution network is introduced through this approach. In addition, there is a big difference between a benign network and a malicious network in terms of in-degree and out-degree, and also in terms of network layout. Through these characteristics, we can discriminate between malicious and benign networks.

De-cloaking Malicious Activities in Smartphones Using HTTP Flow Mining

  • Su, Xin;Liu, Xuchong;Lin, Jiuchuang;He, Shiming;Fu, Zhangjie;Li, Wenjia
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.11 no.6
    • /
    • pp.3230-3253
    • /
    • 2017
  • Android malware steals users' private information, and embedded unsafe advertisement (ad) libraries, which execute unsafe code causing damage to users. The majority of such traffic is HTTP and is mixed with other normal traffic, which makes the detection of malware and unsafe ad libraries a challenging problem. To address this problem, this work describes a novel HTTP traffic flow mining approach to detect and categorize Android malware and unsafe ad library. This work designed AndroCollector, which can automatically execute the Android application (app) and collect the network traffic traces. From these traces, this work extracts HTTP traffic features along three important dimensions: quantitative, timing, and semantic and use these features for characterizing malware and unsafe ad libraries. Based on these HTTP traffic features, this work describes a supervised classification scheme for detecting malware and unsafe ad libraries. In addition, to help network operators, this work describes a fine-grained categorization method by generating fingerprints from HTTP request methods for each malware family and unsafe ad libraries. This work evaluated the scheme using HTTP traffic traces collected from 10778 Android apps. The experimental results show that the scheme can detect malware with 97% accuracy and unsafe ad libraries with 95% accuracy when tested on the popular third-party Android markets.

Dynamic RNN-CNN malware classifier correspond with Random Dimension Input Data (임의 차원 데이터 대응 Dynamic RNN-CNN 멀웨어 분류기)

  • Lim, Geun-Young;Cho, Young-Bok
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.23 no.5
    • /
    • pp.533-539
    • /
    • 2019
  • This study proposes a malware classification model that can handle arbitrary length input data using the Microsoft Malware Classification Challenge dataset. We are based on imaging existing data from malware. The proposed model generates a lot of images when malware data is large, and generates a small image of small data. The generated image is learned as time series data by Dynamic RNN. The output value of the RNN is classified into malware by using only the highest weighted output by applying the Attention technique, and learning the RNN output value by Residual CNN again. Experiments on the proposed model showed a Micro-average F1 score of 92% in the validation data set. Experimental results show that the performance of a model capable of learning and classifying arbitrary length data can be verified without special feature extraction and dimension reduction.

A Study on the Detection of Malware That Extracts Account IDs and Passwords on Game Sites and Possible Countermeasures Through Analysis (게임 사이트의 계정과 비밀번호 유출 악성코드 분석을 통한 탐지 및 대응방안 연구)

  • Lee, Seung-Won;Roh, Young-Sup;Kim, Woo-Suk;Lee, Mi-Hwa;Han, Kook-Il
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.22 no.2
    • /
    • pp.283-293
    • /
    • 2012
  • A new type of malware that extracts personal and account data over an extended period of time and that apparently is resistant to detection by vaccines has been identified. Generally, a malware is installed on a computer through network-to-network connections by utilizing Web vulnerabilities that contain injection, XSS, broken authentication and session management, or insecure direct-object references, among others. After the malware executes registration of an arbitrary service and an arbitrary process on a computer, it then periodically communicates the collected confidential information to a hacker. This paper is a systematic approach to analyzing a new type of malware called "winweng," a kind of worm that frequently made appearances during the first half of 2011. The research describes how the malware came to be in circulation, how it infects computers, how its operations expose its existence and suggests improvements in responses and countermeasures. Keywords: Malware, Worm, Winweng, SNORT.

Method of Similarity Hash-Based Malware Family Classification (유사성 해시 기반 악성코드 유형 분류 기법)

  • Kim, Yun-jeong;Kim, Moon-sun;Lee, Man-hee
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.32 no.5
    • /
    • pp.945-954
    • /
    • 2022
  • Billions of malicious codes are detected every year, of which only 0.01% are new types of malware. In this situation, an effective malware type classification tool is needed, but previous studies have limitations in quickly analyzing a large amount of malicious code because it requires a complex and massive amount of data pre-processing. To solve this problem, this paper proposes a method to classify the types of malicious code based on the similarity hash without complex data preprocessing. This approach trains the XGBoost model based on the similarity hash information of the malware. To evaluate this approach, we used the BIG-15 dataset, which is widely used in the field of malware classification. As a result, the malicious code was classified with an accuracy of 98.9% also, identified 3,432 benign files with 100% accuracy. This result is superior to most recent studies using complex preprocessing and deep learning models. Therefore, it is expected that more efficient malware classification is possible using the proposed approach.

Analysis of Malware Group Classification with eXplainable Artificial Intelligence (XAI기반 악성코드 그룹분류 결과 해석 연구)

  • Kim, Do-yeon;Jeong, Ah-yeon;Lee, Tae-jin
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.31 no.4
    • /
    • pp.559-571
    • /
    • 2021
  • Along with the increase prevalence of computers, the number of malware distributions by attackers to ordinary users has also increased. Research to detect malware continues to this day, and in recent years, research on malware detection and analysis using AI is focused. However, the AI algorithm has a disadvantage that it cannot explain why it detects and classifies malware. XAI techniques have emerged to overcome these limitations of AI and make it practical. With XAI, it is possible to provide a basis for judgment on the final outcome of the AI. In this paper, we conducted malware group classification using XGBoost and Random Forest, and interpreted the results through SHAP. Both classification models showed a high classification accuracy of about 99%, and when comparing the top 20 API features derived through XAI with the main APIs of malware, it was possible to interpret and understand more than a certain level. In the future, based on this, a direct AI reliability improvement study will be conducted.

A Study on Machine Learning Based Anti-Analysis Technique Detection Using N-gram Opcode (N-gram Opcode를 활용한 머신러닝 기반의 분석 방지 보호 기법 탐지 방안 연구)

  • Kim, Hee Yeon;Lee, Dong Hoon
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.32 no.2
    • /
    • pp.181-192
    • /
    • 2022
  • The emergence of new malware is incapacitating existing signature-based malware detection techniques., and applying various anti-analysis techniques makes it difficult to analyze. Recent studies related to signature-based malware detection have limitations in that malware creators can easily bypass them. Therefore, in this study, we try to build a machine learning model that can detect and classify the anti-analysis techniques of packers applied to malware, not using the characteristics of the malware itself. In this study, the n-gram opcodes are extracted from the malicious binary to which various anti-analysis techniques of the commercial packers are applied, and the features are extracted by using TF-IDF, and through this, each anti-analysis technique is detected and classified. In this study, real-world malware samples packed using The mida and VMProtect with multiple anti-analysis techniques were trained and tested with 6 machine learning models, and it constructed the optimal model showing 81.25% accuracy for The mida and 95.65% accuracy for VMProtect.