• Title/Summary/Keyword: 악성코드 이미지화 (malware image conversion)

CNN-Based Malware Detection Using Opcode Frequency-Based Image (Opcode 빈도수 기반 악성코드 이미지를 활용한 CNN 기반 악성코드 탐지 기법)

  • Ko, Seok Min;Yang, JaeHyeok;Choi, WonJun;Kim, TaeGuen
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.5 / pp.933-943 / 2022
  • As the Internet develops and computer use increases, the threats posed by malware keep growing. This creates demand for a system that automatically analyzes large volumes of malware. This paper introduces an automatic malware analysis technique based on a deep learning algorithm. The proposed method uses a CNN (Convolutional Neural Network) to analyze malicious features represented as images. To reflect the semantic information of malware in detection, the method generates images from the opcode frequency data of a binary rather than from its raw bytes. In experiments on a dataset of 20,000 samples, the proposed method detected malicious code with 91% accuracy.
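
The abstract does not spell out how opcode frequencies become an image, so the following is only a minimal sketch of the general idea. The opcode vocabulary `OPCODES`, the peak-based normalization, and the 4x4 grid are all illustrative assumptions, not the authors' design.

```python
from collections import Counter

# Hypothetical opcode vocabulary (assumption); a real pipeline would
# obtain mnemonics from a disassembler and likely use a larger set.
OPCODES = ["mov", "push", "pop", "call", "jmp", "add", "sub", "cmp",
           "xor", "lea", "ret", "test", "and", "or", "nop", "jz"]


def opcode_frequency_image(opcodes, vocab=OPCODES, side=4):
    """Map opcode counts to a side x side grid of 0-255 intensities."""
    if side * side < len(vocab):
        raise ValueError("grid too small for vocabulary")
    # Count only opcodes that belong to the vocabulary.
    counts = Counter(op for op in opcodes if op in vocab)
    peak = max(counts.values(), default=0)
    pixels = [0] * (side * side)
    for i, op in enumerate(vocab):
        if peak:
            # Scale so the most frequent opcode maps to intensity 255.
            pixels[i] = round(255 * counts[op] / peak)
    # Cut the flat pixel list into rows.
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


sample = ["mov", "mov", "push", "call", "mov", "ret", "push"]
image = opcode_frequency_image(sample)
```

Each vocabulary entry owns one fixed pixel position, so two binaries with similar opcode distributions produce similar images regardless of where the instructions sit in the file.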

Algorithm for Detecting Malicious Code in Mobile Environment Using Deep Learning (딥러닝을 이용한 모바일 환경에서 변종 악성코드 탐지 알고리즘)

  • Woo, Sung-hee;Cho, Young-bok
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.10a / pp.306-308 / 2018
  • This paper proposes an algorithm for detecting variant malicious code in a mobile environment using deep learning. To address the problems of Android-based malicious code detection, we demonstrate a high detection rate using a signature-based malicious code detection method and a real-time malicious file detection algorithm based on machine learning.

Malware detection methodology through on pre-training and transfer learning for AutoEncoder based deobfuscation (AutoEncoder 기반 역난독화 사전학습 및 전이학습을 통한 악성코드 탐지 방법론)

  • Jang, Jae-Seok;Ku, Bon-Jae;Eom, Sung-Jun;Han, Ji-Hyeong
    • Annual Conference of KIPS / 2022.11a / pp.905-907 / 2022
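  • Static analysis, an established malware analysis technique, detects malware quickly and efficiently but is vulnerable to obfuscated files, whereas dynamic analysis handles obfuscated files well but is slow and costly. To address the drawbacks of both techniques, this study proposes an obfuscation-robust static analysis model based on deep learning. The proposed method converts original code and obfuscated files into grayscale images to build a dataset, pre-trains an AutoEncoder so that the encoder can extract the features of the original file from both original and obfuscated files, and then feeds the encoder output into a fully connected layer trained via transfer learning to detect malware. The proposed methodology improved malware detection on obfuscated files by 14.17 percentage points in F1 score, and improved detection on the combined dataset of obfuscated and original files by 7.22 percentage points in F1 score.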
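
The grayscale conversion step can be illustrated with a short sketch in which each byte of a file becomes one pixel. The row width, the zero padding, and the PGM output format are assumptions chosen for illustration; the abstract does not state the parameters the authors used.

```python
def bytes_to_grayscale(data: bytes, width: int = 16, pad: int = 0):
    """Lay out raw bytes row by row; each byte becomes one 0-255 pixel."""
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row += [pad] * (width - len(row))  # pad the final short row
        rows.append(row)
    return rows


def to_pgm(rows) -> bytes:
    """Serialize the grid as a binary PGM (P5) grayscale image."""
    height = len(rows)
    width = len(rows[0]) if rows else 0
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(v for row in rows for v in row)


blob = bytes(range(10))          # stand-in for the contents of a file
grid = bytes_to_grayscale(blob, width=4)
pgm = to_pgm(grid)
```

In practice the resulting images are usually resized to a fixed shape before being fed to a network, since file sizes vary widely.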

Technique for Malicious Code Detection using Stacked Convolution AutoEncoder (적층 콘볼루션 오토엔코더를 활용한 악성코드 탐지 기법)

  • Choi, Hyun-Woong;Heo, Junyoung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.2 / pp.39-44 / 2020
  • Malicious code damages devices while evading detection programs (antivirus software). New malware is difficult to detect with existing antivirus software because that software relies on "signature-based" detection techniques. These techniques effectively detect known malicious code but struggle with new malicious code. Most antivirus products therefore recognize this drawback and additionally use "heuristic" techniques. This paper proposes a technique for detecting unknown malicious code using deep learning. Detecting malware with a supervised learning approach has a clear limitation, because countless files can run on a device. This paper therefore uses a Stacked Convolution AutoEncoder (SCAE), a semi-supervised learning method. Specifically, byte information is extracted from each file and converted into an image, and the model is trained on these images. Finally, the model achieved an accuracy of 98.84% when classifying malicious and non-malicious code it had not seen during training.

Using Image Visualization Based Malware Detection Techniques for Customer Churn Prediction in Online Games (악성코드의 이미지 시각화 탐지 기법을 적용한 온라인 게임상에서의 이탈 유저 탐지 모델)

  • Yim, Ha-bin;Kim, Huy-kang;Kim, Seung-joo
    • Journal of the Korea Institute of Information Security & Cryptology / v.27 no.6 / pp.1431-1439 / 2017
  • In the security field, log analysis is important for detecting malware or abnormal behavior. Recently, image visualization techniques for malware detection have become a major part of security. These techniques can also be used in online games. Users may leave a game after a bad experience caused by game bots, automatic hunting programs, malicious code, and the like. This churn can damage an online game's profit and service longevity if game operators cannot detect such events in time. In this paper, we first propose a new churn prediction technique based on PNG image conversion to improve the efficiency of data analysis. With this log compression technique, we can reduce the size of log files by a factor of 52,849 and increase analysis speed without feature analysis. Second, we apply a data mining technique to predict user churn with a real dataset from Blade & Soul, developed by NCSoft. As a result, we can identify potential churners with a high accuracy of 97%.

Image-Based Machine Learning Model for Malware Detection on LLVM IR (LLVM IR 대상 악성코드 탐지를 위한 이미지 기반 머신러닝 모델)

  • Kyung-bin Park;Yo-seob Yoon;Baasantogtokh Duulga;Kang-bin Yim
    • Journal of the Korea Institute of Information Security & Cryptology / v.34 no.1 / pp.31-40 / 2024
  • Recently, static analysis-based signature and pattern detection technologies have shown limitations as IT technologies advance. These include compatibility problems across multiple architectures as well as problems inherent to signature and pattern detection. Malicious code uses obfuscation and packing techniques to hide its identity, and also evades existing static analysis-based signature and pattern detection through techniques such as code rearrangement, register modification, and the addition of branching statements. In this paper, we propose an automated static analysis technique for malicious code that uses LLVM IR images and machine learning to solve the problems mentioned above. Whether or not a binary is obfuscated or packed, it is decompiled into LLVM IR, an intermediate representation designed for static analysis and optimization. The LLVM IR code is then converted into an image and fed to ResNet50v2, a CNN-based transfer learning model supported by Keras. As a result, we present a model for image-based detection of malicious code.

Website Falsification Detection System Based on Image and Code Analysis for Enhanced Security Monitoring and Response (이미지 및 코드분석을 활용한 보안관제 지향적 웹사이트 위·변조 탐지 시스템)

  • Kim, Kyu-Il;Choi, Sang-Soo;Park, Hark-Soo;Ko, Sang-Jun;Song, Jung-Suk
    • Journal of the Korea Institute of Information Security & Cryptology / v.24 no.5 / pp.871-883 / 2014
  • New types of attacks that mainly compromise public, portal, and financial websites for economic profit or to cause national confusion are emerging and evolving. In addition, in a 'drive-by download' attack, a host is infected by malware simply by visiting a compromised website. A website falsification detection system is one of the most powerful solutions for coping with such cyber threats against websites. Many domestic CERTs that carry out security monitoring and response services, including the NCSC (National Cyber Security Center), deploy it in their target organizations. However, existing techniques for website falsification detection have practical problems: their time complexity is high and their detection accuracy is not high. In this paper, we propose a website falsification detection system based on image and code analysis to improve the performance of security monitoring and response services in CERTs. The proposed system focuses on improving both the accuracy and the speed of detecting falsification of target websites.

A Study on Classification of CNN-based Linux Malware using Image Processing Techniques (영상처리기법을 이용한 CNN 기반 리눅스 악성코드 분류 연구)

  • Kim, Se-Jin;Kim, Do-Yeon;Lee, Hoo-Ki;Lee, Tae-Jin
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.9 / pp.634-642 / 2020
  • With the proliferation of Internet of Things (IoT) devices, the use of the Linux operating system across various architectures has increased. Security threats against Linux-based IoT devices are also increasing, and malware variants based on existing malware are constantly appearing. In this paper, we propose a system in which the visualized binary data of an Executable and Linkable Format (ELF) file is processed with the Local Binary Pattern (LBP) image processing technique and a median filter, and the result is classified by a Convolutional Neural Network (CNN). The original image showed the highest accuracy and F1-score at 98.77%, and also the highest recall at 98.55%. The median filter gave the highest precision at 99.19% and the lowest false positive rate at 0.008%. With the LBP technique, overall results were lower than with the median filter applied to the original ELF file. When the results from the original file and from the image processing techniques were combined by majority vote, accuracy, precision, F1-score, and false positive rate were better than with the median filter alone. In the future, the proposed system will be used to classify malware families, and other image processing techniques will be added to improve the accuracy of majority-vote classification.
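
The two image processing steps and the majority vote named in this abstract can be sketched in plain Python as follows. This is a generic illustration under stated assumptions (3x3 windows, border handling by clamping, an arbitrary LBP bit order, and a hypothetical `majority_vote` helper), not the authors' implementation.

```python
from collections import Counter
from statistics import median_low


def _neighbors(img, r, c):
    """Return the 3x3 window around (r, c), clamping coordinates at the border."""
    h, w = len(img), len(img[0])
    return [img[min(max(r + dr, 0), h - 1)][min(max(c + dc, 0), w - 1)]
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


def median_filter(img):
    """Replace each pixel with the median of its 3x3 neighborhood."""
    return [[median_low(_neighbors(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]


def lbp(img):
    """Basic 8-neighbor LBP: set one bit per neighbor that is >= the center."""
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            win = _neighbors(img, r, c)
            center = win[4]
            ring = win[:4] + win[5:]  # the 8 neighbors, row-major order (assumed)
            row.append(sum(1 << i for i, v in enumerate(ring) if v >= center))
        out.append(row)
    return out


def majority_vote(labels):
    """Return the most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


img = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]
smoothed = median_filter(img)   # the isolated bright pixel is removed
codes = lbp(img)
verdict = majority_vote(["malware", "benign", "malware"])
```

Here the three labels stand in for the predictions of classifiers trained on the original, LBP, and median-filtered images.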

Extracting Scheme of Compiler Information using Convolutional Neural Networks in Stripped Binaries (스트립 바이너리에서 합성곱 신경망을 이용한 컴파일러 정보 추출 기법)

  • Lee, Jungsoo;Choi, Hyunwoong;Heo, Junyeong
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.21 no.4 / pp.25-29 / 2021
  • A stripped binary is a binary from which debug symbol information has been deleted, which makes it difficult to analyze through techniques such as reverse engineering. Traditional binary analysis tools rely on debug symbol information, so it is difficult to detect or analyze malicious code in stripped binaries. To solve this problem, a technology capable of effectively extracting information from stripped binaries is needed. This paper focuses on the fact that the byte code of a binary file differs greatly depending on the compiler version, optimization level, and other factors. To extract the compiler version effectively, the entire byte code of the stripped binary is read and converted into an image, which is then applied to a convolutional neural network. Finally, we achieve an accuracy of 93.5%, providing an opportunity to analyze stripped binaries more effectively than before.

Visualization of Malwares for Classification Through Deep Learning (딥러닝 기술을 활용한 멀웨어 분류를 위한 이미지화 기법)

  • Kim, Hyeonggyeom;Han, Seokmin;Lee, Suchul;Lee, Jun-Rak
    • Journal of Internet Computing and Services / v.19 no.5 / pp.67-75 / 2018
  • According to Symantec's Internet Security Threat Report (2018), Internet security threats such as cryptojacking, ransomware, and mobile malware are rapidly increasing and diversifying. This means that malware detection requires not only accuracy but also versatility. In the past, malware detection technology focused on qualitative performance because of problems such as encryption and obfuscation. Today, however, given the diversity of malware, versatility is required to detect its many forms. Optimization in terms of the computing power needed for detection is also required. In this paper, we present Stream Order (SO)-CNN and Incremental Coordinate (IC)-CNN, malware detection schemes using a CNN (Convolutional Neural Network) that effectively detect intelligent and diversified malware. The proposed methods visualize each malware binary file as a fixed-size image. The visualized malware binaries are learned through GoogLeNet to form a deep learning model. Our model detects and classifies malware. The proposed methods show better performance than the conventional method.