• Title/Summary/Keyword: malware similarity


Multi-Modal Based Malware Similarity Estimation Method (멀티모달 기반 악성코드 유사도 계산 기법)

  • Yoo, Jeong Do;Kim, Taekyu;Kim, In-sung;Kim, Huy Kang
    • Journal of the Korea Institute of Information Security & Cryptology / v.29 no.2 / pp.347-363 / 2019
  • Malware has unique behavioral characteristics, much as living things have DNA. To respond to APT (Advanced Persistent Threat) attacks in advance, behavioral characteristics must be extracted from malware, and each sample must then be classified by behavioral similarity. In this paper, several similarity measures are computed for Windows malware, and the malware family is predicted from the resulting values. The measures used are TF-IDF cosine similarity, Nilsimsa similarity, malware function cosine similarity, and Jaccard similarity. The prediction rate differs widely between measures. Although no similarity measure can be applied to malware classification with high accuracy, the results can help in selecting a similarity measure for classifying a specific malware family.
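
Two of the measures named in this abstract, Jaccard similarity and TF-IDF cosine similarity, can be illustrated with a minimal sketch. It assumes each sample is represented as a list of API-call names; that representation, the smoothed IDF formula, and all function and sample names below are illustrative assumptions, not details taken from the paper.

```python
import math
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity of two token lists, treated as sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def tfidf_vectors(docs):
    """Return one {token: weight} dict per document.

    Uses relative term frequency and a smoothed IDF,
    log((1 + n) / (1 + df)) + 1 (an assumed variant).
    """
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            tok: (count / len(doc)) * (math.log((1 + n) / (1 + df[tok])) + 1)
            for tok, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical API-call traces for three samples.
sample_a = ["CreateFileW", "WriteFile", "RegSetValueExW", "WriteFile"]
sample_b = ["CreateFileW", "WriteFile", "RegSetValueExW"]
sample_c = ["socket", "connect", "send"]

vec_a, vec_b, vec_c = tfidf_vectors([sample_a, sample_b, sample_c])
print(jaccard(sample_a, sample_b))
print(cosine(vec_a, vec_b))
print(cosine(vec_a, vec_c))
```

Jaccard ignores how often a call occurs, while the TF-IDF cosine is sensitive to frequency, which is one reason different measures can yield different family predictions for the same pair of samples.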

Development of a Performance Evaluation Model on Similarity Measurement Method of Malware (악성코드 유사도 측정 기법의 성능 평가 모델 개발)

  • Chu, Sung-Taek;Kim, HeeSeok;Im, Kwang-Hyuk;Kim, Kyu-Il;Seo, Chang-Ho
    • The Journal of the Korea Contents Association / v.14 no.10 / pp.32-40 / 2014
  • Malware classification is in great demand because it reduces analysis time and helps find new types of malware, and various similarity measurement methods have been proposed to classify large numbers of samples. However, existing methods report only their own classification results and have not been compared against other methods, because no evaluation model exists for comparing the performance of similarity measurement methods. In this paper, we propose a new performance evaluation model for malware similarity measurement methods that uses two indicators: success rate and degree of confidence. We also compare and evaluate the performance of existing similarity measurement methods using these two indicators.

Improvement of Performance of Malware Similarity Analysis by the Sequence Alignment Technique (서열 정렬 기법을 이용한 악성코드 유사도 분석의 성능 개선)

  • Cho, In Kyeom;Im, Eul Gyu
    • KIISE Transactions on Computing Practices / v.21 no.3 / pp.263-268 / 2015
  • Malware variants can be defined as malicious executable files that have similar functions but different structures. To classify such variants, this paper analyzes sequence alignment, a method used in bioinformatics, which finds the common parts of malware API call information. The performance of this method depends on the length of the API call information: if the sequence is too long, performance is very poor. We therefore removed repeated patterns from the API call information before applying sequence alignment, in order to improve its performance. The similarity between malware samples was then analyzed using sequence alignment. Experimental results with real malware samples are presented.
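
The idea of shortening API call sequences before aligning them can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it collapses only consecutive repeats of a single call (the paper may remove longer repeated patterns), and it uses the longest common subsequence, which corresponds to an alignment scored with match = 1 and no mismatch or gap penalty, as a stand-in for sequence alignment. All names are hypothetical.

```python
from itertools import groupby


def collapse_repeats(calls):
    """Collapse runs of the same API call into a single occurrence."""
    return [name for name, _ in groupby(calls)]


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def alignment_similarity(a, b):
    """Similarity in [0, 1] between two traces after collapsing repeats."""
    a, b = collapse_repeats(a), collapse_repeats(b)
    if not a and not b:
        return 1.0
    return 2 * lcs_length(a, b) / (len(a) + len(b))


# Hypothetical API-call traces.
trace_a = ["OpenFile", "ReadFile", "ReadFile", "ReadFile", "CloseHandle"]
trace_b = ["OpenFile", "ReadFile", "WriteFile", "CloseHandle"]

print(collapse_repeats(trace_a))
print(alignment_similarity(trace_a, trace_b))
```

Because the dynamic-programming table grows with the product of the two sequence lengths, shortening the inputs first reduces both time and memory, which matches the motivation given in the abstract.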

Method of Similarity Hash-Based Malware Family Classification (유사성 해시 기반 악성코드 유형 분류 기법)

  • Kim, Yun-jeong;Kim, Moon-sun;Lee, Man-hee
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.5 / pp.945-954 / 2022
  • Billions of malicious codes are detected every year, of which only 0.01% are new types of malware. In this situation, an effective malware type classification tool is needed, but previous studies are limited in how quickly they can analyze large amounts of malicious code because they require complex and extensive data preprocessing. To solve this problem, this paper proposes a method that classifies the types of malicious code based on similarity hashes, without complex data preprocessing. The approach trains an XGBoost model on the similarity hash information of the malware. To evaluate it, we used the BIG-15 dataset, which is widely used in the field of malware classification. The malicious code was classified with an accuracy of 98.9%, and 3,432 benign files were identified with 100% accuracy. This result is superior to most recent studies that use complex preprocessing and deep learning models. More efficient malware classification is therefore expected to be possible with the proposed approach.

A Malware Variants Detection Method based on Behavior Similarity (행위 유사도 기반 변종 악성코드 탐지 방법)

  • Joe, Woo-Jin;Kim, Hyong-Shik
    • Smart Media Journal / v.8 no.4 / pp.25-32 / 2019
  • While the development of the Internet has made information more accessible, it has also provided a variety of intrusion paths for malicious programs. Traditional signature-based malware detectors cannot identify new malware. Dynamic analysis can analyze new malware that signatures cannot, but it is still inefficient for detecting variants whose behaviors are mostly similar. In this paper, we propose a detection method that uses behavioral similarity with existing malicious code, on the assumption that variants share similar patterns. The proposed method extracts the behavior targets common to variants and detects programs that have similar targets. We verified the behavioral similarity between variants through experiments with 1,000 malicious code samples.

A Study on Malware Identification System Using Static Analysis Based Machine Learning Technique (정적 분석 기반 기계학습 기법을 활용한 악성코드 식별 시스템 연구)

  • Kim, Su-jeong;Ha, Ji-hee;Oh, Soo-hyun;Lee, Tae-jin
    • Journal of the Korea Institute of Information Security & Cryptology / v.29 no.4 / pp.775-784 / 2019
  • Malware attacks are continuously increasing in environments such as mobile, IoT, Windows, and Mac due to the emergence of new and variant malware, and signature-based countermeasures are limited in their ability to detect it. In addition, analysis performance is deteriorating due to obfuscation, packing, and anti-VM techniques. In this paper, we propose a system that detects malware with machine learning, using a similarity hashing-based pattern detection technique and static analysis after classifying files according to packing. This enables more efficient detection because it combines pattern-based detection, the well-known approach to malware detection, with machine learning-based detection, which is advantageous for detecting new and variant malware. The system achieved a detection accuracy of 95.79% or more on the benign and malware sample files provided by the AI-based malware detection track of the Information Security R&D Data Challenge 2018 competition. In the future, it is expected that detection performance can be improved further by applying feature vectors and detection methods suited to the characteristics of packed files.

Proposal of Process Hollowing Attack Detection Using Process Virtual Memory Data Similarity (프로세스 가상 메모리 데이터 유사성을 이용한 프로세스 할로윙 공격 탐지)

  • Lim, Su Min;Im, Eul Gyu
    • Journal of the Korea Institute of Information Security & Cryptology / v.29 no.2 / pp.431-438 / 2019
  • Fileless malware uses memory injection attacks to hide traces of its payload while performing malicious work. Among memory injection attacks, "process hollowing" creates a benign process, such as a system process, in a suspended state and then injects a malicious payload into it, so that the malicious behavior appears to come from a normal process. In this paper, we propose a method that detects the memory injection when a process hollowing attack occurs, regardless of whether the malicious action is actually performed. A replica process is executed under the same execution conditions as the process suspected of memory injection, the data belonging to each process's virtual memory areas are compared using fuzzy hashes, and the similarity is calculated.

Generating Malware DNA to Classify the Similar Malwares (악성코드 DNA 생성을 통한 유사 악성코드 분류기법)

  • Han, Byoung-Jin;Choi, Young-Han;Bae, Byung-Chul
    • Journal of the Korea Institute of Information Security & Cryptology / v.23 no.4 / pp.679-694 / 2013
  • According to the 2013 national information security white paper, the number of hacking attempts in 2012 was 17,570, an increase of 67.4% over 2011, and the number has been rising year after year. The increase is attributed to the pursuit of monetary profit and the diversification of infection techniques. However, because malicious code is being developed faster than the number of experts who can analyze and respond to it is growing, it is difficult to respond to the security threats it poses. Interest in automatic analysis tools is therefore increasing. In this paper, we propose a method that classifies malware by similarity using malware DNA. It helps experts reduce analysis time and increase accuracy. The proposed method generates 'Malware DNA' from extracted features and then calculates similarity to classify the malware.

Malware Family Recommendation using Multiple Sequence Alignment (다중 서열 정렬 기법을 이용한 악성코드 패밀리 추천)

  • Cho, In Kyeom;Im, Eul Gyu
    • Journal of KIISE / v.43 no.3 / pp.289-295 / 2016
  • Malware authors spread malware variants in order to evade detection. Malware variants are hard to detect using static analysis, so dynamic analysis based on API call information is necessary. In this paper, we propose a malware family recommendation method to assist malware analysts in classifying malware variants. The proposed method extracts the API call information of malware families by dynamic analysis. The multiple sequence alignment technique is then applied to the extracted API call information, and a signature for each family is extracted from the alignment results. Based on the similarity of the extracted signatures, the method recommends three family candidates for an unknown malware sample. We also measured the accuracy of the proposed method in an experiment using real malware samples.
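
The final recommendation step, ranking families by signature similarity and returning three candidates, might look like the sketch below. It assumes the per-family signatures have already been produced and are plain lists of API-call names; the multiple sequence alignment that the paper uses to extract them is not implemented here. Python's `difflib.SequenceMatcher` is swapped in as the similarity measure, and the family names and traces are invented for illustration.

```python
from difflib import SequenceMatcher


def recommend_families(unknown, signatures, k=3):
    """Return the k families whose signature is most similar to the trace.

    `signatures` maps a family name to its signature (a list of API-call
    names). Ties are broken alphabetically by family name.
    """
    scored = [
        (SequenceMatcher(None, sig, unknown, autojunk=False).ratio(), family)
        for family, sig in signatures.items()
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(family, score) for score, family in scored[:k]]


# Hypothetical family signatures and an unknown sample's API-call trace.
signatures = {
    "fam_a": ["CreateFileW", "WriteFile", "CloseHandle"],
    "fam_b": ["socket", "connect", "send", "recv"],
    "fam_c": ["RegOpenKeyExW", "RegSetValueExW", "RegCloseKey"],
    "fam_d": ["CreateFileW", "ReadFile", "CloseHandle"],
}
unknown = ["CreateFileW", "WriteFile", "CloseHandle"]

for family, score in recommend_families(unknown, signatures):
    print(family, round(score, 3))
```

Returning several ranked candidates rather than a single label fits the abstract's framing of the method as an aid to analysts, who make the final classification.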

Function partitioning methods for malware variant similarity comparison (변종 악성코드 유사도 비교를 위한 코드영역의 함수 분할 방법)

  • Park, Chan-Kyu;Kim, Hyong-Shik;Lee, Tae Jin;Ryou, Jae-Cheol
    • Journal of the Korea Institute of Information Security & Cryptology / v.25 no.2 / pp.321-330 / 2015
  • Many modified malware samples have been found that avoid detection simply by replacing a sequence of characters or a part of the code. Because existing anti-virus programs perform signature-based analysis, it is difficult for them to detect malware that differs slightly from well-known malware. This paper proposes a method for detecting modified malware by extending hash-value-based code comparison. We generate hash values for individual functions and individual code blocks as well as for the whole code, and use those values to determine whether a pair of codes is similar to a certain degree. Before generating the hash values, we also eliminate numeric data such as constants and addresses to avoid the inaccuracy they cause. We found that the proposed method can effectively find the inherent similarity between an original malware sample and its derivatives.
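
Hashing functions after stripping numeric data can be sketched roughly as follows. The sketch assumes each function is available as a list of disassembly text lines, replaces numeric literals with a placeholder, hashes each function with SHA-256, and compares two binaries by the overlap of their function hashes. The input format, the regular expression, the choice of SHA-256, and the Jaccard overlap are all illustrative assumptions; the paper also hashes code blocks and the whole code, which is omitted here.

```python
import hashlib
import re

# Matches 0x-prefixed hex, MASM-style hex such as 10h, and decimal numbers.
_NUMERIC = re.compile(r"\b(?:0x[0-9a-f]+|\d[0-9a-f]*h|\d+)\b")


def normalize(instruction):
    """Lowercase an instruction and replace numeric literals with 'N'."""
    return _NUMERIC.sub("N", instruction.strip().lower())


def function_hash(instructions):
    """SHA-256 digest of a function's normalized instruction text."""
    body = "\n".join(normalize(line) for line in instructions)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def function_overlap(funcs_a, funcs_b):
    """Jaccard overlap between the function-hash sets of two binaries."""
    hashes_a = {function_hash(f) for f in funcs_a}
    hashes_b = {function_hash(f) for f in funcs_b}
    if not hashes_a and not hashes_b:
        return 1.0
    return len(hashes_a & hashes_b) / len(hashes_a | hashes_b)


# Hypothetical disassembly: the variant relocates a call target and
# rewrites the second function.
original = [
    ["push ebp", "mov ebp, esp", "call 0x401000", "ret"],
    ["mov eax, 1", "ret"],
]
variant = [
    ["push ebp", "mov ebp, esp", "call 0x402A10", "ret"],
    ["xor eax, eax", "ret"],
]

print(function_hash(original[0]) == function_hash(variant[0]))
print(function_overlap(original, variant))
```

Without the normalization step, the changed call address alone would give the first function a different hash, so the two binaries would share no function hashes at all.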