• Title/Summary/Keyword: Malware Detection

Search Result 302, Processing Time 0.032 seconds

Stacked Autoencoder Based Malware Feature Refinement Technology Research (Stacked Autoencoder 기반 악성코드 Feature 정제 기술 연구)

  • Kim, Hong-bi;Lee, Tae-jin
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.30 no.4
    • /
    • pp.593-603
    • /
    • 2020
  • The advent of malicious code has increased exponentially due to the spread of malicious code generation tools in accordance with the development of the network, but there is a limit to the response through existing malicious code detection methods. According to this situation, a machine learning-based malicious code detection method is evolving, and in this paper, the feature of data is extracted from the PE header for machine-learning-based malicious code detection, and then it is used to automate the malware through autoencoder. Research on how to extract the indicated features and feature importance. In this paper, 549 features composed of information such as DLL/API that can be identified from PE files that are commonly used in malware analysis are extracted, and autoencoder is used through the extracted features to improve the performance of malware detection in machine learning. It was proved to be successful in providing excellent accuracy and reducing the processing time by 2 times by effectively extracting the features of the data by compressively storing the data. The test results have been shown to be useful for classifying malware groups, and in the future, a classifier such as SVM will be introduced to continue research for more accurate malware detection.

An Enhanced method for detecting obfuscated Javascript Malware using automated Deobfuscation (난독화된 자바스크립트의 자동 복호화를 통한 악성코드의 효율적인 탐지 방안 연구)

  • Ji, Sun-Ho;Kim, Huy-Kang
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.22 no.4
    • /
    • pp.869-882
    • /
    • 2012
  • With the growth of Web services and the development of web exploit toolkits, web-based malware has increased dramatically. Using Javascript Obfuscation, recent web-based malware hide a malicious URL and the exploit code. Thus, pattern matching for network intrusion detection systems has difficulty of detecting malware. Though various methods have proposed to detect Javascript malware on a users' web browser, the overall detection is needed to counter advanced attacks such as APTs(Advanced Persistent Treats), aimed at penetration into a certain an organization's intranet. To overcome the limitation of previous pattern matching for network intrusion detection systems, a novel deobfuscating method to handle obfuscated Javascript is needed. In this paper, we propose a framework for effective hidden malware detection through an automated deobfuscation regardless of advanced obfuscation techniques with overriding JavaScript functions and a separate JavaScript interpreter through to improve jsunpack-n.

A Study on Robustness Evaluation and Improvement of AI Model for Malware Variation Analysis (악성코드 변종 분석을 위한 AI 모델의 Robust 수준 측정 및 개선 연구)

  • Lee, Eun-gyu;Jeong, Si-on;Lee, Hyun-woo;Lee, Tea-jin
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.32 no.5
    • /
    • pp.997-1008
    • /
    • 2022
  • Today, AI(Artificial Intelligence) technology is being extensively researched in various fields, including the field of malware detection. To introduce AI systems into roles that protect important decisions and resources, it must be a reliable AI model. AI model that dependent on training dataset should be verified to be robust against new attacks. Rather than generating new malware detection, attackers find malware detection that succeed in attacking by mass-producing strains of previously detected malware detection. Most of the attacks, such as adversarial attacks, that lead to misclassification of AI models, are made by slightly modifying past attacks. Robust models that can be defended against these variants is needed, and the Robustness level of the model cannot be evaluated with accuracy and recall, which are widely used as AI evaluation indicators. In this paper, we experiment a framework to evaluate robustness level by generating an adversarial sample based on one of the adversarial attacks, C&W attack, and to improve robustness level through adversarial training. Through experiments based on malware dataset in this study, the limitations and possibilities of the proposed method in the field of malware detection were confirmed.

A Study on Machine Learning Based Anti-Analysis Technique Detection Using N-gram Opcode (N-gram Opcode를 활용한 머신러닝 기반의 분석 방지 보호 기법 탐지 방안 연구)

  • Kim, Hee Yeon;Lee, Dong Hoon
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.32 no.2
    • /
    • pp.181-192
    • /
    • 2022
  • The emergence of new malware is incapacitating existing signature-based malware detection techniques., and applying various anti-analysis techniques makes it difficult to analyze. Recent studies related to signature-based malware detection have limitations in that malware creators can easily bypass them. Therefore, in this study, we try to build a machine learning model that can detect and classify the anti-analysis techniques of packers applied to malware, not using the characteristics of the malware itself. In this study, the n-gram opcodes are extracted from the malicious binary to which various anti-analysis techniques of the commercial packers are applied, and the features are extracted by using TF-IDF, and through this, each anti-analysis technique is detected and classified. In this study, real-world malware samples packed using The mida and VMProtect with multiple anti-analysis techniques were trained and tested with 6 machine learning models, and it constructed the optimal model showing 81.25% accuracy for The mida and 95.65% accuracy for VMProtect.

A research on detection techniques of Proxy DLL malware disguised as a Windows library : Focus on the case of Winnti (윈도우즈 라이브러리로 위장한 Proxy DLL 악성코드 탐지기법에 대한 연구 : Winnti 사례를 중심으로)

  • Koo, JunSeok;Kim, Huy Kang
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.25 no.6
    • /
    • pp.1385-1397
    • /
    • 2015
  • The Proxy DLL is a mechanism using a normal characteristics of Windows. Specific malware is executed via this mechanism after intrusion into a system which is targeted. If a intrusion of malware is successful, malware should be executed at least once. For execution, malware is disguised as a Windows Library. The malware of Winnti group is a good case for this. Winnti is a group of Chinese hacking groups identified by research in the fall of 2011 at Kaspersky Lab. Winnti group activities was negatively over the years to target the online video game industry, in this process by making a number of malware infected the online gaming company. In this paper, we perform research on detection techniques of Proxy DLL malware which is disguised as a Windows library through Winnti group case. The experiments that are undertaken to target real malware of Winnti show reliability of detection techniques.

API Feature Based Ensemble Model for Malware Family Classification (악성코드 패밀리 분류를 위한 API 특징 기반 앙상블 모델 학습)

  • Lee, Hyunjong;Euh, Seongyul;Hwang, Doosung
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.29 no.3
    • /
    • pp.531-539
    • /
    • 2019
  • This paper proposes the training features for malware family analysis and analyzes the multi-classification performance of ensemble models. We construct training data by extracting API and DLL information from malware executables and use Random Forest and XGBoost algorithms which are based on decision tree. API, API-DLL, and DLL-CM features for malware detection and family classification are proposed by analyzing frequently used API and DLL information from malware and converting high-dimensional features to low-dimensional features. The proposed feature selection method provides the advantages of data dimension reduction and fast learning. In performance comparison, the malware detection rate is 93.0% for Random Forest, the accuracy of malware family dataset is 92.0% for XGBoost, and the false positive rate of malware family dataset including benign is about 3.5% for Random Forest and XGBoost.

PE Header Characteristics Analysis Technique for Malware Detection (악성프로그램 탐지를 위한 PE헤더 특성 분석 기술)

  • Choi, Yang-Seo;Kim, Ik-Kyun;Oh, Jin-Tae;Ryu, Jae-Cheol
    • Convergence Security Journal
    • /
    • v.8 no.2
    • /
    • pp.63-70
    • /
    • 2008
  • In order not to make the malwares be easily analyzed, the hackers apply various anti-reversing and obfuscation techniques to the malwares. However, as the more anti-revering techniques are applied to the malwares the more abnormal characteristics in the PE file's header which are not shown in the normal PE file, could be observed. In this letter, a new malware detection technique is proposed based on this observation. For the malware detection, we define the Characteristics Vector(CV) which can represent the characteristics of a PE file's header. In the learning phase, we calculate the average CV(ACV) of malwares(ACVM) and normal files(ACVN). To detect the malwares we calculate the 2 Weighted Euclidean Distances(WEDs) from a file's CV to ACVs and they are used to decide whether the file is a malware or not. The proposed technique is very fast and detection rate is fairly high, so it could be applied to the network based attack detection and prevention devices. Moreover, this technique is could be used to detect the unknown malwares because it does not utilize a signature but the malware's characteristics.

  • PDF

Detecting Malware in Cyberphysical Systems Using Machine Learning: a Survey

  • Montes, F.;Bermejo, J.;Sanchez, L.E.;Bermejo, J.R.;Sicilia, J.A.
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.3
    • /
    • pp.1119-1139
    • /
    • 2021
  • Among the scientific literature, it has not been possible to find a consensus on the definition of the limits or properties that allow differentiating or grouping the cyber-physical systems (CPS) and the Internet of Things (IoT). Despite this controversy the papers reviewed agree that both have become crucial elements not only for industry but also for society in general. The impact of a malware attack affecting one of these systems may suppose a risk for the industrial processes involved and perhaps also for society in general if the system affected is a critical infrastructure. This article reviews the state of the art of the application of machine learning in the automation of malware detection in cyberphysical systems, evaluating the most representative articles in this field and summarizing the results obtained, the most common malware attacks in this type of systems, the most promising algorithms for malware detection in cyberphysical systems and the future lines of research in this field with the greatest potential for the coming years.

Malware Detector Classification Based on the SPRT in IoT

  • Jun-Won Ho
    • International journal of advanced smart convergence
    • /
    • v.12 no.1
    • /
    • pp.59-63
    • /
    • 2023
  • We create a malware detector classification method with using the Sequential Probability Ratio Test (SPRT) in IoT. More specifically, we adapt the SPRT to classify malware detectors into two categories of basic and advanced in line with malware detection capability. We perform evaluation of our scheme through simulation. Our simulation results show that the number of advanced detectors is changed in line with threshold for fraction of advanced malware information, which is used to judge advanced detectors in the SPRT.

IoT Malware Detection and Family Classification Using Entropy Time Series Data Extraction and Recurrent Neural Networks (엔트로피 시계열 데이터 추출과 순환 신경망을 이용한 IoT 악성코드 탐지와 패밀리 분류)

  • Kim, Youngho;Lee, Hyunjong;Hwang, Doosung
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.11 no.5
    • /
    • pp.197-202
    • /
    • 2022
  • IoT (Internet of Things) devices are being attacked by malware due to many security vulnerabilities, such as the use of weak IDs/passwords and unauthenticated firmware updates. However, due to the diversity of CPU architectures, it is difficult to set up a malware analysis environment and design features. In this paper, we design time series features using the byte sequence of executable files to represent independent features of CPU architectures, and analyze them using recurrent neural networks. The proposed feature is a fixed-length time series pattern extracted from the byte sequence by calculating partial entropy and applying linear interpolation. Temporary changes in the extracted feature are analyzed by RNN and LSTM. In the experiment, the IoT malware detection showed high performance, while low performance was analyzed in the malware family classification. When the entropy patterns for each malware family were compared visually, the Tsunami and Gafgyt families showed similar patterns, resulting in low performance. LSTM is more suitable than RNN for learning temporal changes in the proposed malware features.