• Title/Summary/Keyword: malware classification

Distributed Processing System Design and Implementation for Feature Extraction from Large-Scale Malicious Code (대용량 악성코드의 특징 추출 가속화를 위한 분산 처리 시스템 설계 및 구현)

  • Lee, Hyunjong;Euh, Seongyul;Hwang, Doosung
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.8 no.2
    • /
    • pp.35-40
    • /
    • 2019
  • Traditional malware detection struggles to detect malware that has been modified by polymorphism or obfuscation techniques. By learning patterns embedded in malware code, machine learning algorithms can detect similar behaviors and replace current detection methods. Data must be collected continuously in order to learn malicious code patterns that change over time. However, storing and processing a large number of malware files incurs high space and time complexity. In this paper, an HDFS-based distributed processing system is designed to reduce space complexity and accelerate feature extraction. Using the distributed processing system, we extract two API features on a filtering basis, a 2-gram feature and an APICFG feature, and compare the generalization performance of ensemble learning models. In experiments, feature extraction was about 3.75 times faster than on a single computer, and storage was about 5 times more space-efficient. The 2-gram feature gave the best classification performance among the features compared, but its learning time was long due to high dimensionality.
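
The 2-gram API feature mentioned in this abstract can be illustrated with a minimal sketch. It assumes each sample has already been reduced to an ordered list of API names; the function names and the toy trace are hypothetical, and the paper's filtering step and HDFS pipeline are not reproduced.

```python
from collections import Counter


def api_bigrams(calls):
    """Count adjacent API-call pairs (2-grams) in one ordered trace."""
    return Counter(zip(calls, calls[1:]))


def to_vector(counts, vocabulary):
    """Project 2-gram counts onto a fixed vocabulary order."""
    return [counts.get(gram, 0) for gram in vocabulary]


# Hypothetical trace for illustration only.
trace = ["CreateFileW", "WriteFile", "CloseHandle", "CreateFileW", "WriteFile"]
counts = api_bigrams(trace)
vocabulary = sorted(counts)  # one dimension per distinct 2-gram
vector = to_vector(counts, vocabulary)
print(vector)
```

Because the vocabulary grows with every distinct pair of calls, the resulting vectors become high-dimensional quickly, which is consistent with the long learning time the abstract reports.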

A Study on Automatic Classification Technique of Malware Packing Type (악성코드 패킹유형 자동분류 기술 연구)

  • Kim, Su-jeong;Ha, Ji-hee;Lee, Tae-jin
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.28 no.5
    • /
    • pp.1119-1127
    • /
    • 2018
  • Most cyber attacks are carried out with malicious code. The damage caused by cyber attacks is gradually expanding to IoT and CPS, so it is no longer limited to cyberspace but poses a serious threat to real life. Accordingly, various malicious code analysis techniques have appeared. Dynamic analysis has been widely used because it makes the resulting malicious behavior easy to identify, but it struggles with the increase in anti-VM malware that does not run once it detects a VM environment. Static analysis, on the other hand, is hampered by various packing techniques. In this paper, we propose a technique for classifying malware regardless of whether it uses a known or an unknown packer. To do this, we designed a model that combines supervised and unsupervised learning over features available in the PE structure, and verified the results on 98,000 samples. Accurate analysis is expected to become possible through analysis technology customized for each class.

Classification of Malicious Web Pages by Using SVM (SVM을 활용한 악성 웹 페이지 분류)

  • Hwang, Young-Sup;Moon, Jae-Chan;Cho, Seong-Je
    • Journal of the Korea Society of Computer and Information
    • /
    • v.17 no.3
    • /
    • pp.77-83
    • /
    • 2012
  • As web pages provide various services, the distribution of malware via web pages is also increasing. Malware can leak personal information, cause system malfunction, and turn a system into a zombie. To prevent this damage, malicious web pages should be blocked. Because the malicious code embedded in web pages is obfuscated or transformed, it is difficult to detect with the signature-based approaches used by current anti-virus software. To overcome this problem, we analyzed web pages and extracted features that distinguish malicious pages from benign ones. We then propose a classification method using SVM, which is widely used in machine learning. Experimental results show that the proposed method is better than other methods. The proposed method classifies malicious web pages correctly and can help block the distribution of malicious code.

Malware API Classification Technology Using LSTM Deep Learning Algorithm (LSTM 딥러닝 알고리즘을 활용한 악성코드 API 분류 기술 연구)

  • Kim, Jinha;Park, Wonhyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2022.05a
    • /
    • pp.259-261
    • /
    • 2022
  • Recent malicious code no longer relies on a single technique; several techniques are combined and merged, with only the important parts extracted. As new malicious code is created and transformed, attack patterns and attack targets are both diversifying. In particular, the number of cases of damage caused by malicious actions in corporate security is increasing over time. However, even if attackers combine several pieces of malicious code, the APIs for each type of malicious code are used repeatedly, and the patterns and names of those APIs are likely to be similar. For this reason, this paper proposes a classification technique that finds patterns of APIs frequently used in malicious code, calculates the meaning and similarity of the APIs, and determines the level of risk.

Malware Classification System to Support Decision Making of App Installation on Android OS (안드로이드 OS에서 앱 설치 의사결정 지원을 위한 악성 앱 분류 시스템)

  • Ryu, Hong Ryeol;Jang, Yun;Kwon, Taekyoung
    • Journal of KIISE
    • /
    • v.42 no.12
    • /
    • pp.1611-1622
    • /
    • 2015
  • Although Android systems provide a permission-based access control mechanism and require the user to decide whether to install an app based on its permission list, many users tend to ignore this phase. Thus, an improved method is necessary for users to intuitively make informed decisions when installing a new app. In this paper, with regard to the permission-based access control system, we present a novel approach based on a machine-learning technique to support the user's decision-making on the fly. We apply the K-NN (K-Nearest Neighbors) classification algorithm with the necessary weighted modifications for malicious app classification, and use 152 Android permissions as features. Our experiment shows a superior classification result (93.5% accuracy) compared to previous work. We expect that our method can help users make informed decisions at the installation step.
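
A weighted K-NN vote over binary permission vectors might look like the sketch below. The abstract does not specify the weighting, so the Hamming distance and the inverse-distance weights here are assumptions, as are the function names and the 4-permission toy data (the paper uses 152 permissions).

```python
def hamming(a, b):
    """Number of permission slots in which two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def weighted_knn(train, query, k=3):
    """Predict a label by inverse-distance-weighted vote of the k nearest apps.

    train is a list of (permission_vector, label) pairs.
    The weighting scheme is an assumption, not the paper's.
    """
    neighbors = sorted(train, key=lambda item: hamming(item[0], query))[:k]
    votes = {}
    for vector, label in neighbors:
        weight = 1.0 / (1 + hamming(vector, query))
        votes[label] = votes.get(label, 0.0) + weight
    return max(votes, key=votes.get)


# Hypothetical toy data: 1 means the app requests that permission.
train = [
    ([1, 1, 0, 0], "malicious"),
    ([1, 1, 1, 0], "malicious"),
    ([0, 0, 0, 1], "benign"),
    ([0, 0, 1, 1], "benign"),
]
prediction = weighted_knn(train, [1, 1, 0, 1], k=3)
print(prediction)
```

Weighting by distance lets a very close neighbor outvote several distant ones, which matters when permission vectors are sparse and many apps tie at the same distance.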

Malware Classification and Analysis of Automated Malware Analysis System (악성코드 자동 분석 시스템의 결과를 이용한 악성코드 분류 및 분석)

  • Na, Jaechan;Jo, Yeong-Hun;Youn, Jonghee M.
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2014.11a
    • /
    • pp.490-491
    • /
    • 2014
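  • Cuckoo Sandbox is a tool that uses virtual machines to perform automated dynamic analysis of malware. We first classify each malware sample by type by querying VirusTotal with its MD5 value, then analyze the sample dynamically in Cuckoo Sandbox and use the result files to extract information about the APIs the malware called. We aggregate API frequencies for each malware type group and compare them with the API frequencies of other type groups to find the APIs that are characteristic of a particular malware type group. We present this as a method for automatically determining the type of malware from such characteristic APIs in the future.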
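
The frequency comparison described in this abstract can be sketched as follows. The scoring rule (relative frequency within the target group minus relative frequency in all other groups) is an assumption chosen for illustration, and the group names and traces are hypothetical; parsing of Cuckoo Sandbox reports is not shown.

```python
from collections import Counter


def api_frequencies(traces):
    """Relative frequency of each API across all traces of one group."""
    counts = Counter(api for trace in traces for api in trace)
    total = sum(counts.values())
    return {api: n / total for api, n in counts.items()}


def characteristic_apis(groups, target, top=2):
    """Rank APIs that are more frequent in the target group than elsewhere."""
    target_freq = api_frequencies(groups[target])
    others = [t for name, traces in groups.items() if name != target for t in traces]
    other_freq = api_frequencies(others)
    scores = {api: f - other_freq.get(api, 0.0) for api, f in target_freq.items()}
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda api: (-scores[api], api))[:top]


# Hypothetical API traces grouped by malware type.
groups = {
    "ransomware": [["CryptEncrypt", "WriteFile", "CryptEncrypt"],
                   ["FindFirstFileW", "CryptEncrypt"]],
    "keylogger": [["SetWindowsHookExW", "WriteFile"],
                  ["GetAsyncKeyState", "WriteFile"]],
}
result = characteristic_apis(groups, "ransomware")
print(result)
```

An API such as `WriteFile` that is common in every group scores low under this rule, so it is not reported as characteristic even though it is frequent.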

Malware Detection Technology Based on API Call Time Section Characteristics (API 호출 구간 특성 기반 악성코드 탐지 기술)

  • Kim, Dong-Yeob;Choi, Sang-Yong
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.32 no.4
    • /
    • pp.629-635
    • /
    • 2022
  • Cyber threats are increasing along with recent social changes and the development of ICT. The malicious code used in these threats is becoming more advanced and intelligent, using techniques such as analysis environment avoidance, concealment, and fileless distribution to make analysis difficult. Machine learning is being used to analyze such malicious code effectively, but considerable effort is still needed to increase classification accuracy. In this paper, we propose a malicious code detection technology based on API call interval characteristics to improve the classification performance of machine learning. The proposed technology divides the API call order extracted from malicious and normal binaries into sections and uses the API call characteristics of each section, together with the entropy of the binary, as characteristic factors. We verified that malicious code can be analyzed well by applying the support vector machine (SVM) algorithm to the extracted characteristic factors.
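
As a rough illustration of the two kinds of features named in this abstract, the sketch below computes the Shannon entropy of a binary's bytes and per-section counts of selected APIs. How the paper defines its sections and which characteristics it extracts are not stated, so the equal-length split, the function names, and the `watched` API list are all assumptions.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())


def split_sections(calls, n_sections):
    """Split an ordered call list into n_sections roughly equal slices."""
    size = math.ceil(len(calls) / n_sections) if calls else 1
    return [calls[i * size:(i + 1) * size] for i in range(n_sections)]


def section_features(calls, watched, n_sections=2):
    """Count each watched API separately within every section."""
    features = []
    for section in split_sections(calls, n_sections):
        counts = Counter(section)
        features.extend(counts.get(api, 0) for api in watched)
    return features


# Hypothetical call order and watched APIs, for illustration only.
features = section_features(["A", "B", "A", "C"], watched=["A", "C"], n_sections=2)
print(features, byte_entropy(bytes(range(256))))
```

A feature vector built this way could then be passed to an SVM implementation; packed or encrypted binaries tend to have entropy close to 8 bits per byte, which is why entropy is a common companion feature.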

Android Malware Analysis Technology Research Based on Naive Bayes (Naive Bayes 기반 안드로이드 악성코드 분석 기술 연구)

  • Hwang, Jun-ho;Lee, Tae-jin
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.27 no.5
    • /
    • pp.1087-1097
    • /
    • 2017
  • As the penetration rate of smartphones increases, the amount of malicious code targeting smartphones is also increasing. 360 Security's smartphone malware statistics show that malicious code increased 437 percent in the first quarter of 2016 compared to the fourth quarter of 2015. In particular, malicious applications, which are the main means of distributing malicious code on smartphones, aim at leaking user information, destroying data, and withdrawing money. They often operate through APIs, the interfaces that give control over functions provided by the operating system or programming language. In this paper, we propose a mechanism that detects malicious applications based on the similarity of API patterns in normal and malicious applications, by learning the API patterns derived from static analysis of applications. In addition, we show a technique for improving the overall detection rate and the per-label detection rate obtained by applying the mechanism to the sample data. In particular, the proposed mechanism can detect a new malicious application when its API pattern is similar, above a certain level, to previously learned patterns. Future research on additional application features, applied to this mechanism, is expected to enable anti-malware systems to detect new malicious applications.
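
A Naive Bayes classifier over the set of APIs found in each application can be sketched in a few lines. This is a generic Bernoulli Naive Bayes with Laplace smoothing, not the paper's exact mechanism; the function names and the toy samples are hypothetical, and the API names are used only as feature labels.

```python
import math
from collections import Counter, defaultdict


def train_nb(samples):
    """Collect per-label API presence counts from (api_set, label) pairs."""
    label_counts = Counter(label for _, label in samples)
    api_counts = defaultdict(Counter)
    vocabulary = set()
    for apis, label in samples:
        api_counts[label].update(apis)
        vocabulary |= set(apis)
    return label_counts, api_counts, sorted(vocabulary)


def predict_nb(model, apis):
    """Return the label with the highest log-posterior for an API set."""
    label_counts, api_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)
        for api in vocabulary:
            # Laplace-smoothed probability that this label uses the API.
            p = (api_counts[label][api] + 1) / (n + 2)
            score += math.log(p if api in apis else 1 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training samples: APIs found by static analysis of each app.
samples = [
    ({"sendTextMessage", "getDeviceId"}, "malicious"),
    ({"sendTextMessage", "getSubscriberId"}, "malicious"),
    ({"getLastKnownLocation"}, "benign"),
    ({"openConnection", "getLastKnownLocation"}, "benign"),
]
model = train_nb(samples)
print(predict_nb(model, {"sendTextMessage", "openConnection"}))
```

Smoothing keeps an API that never appeared under a label from driving that label's probability to zero, which is what allows a new application with a partly unfamiliar API pattern to still be matched to the closest learned class.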

Deobfuscation Processing and Deep Learning-Based Detection Method for PowerShell-Based Malware (파워쉘 기반 악성코드에 대한 역난독화 처리와 딥러닝 기반 탐지 방법)

  • Jung, Ho-jin;Ryu, Hyo-gon;Jo, Kyu-whan;Lee, Sangkyun
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.32 no.3
    • /
    • pp.501-511
    • /
    • 2022
  • Ransomware attacks became widespread in 2021, and their number is increasing rapidly every year. Since PowerShell is used as the primary ransomware technique, the need for PowerShell-based malware detection is ever increasing. However, existing detection techniques are limited in that they cannot detect obfuscated scripts or require a long processing time for deobfuscation. This paper proposes a simple and fast deobfuscation method and a deep learning-based classification model that can detect PowerShell-based malware. Our technique combines Word2Vec and a convolutional neural network to learn the meaning of a script while extracting important features. We tested the proposed model using 1,400 malicious and 8,600 normal scripts provided by the AI-based PowerShell malicious script detection track of the 2021 Cybersecurity AI/Big Data Utilization Contest. Our method achieved deobfuscation 5.04 times faster than existing methods with a perfect success rate, and high detection performance with an FPR of 0.01 and a TPR of 0.965.

Classification of HTTP Automated Software Communication Behavior Using a NoSQL Database

  • Tran, Manh Cong;Nakamura, Yasuhiro
    • IEIE Transactions on Smart Processing and Computing
    • /
    • v.5 no.2
    • /
    • pp.94-99
    • /
    • 2016
  • Application layer attacks have for years posed an ever-serious threat to network security, since they always come after a technically legitimate connection has been established. In recent years, cyber criminals have turned to fully exploiting the web as a medium of communication to launch a variety of forbidden or illicit activities by spreading malicious automated software (auto-ware) such as adware, spyware, or bots. When this malicious auto-ware infects a network, it will act like a robot, mimic normal behavior of web access, and bypass the network firewall or intrusion detection system. Besides that, in a private and large network, with huge Hypertext Transfer Protocol (HTTP) traffic generated each day, communication behavior identification and classification of auto-ware is a challenge. In this paper, based on a previous study, analysis of auto-ware communication behavior, and with the addition of new features, a method for classification of HTTP auto-ware communication is proposed. For that, a Not Only Structured Query Language (NoSQL) database is applied to handle large volumes of unstructured HTTP requests captured every day. The method is tested with real HTTP traffic data collected through a proxy server of a private network, providing good results in the classification and detection of suspicious auto-ware web access.