• Title/Summary/Keyword: 악성문서 탐지

Search Result 18, Processing Time 0.024 seconds

MS Office Malicious Document Detection Based on CNN (CNN 기반 MS Office 악성 문서 탐지)

  • Park, Hyun-su;Kang, Ah Reum
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.32 no.2
    • /
    • pp.439-446
    • /
    • 2022
  • Document-type malicious codes are being actively distributed using attachments on websites or e-mails. Document-type malicious code is relatively easy to bypass security programs because the executable file is not executed directly. Therefore, document-type malicious code should be detected and prevented in advance. To detect document-type malicious code, we identified the document structure and selected keywords suspected of being malicious. We then created a dataset by converting the stream data in the document to ASCII code values. We specified the location of malicious keywords in the document stream data, and classified the stream as malicious by recognizing the adjacent information of the malicious keywords. As a result of detecting malicious codes by applying the CNN model, we derived accuracies of 0.97 and 0.92 in stream units and file units, respectively.

A Study of Office Open XML Document-Based Malicious Code Analysis and Detection Methods (Office Open XML 문서 기반 악성코드 분석 및 탐지 방법에 대한 연구)

  • Lee, Deokkyu;Lee, Sangjin
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.30 no.3
    • /
    • pp.429-442
    • /
    • 2020
  • The proportion of attacks via office documents is increasing in recent incidents. Although the security of office applications has been strengthened gradually, the attacks through the office documents are still effective due to the sophisticated use of social engineering techniques and advanced attack techniques. In this paper, we propose a method for detecting malicious OOXML(Office Open XML) documents and a framework for detection. To do this, malicious files used in the attack and benign files were collected from the malicious code repository and the search engine. By analyzing the malicious code types of collected files, we identified six "suspicious object" elements that are meaningful in determining whether they are malicious in a document. In addition, we implemented an OOXML document-based malware detection framework based on the detection method to classify the collected files and found that 98.45% of malicious filesets were detected.

An Email Vaccine Cloud System for Detecting Malcode-Bearing Documents (악성코드 은닉 문서파일 탐지를 위한 이메일 백신 클라우드 시스템)

  • Park, Choon-Sik
    • Journal of Korea Multimedia Society
    • /
    • v.13 no.5
    • /
    • pp.754-762
    • /
    • 2010
  • Nowadays, email-based targeted attacks using malcode-bearing documents have been steadily increased. To improve the success rate of the attack and avoid anti-viruses, attackers mainly employ zero-day exploits and relevant social engineering techniques. In this paper, we propose an architecture of the email vaccine cloud system to prevent targeted attacks using malcode-bearing documents. The system extracts attached document files from email messages, performs behavior analysis as well as signature-based detection in the virtual machine environment, and completely removes malicious documents from the messages. In the process of behavior analysis, the documents are regarded as malicious ones in cases of creating executable files, launching new processes, accessing critical registry entries, connecting to the Internet. The email vaccine cloud system will help prevent various cyber terrors such as information leakages by preventing email based targeted attacks.

A Research of Anomaly Detection Method in MS Office Document (MS 오피스 문서 파일 내 비정상 요소 탐지 기법 연구)

  • Cho, Sung Hye;Lee, Sang Jin
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.6 no.2
    • /
    • pp.87-94
    • /
    • 2017
  • Microsoft Office is an office suite of applications developed by Microsoft. Recently users with malicious intent customize Office files as a container of the Malware because MS Office is most commonly used word processing program. To attack target system, many of malicious office files using a variety of skills and techniques like macro function, hiding shell code inside unused area, etc. And, people usually use two techniques to detect these kinds of malware. These are Signature-based detection and Sandbox. However, there is some limits to what it can afford because of the increasing complexity of malwares. Therefore, this paper propose methods to detect malicious MS office files in Computer forensics' way. We checked Macros and potential problem area with structural analysis of the MS Office file for this purpose.

A Study On Malicious Script Detection using Text Categorization (문서분류기법을 이용한 악성 스크립트 탐지)

  • 신대현;위규범
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2002.10c
    • /
    • pp.514-516
    • /
    • 2002
  • 본 논문은 스크립트 호스트 모니터링을 통한 정보검색 기법인 유사도 알고리즘을 이용하는 악성 스크립트 탐지에 관한 연구이다. 스크립트 호스트의 모니터링을 통하여 스크립트가 실행되기 전에 스크립트를 가로채고 알려진 악성 스크립트와의 유사도를 비교하여 악성 여부를 판단한다. 소스기반의 빠른 탐지와 유사 변명의 악성 스크립트 탐지가 가능하며 악성행위의 종류를 사용자에게 보고할 수 있는 장점을 갖는다.

  • PDF

A Study on Email Security through Proactive Detection and Prevention of Malware Email Attacks (악성 이메일 공격의 사전 탐지 및 차단을 통한 이메일 보안에 관한 연구)

  • Yoo, Ji-Hyun
    • Journal of IKEEE
    • /
    • v.25 no.4
    • /
    • pp.672-678
    • /
    • 2021
  • New malware continues to increase and become advanced by every year. Although various studies are going on executable files to diagnose malicious codes, it is difficult to detect attacks that internalize malicious code threats in emails by exploiting non-executable document files, malicious URLs, and malicious macros and JS in documents. In this paper, we introduce a method of analyzing malicious code for email security through proactive detection and blocking of malicious email attacks, and propose a method for determining whether a non-executable document file is malicious based on AI. Among various algorithms, an efficient machine learning modeling is choosed, and an ML workflow system to diagnose malicious code using Kubeflow is proposed.

A Study on Outer Document Flow for APT Response (지능형 지속 위협 대응을 위한 외부 문서 유입 방안 연구)

  • Kim, JongPil;Park, Sangho;Na, Onechul;Chang, Hangbae
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2017.04a
    • /
    • pp.323-325
    • /
    • 2017
  • 최근 지능형 지속 위협(APT)은 명확한 공격 대상과 정교한 프로그램을 사용하여 치밀하게 공격하는 사회공학적 공격 기법을 사용함으로 상업용 탐지기술의 지속적인 발전과 개발에도 빠르게 증가되고 있다. 기존의 탐지 기법은 알려진 악성코드에 대하여는 효과적으로 대응 가능하나 아무런 정보가 없는 제로데이 공격 등의 악성코드는 탐지하기 어렵다. 특히 최근의 악성코드들은 빠르게 변종을 만들어냄으로 기존의 탐지 기법으로는 한계가 있다. 따라서 본 연구에서는 악성코드에 대한 경로 및 유형, 공격 방법 등을 분석하고 이를 탐지하고 분석하는 선행 기술들 조사하여 DAST 기반의 콘텐츠재구성을 통한 무해화 기술을 제안하였다.

Efficient Hangul Word Processor (HWP) Malware Detection Using Semi-Supervised Learning with Augmented Data Utility Valuation (효율적인 HWP 악성코드 탐지를 위한 데이터 유용성 검증 및 확보 기반 준지도학습 기법)

  • JinHyuk Son;Gihyuk Ko;Ho-Mook Cho;Young-Kuk Kim
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.34 no.1
    • /
    • pp.71-82
    • /
    • 2024
  • With the advancement of information and communication technology (ICT), the use of electronic document types such as PDF, MS Office, and HWP files has increased. Such trend has led the cyber attackers increasingly try to spread malicious documents through e-mails and messengers. To counter such attacks, AI-based methodologies have been actively employed in order to detect malicious document files. The main challenge in detecting malicious HWP(Hangul Word Processor) files is the lack of quality dataset due to its usage is limited in Korea, compared to PDF and MS-Office files that are highly being utilized worldwide. To address this limitation, data augmentation have been proposed to diversify training data by transforming existing dataset, but as the usefulness of the augmented data is not evaluated, augmented data could end up harming model's performance. In this paper, we propose an effective semi-supervised learning technique in detecting malicious HWP document files, which improves overall AI model performance via quantifying the utility of augmented data and filtering out useless training data.

An analysis of Content Disarm and Reconstruction (콘텐츠 무해화 및 재조합 기술 연구 분석 및 고찰)

  • Sohyeon Oh;Abir EL Azzaoui;Jong Hyuk Park
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2023.05a
    • /
    • pp.206-208
    • /
    • 2023
  • 비대면 활동 및 원격 작업 증가에 따라 문서 파일을 이용한 사이버 공격 빈도가 증가하고 있으며, 별도의 실행 파일 대신 문서 내의 기본적인 기능을 악용하는 문서 공격은 기존의 악성코드 탐지 메커니즘을 우회할 수 있기 때문에 큰 문제가 되고 있다. 이러한 문제에 대응하기 위한 여러 기술 중 CDR 기술은 악성 행위에 이용될 가능성이 있는 액티브 콘텐츠를 제거하거나 비활성화하여 사전에 악성코드로 탐지되지 않았던 파일에 대한 보안성을 제공하지만, 문서의 내용을 분석하고 안전하게 재조합하는 과정에서 오류가 발생하여 전달하고자 했던 내용을 제대로 표현할 수 없게 되거나, 파일을 사용할 수 없게 되는 문제가 발생할 수 있다. 본 논문에서는 파일을 후처리하는 방식으로만 CDR을 적용하는 것이 아니라, 확장 프로그램이나 가상 환경 등을 이용해 문서의 작성 단계에서부터 CDR 처리과정을 거치게 하는 방법을 제안하여 파일 손상이나 내용 누락 문제를 완화하고 사용자의 업무 효율을 높이는 동시에 강화된 보안성을 제공한다.

OLE File Analysis and Malware Detection using Machine Learning

  • Choi, Hyeong Kyu;Kang, Ah Reum
    • Journal of the Korea Society of Computer and Information
    • /
    • v.27 no.5
    • /
    • pp.149-156
    • /
    • 2022
  • Recently, there have been many reports of document-type malicious code injecting malicious code into Microsoft Office files. Document-type malicious code is often hidden by encoding the malicious code in the document. Therefore, document-type malware can easily bypass anti-virus programs. We found that malicious code was inserted into the Visual Basic for Applications (VBA) macro, a function supported by Microsoft Office. Malicious codes such as shellcodes that run external programs and URL-related codes that download files from external URLs were identified. We selected 354 keywords repeatedly appearing in malicious Microsoft Office files and defined the number of times each keyword appears in the body of the document as a feature. We performed machine learning with SVM, naïve Bayes, logistic regression, and random forest algorithms. As a result, each algorithm showed accuracies of 0.994, 0.659, 0.995, and 0.998, respectively.