• Title/Summary/Keyword: LLVM IR


Study on LLVM application in Parallel Computing System (병렬 컴퓨팅 시스템에서 LLVM 응용 연구)

  • Cho, Jungseok;Cho, Doosan;Kim, Yongyeon
    • The Journal of the Convergence on Culture Technology / v.5 no.1 / pp.395-399 / 2019
  • To support a variety of parallel computing systems, LLVM IR must be extended to handle vectors and matrices more efficiently, and a new algorithm is needed for translating LLVM IR to machine code. As the IR example shows, LLVM IR is essentially composed of RISC-style instructions, so RISC code is generated from it naturally, while vector instructions are not well supported. New IR structures, instruction generation algorithms, and related extensions are needed to support vectors and matrices more robustly. The key step is instruction selection: mapping each LLVM IR instruction to the appropriate instruction of the target (vector / matrix) architecture. For the mapping to be efficient, one must understand the semantics of each LLVM IR instruction, compare them with the syntax and semantics of each target instruction, and select the instruction whose pattern matches.
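
The pattern-matching view of instruction selection described in this abstract can be illustrated with a small table-driven sketch. It is not the authors' algorithm and not LLVM's own backend: the `IRInst` type, the `PATTERNS` table, and the target mnemonics (`VADD.F32` and so on) are all hypothetical names chosen for illustration. It shows an exact (opcode, type) match first, then a fallback that scalarizes a vector operation when the assumed target has no vector pattern for it.

```python
from typing import List, NamedTuple, Tuple


class IRInst(NamedTuple):
    """A simplified IR instruction (hypothetical, not LLVM's data model)."""
    opcode: str  # e.g. "add", "fadd"
    type: str    # e.g. "i32", "<4 x float>"


# Assumed pattern table: (opcode, operand type) -> target mnemonic.
PATTERNS = {
    ("add", "i32"): "ADD",
    ("mul", "i32"): "MUL",
    ("fadd", "<4 x float>"): "VADD.F32",
    ("fmul", "<4 x float>"): "VMUL.F32",
}


def parse_vector_type(t: str) -> Tuple[int, str]:
    """Split a vector type such as "<4 x i32>" into (4, "i32")."""
    lanes, elem = t.strip("<>").split(" x ")
    return int(lanes), elem


def select(inst: IRInst) -> List[str]:
    """Return the target mnemonics chosen for one IR instruction."""
    exact = PATTERNS.get((inst.opcode, inst.type))
    if exact is not None:
        return [exact]
    # No vector pattern: fall back to one scalar instruction per lane.
    if inst.type.startswith("<"):
        lanes, elem = parse_vector_type(inst.type)
        scalar = PATTERNS.get((inst.opcode, elem))
        if scalar is not None:
            return [scalar] * lanes
    raise ValueError(f"no pattern for {inst}")
```

With this table, a `fadd` on `<4 x float>` maps to a single vector instruction, while an `add` on `<4 x i32>` is expanded into four scalar `ADD`s, which is the kind of gap that better vector support in the IR-to-target mapping would close.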

Image-Based Machine Learning Model for Malware Detection on LLVM IR (LLVM IR 대상 악성코드 탐지를 위한 이미지 기반 머신러닝 모델)

  • Kyung-bin Park;Yo-seob Yoon;Baasantogtokh Duulga;Kang-bin Yim
    • Journal of the Korea Institute of Information Security & Cryptology / v.34 no.1 / pp.31-40 / 2024
  • Static analysis based on signature and pattern detection has recently shown its limits as IT technologies advance. It also suffers from compatibility problems across multiple architectures, in addition to the inherent weaknesses of signature and pattern detection. Malicious code uses obfuscation and packing to hide its identity, and it evades existing signature- and pattern-based static detection through techniques such as code rearrangement, register modification, and the addition of branch statements. In this paper, we propose an automated, machine-learning-based static analysis technique for malicious code that operates on images derived from LLVM IR, in order to address these problems. Whether or not a binary is obfuscated or packed, it is decompiled into LLVM IR, an intermediate representation designed for static analysis and optimization. The LLVM IR code is then converted into an image and fed to ResNet50v2, a CNN-based transfer learning model supported by Keras. As a result, we present a model for image-based detection of malicious code.
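
The abstract does not say how the IR is turned into an image, so the following is only a minimal sketch of one common approach, assumed here for illustration: treat each byte of the textual IR as one grayscale pixel and wrap the byte stream into fixed-width rows. The function name `ir_to_image` and the `width` parameter are hypothetical, and the result would still need resizing and channel replication to match whatever input shape the CNN expects.

```python
import math
from typing import List


def ir_to_image(ir_text: str, width: int = 16) -> List[List[int]]:
    """Map IR text to a 2-D grid of grayscale values in the range 0..255.

    Each UTF-8 byte becomes one pixel; the last row is zero-padded.
    """
    data = ir_text.encode("utf-8")
    height = max(1, math.ceil(len(data) / width))
    padded = data.ljust(width * height, b"\x00")
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]
```

For example, `ir_to_image("ret i32 0", width=4)` yields three rows of four pixels, the last one padded with zeros.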

A Study on RSIL to LLVM IR Translator For Verification of Intermediate Code on IoT Virtual Machine (IoT 가상기계의 중간코드 검증을 위한 RSIL to LLVM IR 변환기의 설계 및 구현)

  • Cho, Jaehyeon;Choi, Chanwhi;Son, Yunsik;Lee, YangSun
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.1135-1137 / 2017
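  • In this paper, we design and implement a code translator that converts the intermediate code of a virtual machine running on IoT devices into LLVM IR code. The LLVM IR code translated from RSIL code is executed with the LLVM interpreter, and the results are compared in order to verify the intermediate code of the IoT virtual machine.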

An LLVM-Based Implementation of Static Analysis for Detecting Self-Modifying Code and Its Evaluation (자체 수정 코드를 탐지하는 정적 분석방법의 LLVM 프레임워크 기반 구현 및 실험)

  • Yu, Jae-IL;Choi, Kwang-hoon
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.2 / pp.171-179 / 2022
  • Self-modifying code is code that changes itself during execution. The technique is abused in particular by malicious code to bypass static analysis, so identifying self-modifying code is important for detecting such malware effectively. Until now, self-modifying code has been analyzed with dynamic analysis methods, which are time-consuming and costly. If static analysis could detect self-modifying code, it would greatly help malicious code analysis. In this paper, we propose a static analysis method that detects self-modifying code in binary executables converted to LLVM IR, and we apply the method to a self-modifying-code benchmark that we constructed. In our experiments, the static analysis method was effective on well-structured LLVM IR programs obtained by compiling the benchmark programs. However, it had difficulty detecting self-modifying code in unstructured LLVM IR programs obtained by lifting and transforming binaries. Overcoming this limitation requires a more effective way to lift binary code.

Static Dalvik Bytecode Optimization for Android Applications

  • Kim, Jeehong;Kim, Inhyeok;Min, Changwoo;Jun, Hyung Kook;Lee, Soo Hyung;Kim, Won-Tae;Eom, Young Ik
    • ETRI Journal / v.37 no.5 / pp.1001-1011 / 2015
  • Since just-in-time (JIT) compilation incurs considerable overhead to detect hot spots and compile them at runtime, using sophisticated optimization techniques on embedded devices means that any resulting performance improvements will be limited. In this paper, we introduce a novel static Dalvik bytecode optimization framework, which complements the compilation performed by the Dalvik virtual machine, to improve the performance of Android applications. Our system generates optimized Dalvik bytecode by using the Low Level Virtual Machine (LLVM). A major obstacle in using LLVM to optimize Dalvik bytecode is determining how to handle the high-level language features of Dalvik bytecode in LLVM IR and how to optimize LLVM IR in a way that conforms to the language information of the Dalvik bytecode. To this end, we annotate the LLVM IR with the high-level language features of the Dalvik bytecode and successfully optimize Dalvik bytecode through instruction selection processes. Our experimental results show that our system with JIT improves the performance of Android applications by up to 6.08 times, and surpasses JIT by up to 4.34 times.

A Study of Malware Argument Detection (악성코드 Argument Detection 방법 연구)

  • Hwang, Shin-Woon;Youn, Jonghee
    • Proceedings of the Korea Information Processing Society Conference / 2021.05a / pp.181-182 / 2021
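  • As malware analysis methods advance, the analysis-evasion techniques used by malware advance as well, and large-scale malware analysis fails to run for a variety of reasons. Most malware is available only as a binary without source code, which makes it hard to determine why dynamic analysis does not work. Malware that does not run under dynamic analysis relies on various triggers, such as activating only for particular input values or only at a particular time. In this paper, we propose a new dynamic analysis method, based on binary lifting, for malware that requires a trigger. Binary lifting converts a binary without source code into LLVM IR, and we use it to identify malware whose behavior depends on whether input values are supplied. By comparing the LLVM IR of code that uses arguments with that of code that does not, we determine whether the malware's behavior depends on its arguments, and we propose this as a way to raise the analysis rate of large-scale dynamic malware analysis systems.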

Intermediate-Representation Translation Techniques to Improve Vulnerability Analysis Efficiency for Binary Files in Embedded Devices (임베디드 기기 바이너리 취약점 분석 효율성 제고를 위한 중간어 변환 기술)

  • Jeoung, Byeoung Ho;Kim, Yong Hyuk;Bae, Sung il;Im, Eul Gyu
    • Smart Media Journal / v.7 no.1 / pp.37-44 / 2018
  • Embedded devices use sequence control and numerical computation to run a variety of automated systems, including those at industrial sites, according to their control programs. Because embedded devices now serve as control systems in corporate industrial complexes, nuclear power plants, and public transport infrastructure, deliberate attacks on them can cause significant economic and social damage. Most attacks aimed at embedded devices target data, code modification, and control programs. Unlike programs in common programming languages, control programs for industrial automation devices are designed to represent circuit structures, and most are written in LAD, a graphical language that is difficult to analyze statically. Because of these characteristics, vulnerability analysis and security research on industrial automation control programs has progressed only as far as formal verification and real-time monitoring. Static analysis of these programs, which could detect vulnerabilities in advance and prepare for attacks, remains poorly researched. This study therefore proposes a way of handling control programs that represent circuit structures so as to make static analysis of embedded industrial automation programs more efficient. It also proposes an intermediate-representation translation technique based on LLVM IR for analyzing the industrial automation control programs of various manufacturers in a unified way. Using LLVM IR also makes it possible to integrate the analysis with dynamic analysis. To verify the method, we developed a prototype that converts company S's control programs into an intermediate language in the form of logical expressions.

Android library vulnerability analysis using binary lifting (Binary lifting을 이용한 안드로이드 라이브러리 취약점 분석)

  • Lee, Sung-Won;Youn, Jonghee
    • Proceedings of the Korea Information Processing Society Conference / 2020.05a / pp.290-291 / 2020
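  • Android OS has become a popular and important system, and a wide range of related research is under way. In this paper, we present a vulnerability analysis method from a security standpoint, with the aim of helping to prevent security threats. Targeting Android libraries, we design and carry out a vulnerability analysis process that uses binary lifting to perform code-based (LLVM IR) fuzzing.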

A Study of LLVM-based Embedded System Performance Analyzer (LLVM 기반의 임베디드 시스템 성능 분석기의 연구)

  • Cho, Doosan
    • The Journal of the Convergence on Culture Technology / v.8 no.1 / pp.577-582 / 2022
  • When a new embedded system is developed, an application program, an emulator, and a compiler are developed at the same time. To obtain the best performance from all system components, local optimization should be carried out during the development process. A source-level performance analyzer makes this possible: the application program's source code can be optimized on the basis of its performance evaluation. In general, the performance of an application program is determined largely by its loop iterations. The intermediate representation (IR) code generator produces IR code from the source code, and the execution time is estimated from the instructions in that IR code. If the source code is improved based on the estimate, better results can be obtained in the final application code. This study describes a source-level performance analyzer that can be used while a new embedded system and its application programs are being developed simultaneously. The analyzer makes it possible to optimize the performance of the new embedded system more quickly.
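
The idea of estimating execution time from IR instructions and loop iterations can be sketched as a simple static cost model. Everything concrete below is an assumption made for illustration, not taken from the paper: the `CYCLES` table of per-opcode costs, the default cost of 1 for unknown opcodes, and the representation of a program as (opcodes, trip count) pairs with trip counts known in advance.

```python
from typing import Iterable, List, Tuple

# Assumed per-opcode cycle costs for a hypothetical target.
CYCLES = {"load": 3, "store": 3, "mul": 4, "add": 1, "icmp": 1, "br": 1}


def estimate_cycles(blocks: Iterable[Tuple[List[str], int]]) -> int:
    """Sum (cost of one pass through a block) * (times the block executes)."""
    total = 0
    for opcodes, trip_count in blocks:
        per_pass = sum(CYCLES.get(op, 1) for op in opcodes)
        total += per_pass * trip_count
    return total
```

A loop body of `load, mul, add, store, icmp, br` that runs 100 times dominates a single `add` outside the loop, which matches the abstract's point that loop iterations largely determine performance.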

A Study on Strengthening Memory Safety through CMA Awareness (CMA 인식을 통한 메모리 안전성 강화 연구)

  • Hong, Junwha;Park, Chanmin;Jeong, Seongyun;Min, Jiun;Yu, Dongyeon;Kwon, Yonghwi;Jeon, Yuseok
    • Review of KIISC / v.32 no.4 / pp.61-69 / 2022
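  • C/C++ programs are subject to many memory vulnerabilities, and ASan is widely used to detect them thanks to its low overhead and high detection rate. However, many commercial programs implement their own Custom Memory Allocator (CMA) to use memory efficiently, and ASan fails to detect most bugs that originate from such CMAs. To overcome this, we propose CMASan, a tool that detects CMAs by training an RNN on LLVM IR code and modifies ASan so that it can identify CMAs, thereby detecting memory vulnerabilities that originate from them. By comparing the performance of ASan and CMASan and their detection results for CMA-related vulnerabilities, we confirm that CMASan can detect memory vulnerabilities that ASan misses, with low runtime and memory overhead.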