• Title/Summary/Keyword: binary similarity


Efficient Similarity Analysis Methods for Same Open Source Functions in Different Versions (서로 다른 버전의 동일 오픈소스 함수 간 효율적인 유사도 분석 기법)

  • Kim, Yeongcheol;Cho, Eun-Sun
    • Journal of KIISE / v.44 no.10 / pp.1019-1025 / 2017
  • Binary similarity analysis is used in vulnerability analysis, malicious code analysis, and plagiarism detection. Showing through similarity analysis that a function is equivalent to a well-known safe function from a different version can improve the efficiency of both malicious-behavior analysis and vulnerability analysis of binary code. However, few studies have addressed similarity analysis of the same function across different versions. In this paper, we analyze similarity at the function level using various methods based on the function information that can be extracted from binary code, and identify a way to perform the analysis efficiently in less time. In particular, we perform a comparative analysis of different versions of the OpenSSL library to determine how similar functions can be detected even when the versions differ.

Improving Resilience in Graph-based Structure Comparison for Binary Objects (그래프 기반 바이너리 구조 비교 기법의 강인성 개선)

  • Jang, Joon-Hyouk;Cho, Yoo-Kun;Hong, Ji-Man
    • Proceedings of the Korean Information Science Society Conference / 2012.06a / pp.104-106 / 2012
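  • A software birthmark is a technique for extracting information unique to a program from its executable file. Birthmarks extracted from binaries are used to measure similarity to an original program in order to determine whether a program has been stolen, or to detect malicious code. In this paper, we propose a birthmark based on a graph-based binary structure matching technique. By improving how functions are matched to functions and basic blocks to basic blocks between the original program and the target program, the proposed technique extracts a birthmark with better resilience than existing techniques.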

Detecting Software Similarity Using API Sequences on Static Major Paths (정적 주요 경로 API 시퀀스를 이용한 소프트웨어 유사성 검사)

  • Park, Seongsoo;Han, Hwansoo
    • Journal of KIISE / v.41 no.12 / pp.1007-1012 / 2014
  • Software birthmarks are used to detect software plagiarism. For binaries, however, only a few birthmarks have been developed. In this paper, we propose a static approach that generates API sequences along major paths, which are derived from the control flow graphs of the binaries. Because our API sequences are extracted along the most plausible paths of the binary code, they can represent the actual API sequences produced by binary executions, but in a more concise form. Our similarity measure uses the Smith-Waterman algorithm, a popular sequence alignment algorithm in DNA sequence analysis. We evaluate our static path-based API sequences on multiple versions of five applications. Our experiments indicate that the proposed method provides a fairly reliable similarity birthmark for binaries.
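
As a rough illustration of how Smith-Waterman local alignment can score two API call sequences, here is a minimal Python sketch. The scoring parameters (`match`, `mismatch`, `gap`), the normalization in `similarity`, and the sample API names are assumptions chosen for illustration; the abstract does not state the values or the normalization the authors used.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    The scoring values are illustrative assumptions, not the paper's.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1]
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            # Clamping at 0 lets an alignment restart anywhere (local alignment)
            score[i][j] = max(0, diag, up, left)
            best = max(best, score[i][j])
    return best


def similarity(a, b, match=2, mismatch=-1, gap=-1):
    """Normalize the score by the best score the shorter sequence could reach.

    This normalization is a hypothetical choice made for this sketch.
    """
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 0.0
    return smith_waterman(a, b, match, mismatch, gap) / (match * shorter)


# Hypothetical API sequences from two versions of a program
v1 = ["CreateFile", "ReadFile", "WriteFile", "CloseHandle"]
v2 = ["CreateFile", "ReadFile", "CloseHandle"]
print(similarity(v1, v2))
```

Because the alignment is local, an inserted or removed call costs only one gap penalty, so sequences from nearby versions still score highly.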

Control Flow Reconstruction from Virtualization-Obfuscated Binaries (가상화를 이용하여 난독화된 바이너리의 제어 흐름 재건)

  • Hwang, Joonhyung;Han, Taisook
    • Journal of KIISE / v.42 no.1 / pp.44-53 / 2015
  • Control flow information is useful in the analysis and comparison of programs. Virtualization-obfuscation hides control structures of the original program by transforming machine instructions into bytecode. Direct examination of the resulting binary reveals only the structure of the interpreter. Recovery of the original instructions requires knowledge of the virtual machine architecture, which is randomly generated and hidden. In this paper, we propose a method to reconstruct original control flow using only traces generated from the obfuscated binary. We consider traces as strings and find an automaton that represents the strings. State transitions in the automaton correspond to the control transfers in the original program. We have shown the effectiveness of our method with commercial obfuscators.

Detection of an Open-Source Software Module based on Function-level Features (함수 수준 특징정보 기반의 오픈소스 소프트웨어 모듈 탐지)

  • Kim, Dongjin;Cho, Seong-je
    • Journal of KIISE / v.42 no.6 / pp.713-722 / 2015
  • As open-source software (OSS) becomes more widely used, many users breach the terms of OSS license agreements or reuse a vulnerable OSS module. A technique is therefore needed for determining whether a binary program includes an OSS module. In this paper, we propose an efficient technique to detect a particular OSS module in an executable program using its function-level features. Conventional methods are ill-suited to determining whether a module is contained in a specific program because they usually measure the similarity between whole programs. Our technique determines whether an executable program contains a certain OSS module by extracting features such as function-level instructions, the control flow graph, and the structural attributes of each function from both the program and the module, and then comparing the similarity of those features. To demonstrate the efficiency of the proposed technique, we evaluate it in terms of feature size, detection accuracy, execution overhead, and resilience to compiler optimizations.

Analyzing Differences of Binary Executable Files using Program Structure and Constant Values (프로그램의 구조와 상수 값을 이용하는 바이너리 실행 파일의 차이점 분석)

  • Park, Hee-Wan;Choi, Seok-Woo;Seo, Sun-Ae;Han, Tai-Sook
    • Journal of KIISE:Software and Applications / v.35 no.7 / pp.452-461 / 2008
  • Binary diffing is a method for finding differences between similar binary executables, such as two different versions of a security patch. Previous diffing methods that use flow information can detect control flow changes, but they cannot track constant value changes. Diffing methods that use assembly instructions can detect constant value changes, but they produce false positives caused by compilation effects such as instruction reordering. We present a binary diffing method, and its implementation named SCV, which utilizes both structure and value information. SCV summarizes structure and constant value information from disassembled code and matches the summaries to find differences. By analyzing Microsoft Windows security patches, we showed that SCV found relevant differences caused by constant value changes that the state-of-the-art binary diffing tool BinDiff failed to find.

Software Birthmark Based on k-gram Using Operator Abstraction (연산자 요약을 이용한 k그램 소프트웨어 버스마크)

  • Lee, Kihwa;Woo, Gyun
    • Proceedings of the Korea Information Processing Society Conference / 2013.05a / pp.687-690 / 2013
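  • The software birthmark technique is a software theft detection technique that can be used when the source code of the software suspected of theft cannot be obtained. It extracts a birthmark, a characteristic unique to a program, from the program's binary or Java class files, and then detects theft by measuring the birthmark similarity between programs. In this paper, we propose the operator-abstraction k-gram birthmark, which combines the previously studied k-gram birthmark with an idea called operator abstraction. Operator abstraction groups the JVM instructions of operators that share the same operator precedence and represents each group by an abstraction number. The operator-abstraction k-gram birthmark defines the birthmark as the set of sequences of k consecutive abstraction numbers, generated by taking operator abstraction and control flow into account. To evaluate the birthmark, we conducted credibility and resilience experiments on a selection sort method and a bubble sort method. The results show that the operator-abstraction k-gram birthmark achieves higher credibility and resilience than the previously studied Tamada birthmark and k-gram birthmark.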
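
The idea of an operator-abstraction k-gram birthmark can be sketched roughly in Python as follows. Several parts are assumptions made only for illustration: the `ABSTRACTION` table (the abstract does not list the actual groups or numbers), the choice to leave unlisted opcodes unchanged, and the use of Jaccard similarity as the comparison measure. The sketch also works on a flat opcode list and omits the control-flow handling mentioned in the abstract.

```python
# Hypothetical abstraction table: JVM arithmetic opcodes for operators of the
# same precedence share one abstraction number. Not the table from the paper.
ABSTRACTION = {
    "iadd": 1, "isub": 1,             # additive operators
    "imul": 2, "idiv": 2, "irem": 2,  # multiplicative operators
}


def abstract(opcodes):
    """Replace each listed opcode by its abstraction number; keep others as-is."""
    return [ABSTRACTION.get(op, op) for op in opcodes]


def kgrams(seq, k):
    """Return the set of all runs of k consecutive elements of seq."""
    return {tuple(seq[i:i + k]) for i in range(len(seq) - k + 1)}


def birthmark(opcodes, k=3):
    """Birthmark = set of k-grams over the abstracted opcode sequence."""
    return kgrams(abstract(opcodes), k)


def jaccard(x, y):
    """Set similarity in [0, 1]; an assumed measure, not taken from the paper."""
    if not x and not y:
        return 1.0
    return len(x & y) / len(x | y)


# Two hypothetical method bodies that differ only in '+' versus '-'
p = ["iload", "iload", "iadd", "istore"]
q = ["iload", "iload", "isub", "istore"]
print(jaccard(birthmark(p), birthmark(q)))
```

With this table, swapping `iadd` for `isub` leaves the birthmark unchanged, which is the kind of resilience to small operator substitutions the abstraction is meant to provide.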

A Flow-sensitive API k-gram Based Software Birthmark (제어 흐름을 고려한 API k-gram 소프트웨어 버스마크)

  • Heewan Park;Seokwoo Choi;Hyun-il Lim;Taisook Han
    • Proceedings of the Korea Information Processing Society Conference / 2008.11a / pp.523-526 / 2008
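  • A software birthmark is a method of extracting unique characteristics from classes or binaries and comparing their similarity. In this paper, we propose an API k-gram birthmark technique that takes Java control flow into account. Because it uses as k-grams the sequences of calls to Java standard APIs, which are difficult to replace with anything else, the technique has high credibility; because it extracts the sequences in a way that reflects control flow, it is also robust against obfuscation. To evaluate the proposed birthmark technique, we conducted experiments on credibility and resilience, and confirmed that it outperforms existing birthmarks on both.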

Improving Compatibility Method of New Vworld 3D Data Using the Serialization Technique (데이터 직렬화 기법을 활용한 차세대 브이월드 3차원 데이터의 호환성 개선 방안)

  • KANG, Ji-Hun;KIM, Hyeon-Deok;KIM, Jeong-Taek
    • Journal of the Korean Association of Geographic Information Studies / v.21 no.1 / pp.96-105 / 2018
  • Vworld, the spatial information open platform map service, provides a variety of national spatial data. Recently, with advances in IT, demand has been increasing for 3D geospatial data that can be combined with new industries such as the Internet of Things (IoT) and autonomous vehicles. Because 3D geospatial data is large and complex, providing map services consumes substantial computing resources. Most 3D map services, such as Vworld, store their data in binary form for performance reasons. However, this type of data has poor compatibility, because it is difficult to use in other services without a precise understanding of the specification. In this paper, we propose a data serialization method to improve the compatibility of the new Vworld 3D format, which is constructed in binary form. The performance of the binary data and the serialized binary data is tested and compared. The results suggest that data using the serialization technique is expected to perform similarly to the binary data while contributing to improved compatibility.