http://dx.doi.org/10.13089/JKIISC.2012.22.6.1325

Research on the Classification Model of Similarity Malware using Fuzzy Hash  

Park, Changwook (Center for Information Security Technologies, Korea University)
Chung, Hyunji (Center for Information Security Technologies, Korea University)
Seo, Kwangseok (Korea Information Security Education Center)
Lee, Sangjin (Center for Information Security Technologies, Korea University)
Abstract
In the past, about 10 different kinds of malicious code were found per day on average. However, the number of malicious codes found has increased rapidly, reaching over 55,000 during the last 10 years. Most of these are not new kinds of malicious code but new variants of existing ones, created by adding some functions to existing malicious code or by modifying it to evade anti-virus detection. To deal with this volume of new malicious codes and variants, we need to measure their similarity to previously found malicious codes and classify them as either new malicious codes or variants of existing ones. Previous methods for calculating similarity compare external factors such as IPs, URLs, APIs, and strings, or compare at the source code level. These methods take more time as the number of malicious codes and comparable factors grows, which has led to the use of fuzzy hashing to reduce the amount of calculation. Existing fuzzy hashing, however, has some limitations that cause problems for the similarity calculation. Therefore, this paper proposes a new comparison method for malicious codes that improves the performance of similarity calculation using fuzzy hashing, along with a classification method that employs the new comparison method.
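To make the general idea of fuzzy-hash similarity concrete, the following is a minimal Python sketch. It is not the comparison method proposed in the paper, and it is not the ssdeep algorithm; it is a simplified illustration written under stated assumptions. The names `toy_fuzzy_hash`, `similarity`, and `classify`, the window size, the block size, and the threshold of 60 are all hypothetical choices made for this example.

```python
import hashlib
from difflib import SequenceMatcher

# Hypothetical parameters chosen for illustration only.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
WINDOW = 7  # number of trailing bytes that decide a piece boundary


def toy_fuzzy_hash(data: bytes, block_size: int = 16) -> str:
    """Cut data at content-defined boundaries and emit one character per piece."""
    signature = []
    piece_start = 0
    for i in range(len(data)):
        window = data[max(0, i - WINDOW + 1): i + 1]
        # The boundary depends only on the last WINDOW bytes, so a local edit
        # changes only nearby pieces instead of the whole signature.
        if sum(window) % block_size == block_size - 1:
            piece = data[piece_start: i + 1]
            signature.append(ALPHABET[hashlib.sha256(piece).digest()[0] % 64])
            piece_start = i + 1
    if piece_start < len(data):
        # Hash whatever remains after the last boundary.
        piece = data[piece_start:]
        signature.append(ALPHABET[hashlib.sha256(piece).digest()[0] % 64])
    return "".join(signature)


def similarity(sig_a: str, sig_b: str) -> int:
    """Return a 0-100 score based on how much of the two signatures match."""
    matcher = SequenceMatcher(None, sig_a, sig_b, autojunk=False)
    return round(100 * matcher.ratio())


def classify(sample_sig: str, known_families: dict, threshold: int = 60):
    """Return (family, score) for the best match, or (None, score) if below threshold."""
    best_family, best_score = None, 0
    for family, family_sig in known_families.items():
        score = similarity(sample_sig, family_sig)
        if score > best_score:
            best_family, best_score = family, score
    if best_score >= threshold:
        return best_family, best_score
    return None, best_score


if __name__ == "__main__":
    original = bytes(range(256)) * 4          # stand-in for a known sample
    variant = original + b"appended payload"  # stand-in for a modified variant
    families = {"family_a": toy_fuzzy_hash(original)}
    print(classify(toy_fuzzy_hash(variant), families))
```

In this sketch a sample whose best score stays below the threshold is reported as unmatched, which corresponds to treating it as a candidate new malicious code rather than a variant. A real system would use an established fuzzy hashing tool and a tuned threshold.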
Keywords
Fuzzy Hash; Malware; Similarity