http://dx.doi.org/10.9717/kmms.2013.16.7.802

A Similarity Measurement and Visualization Method for the Analysis of Program Code  

Lee, Youngjoo (Manufacturing Technology Research Institute, Samsung Electronics)
Lee, Jeongjin (School of Computing, Soongsil University)
Abstract
In this paper, we propose a method for measuring the similarity between two programs by counting the frequency and length of contiguous patterns of specifiers and keywords that occur in both. We also propose a method for visualizing the result of this analysis using formal concept analysis. The proposed method takes into account the adjacency of specifiers and keywords, which previous similarity measures have not considered. Because it analyzes the patterns within each function, it can detect plagiarism regardless of the order in which functions are called and executed. The similarity result is visualized as a formal concept analysis lattice to help users understand the relations among programs. In our experiments, the proposed method succeeded in 96% of plagiarism detections. The method could also be applied to the analysis of general documents.
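The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea: reduce each program to its sequence of specifiers and keywords, then score the contiguous runs the two sequences share, weighting each run by its length and its frequency. The `KEYWORDS` set, the `max_len` cutoff, the length weighting, and the normalization are all assumptions made for illustration and are not taken from the paper.

```python
import re
from collections import Counter

# Hypothetical token set (assumption): the paper's actual list of
# specifiers and keywords is not given in the abstract.
KEYWORDS = {"int", "float", "char", "void", "if", "else",
            "for", "while", "return"}


def keyword_sequence(code):
    """Keep only specifier/keyword tokens, in source order."""
    tokens = re.findall(r"[A-Za-z_]\w*", code)
    return [t for t in tokens if t in KEYWORDS]


def run_counts(seq, max_len):
    """Count every contiguous run of 1..max_len tokens in seq."""
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(seq) - n + 1):
            counts[tuple(seq[i:i + n])] += 1
    return counts


def similarity(code_a, code_b, max_len=4):
    """Length-weighted overlap of shared runs, scaled to [0, 1].

    The weighting and normalization are illustrative assumptions.
    """
    a = run_counts(keyword_sequence(code_a), max_len)
    b = run_counts(keyword_sequence(code_b), max_len)
    # A shared run contributes its length times the number of
    # occurrences that both programs have in common.
    shared = sum(len(p) * min(a[p], b[p]) for p in a.keys() & b.keys())
    total = max(sum(len(p) * c for p, c in a.items()),
                sum(len(p) * c for p, c in b.items()))
    return shared / total if total else 0.0


original = "int f(int x) { if (x) return x; return 0; }"
renamed = "int g(int y) { if (y) return y; return 0; }"
print(similarity(original, renamed))
```

Because identifiers are discarded and only adjacent keyword runs are compared, renaming variables and functions leaves the score in this sketch unchanged. Applying `similarity` to each pair of function bodies, rather than to whole files, would be one way to make the comparison independent of function order, as the abstract describes.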
Keywords
Similarity Measurement; Formal Concept Analysis; Pattern Analysis; Concept Lattice
Citations & Related Records
Times Cited by KSCI: 3