• Title/Summary/Keyword: code clone (코드클론)

Automatic Generation of Code-clone Reference Corpus (코드클론 표본 집합체 자동 생성기)

  • Lee, Hyo-Sub;Doh, Kyung-Goo
    • Journal of Software Assessment and Valuation / v.7 no.1 / pp.29-39 / 2011
  • To evaluate the quality of clone detection tools, we should know how many clones each tool misses. Hence we need a standard code-clone reference corpus for a carefully chosen set of sample source code. The reference corpus available so far has been built by manually collecting clones from the results of various existing tools. This paper presents a tree-pattern-based clone detection tool that can be used to generate a reference corpus automatically. Our tool is compared with CloneDR for precision and with Bellon's reference corpus for recall. Our tool reports no false positives and finds 2 to 3 times more clones than CloneDR. Measured against Bellon's reference corpus, our tool shows a recall rate of 93% to 100% and detects far more clones.
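
The precision and recall comparison described in this abstract can be illustrated with a minimal Python sketch. It assumes clone pairs are matched by exact identity of their two fragments, which is a simplification; the function name `precision_recall` and the fragment labels are hypothetical and do not come from the paper.

```python
def precision_recall(detected, reference):
    """Return (precision, recall) of detected clone pairs against a reference set.

    Each clone pair is an unordered pair of fragment labels. Exact matching of
    pairs is an assumption made for this sketch.
    """
    detected = {frozenset(pair) for pair in detected}
    reference = {frozenset(pair) for pair in reference}
    hits = detected & reference
    precision = len(hits) / len(detected) if detected else 0.0
    recall = len(hits) / len(reference) if reference else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical fragments written as "file:first_line-last_line".
    detected = [
        ("a.c:10-20", "b.c:5-15"),
        ("a.c:30-40", "c.c:1-11"),
        ("b.c:50-60", "c.c:70-80"),
    ]
    reference = [
        ("b.c:5-15", "a.c:10-20"),
        ("a.c:30-40", "c.c:1-11"),
    ]
    print(precision_recall(detected, reference))
```

A tool that misses clones lowers recall, while a tool that reports pairs absent from the reference lowers precision, which is why the quality of the reference corpus matters for both numbers.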

CCR : Tree-pattern based Code-clone Detector (CCR : 트리패턴 기반의 코드클론 탐지기)

  • Lee, Hyo-Sub;Do, Kyung-Goo
    • Journal of Software Assessment and Valuation / v.8 no.2 / pp.13-27 / 2012
  • This paper presents CCR (Code Clone Ransacker), a tree-pattern-based code-clone detector that finds all clustered duplicate patterns by comparing every pair of subtrees in the programs. A pattern included in its entirety in another pattern is ignored, since only the largest duplicate patterns are of interest. In our evaluation, CCR shows high precision and recall. Previous tree-pattern-based code-clone detectors are known to have good precision and recall because they compare program structure. CCR retains high precision while achieving up to 5 times higher recall than Asta and about 1.9 times higher recall than CloneDigger. The tool also finds the majority of the clones in Bellon's reference corpus.
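
The idea of comparing subtrees and keeping only the largest duplicates can be sketched as follows. This is not CCR's algorithm: it assumes exact structural equality (via Python's `ast.dump`) as a stand-in for tree patterns, works only on Python source, and uses the hypothetical names `duplicate_subtrees` and `min_size`.

```python
import ast
from collections import defaultdict


def subtree_size(node):
    """Number of AST nodes in the subtree rooted at node."""
    return sum(1 for _ in ast.walk(node))


def duplicate_subtrees(source, min_size=5):
    """Group structurally identical subtrees, keeping only maximal duplicates."""
    tree = ast.parse(source)

    # Bucket every sufficiently large subtree by its structural dump.
    # ast.dump omits line numbers by default, so equal code gives equal keys.
    groups = defaultdict(list)
    for node in ast.walk(tree):
        if subtree_size(node) >= min_size:
            groups[ast.dump(node)].append(node)
    duplicates = [nodes for nodes in groups.values() if len(nodes) > 1]

    # Collect every node that lies strictly inside some duplicated subtree.
    covered = set()
    for nodes in duplicates:
        for node in nodes:
            for inner in ast.walk(node):
                if inner is not node:
                    covered.add(id(inner))

    # Drop a group when all of its occurrences sit inside larger duplicates.
    return [nodes for nodes in duplicates
            if not all(id(n) in covered for n in nodes)]
```

Each returned group is a list of AST nodes; `node.lineno` can be used to report where an occurrence starts when the node carries position information. Because identifiers are part of the dump, this sketch only finds exact copies, whereas a tree-pattern detector would also tolerate differing subtrees.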

Cross-Language Clone Detection based on Common Token (공통 토큰에 기반한 서로 다른 언어의 유사성 검사)

  • Hong, Sung-Moon;Kim, Hyunha;Lee, Jaehyung;Park, Sungwoo;Mo, Ji-Hwan;Doh, Kyung-Goo
    • Journal of Software Assessment and Valuation / v.14 no.2 / pp.35-44 / 2018
  • Tools for detecting cross-language clones usually compare abstract-syntax-tree representations of source code, an approach that lacks scalability. To compare large bodies of source code at a practical scale, we need a similarity-checking technique that works at the token level. In this paper, we define common tokens that represent the tokens commonly used across programming languages of different paradigms. Source code in each language is then transformed into a list of common tokens, and these lists are compared. Experimental results using exEyes show that the proposed common-token method is effective in detecting cross-language clones.
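
A minimal sketch of the common-token idea follows. The abstract does not list the token set or the comparison algorithm, so the `COMMON` table, the dropped punctuation, and the use of `difflib.SequenceMatcher` are all assumptions made for illustration; none of this reflects how exEyes works.

```python
import re
from difflib import SequenceMatcher

# Hypothetical common-token table: keywords from different languages that
# play the same role are mapped to one shared token.
COMMON = {
    "def": "FUNC", "function": "FUNC",
    "if": "IF", "else": "ELSE", "elif": "ELSE",
    "while": "LOOP", "for": "LOOP",
    "return": "RETURN",
}
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")
DROP = set("{}();:,")  # punctuation treated as purely syntactic (assumption)


def to_common_tokens(source):
    """Turn source text into a list of language-neutral tokens."""
    tokens = []
    for tok in TOKEN_RE.findall(source):
        if tok in DROP:
            continue
        if tok in COMMON:
            tokens.append(COMMON[tok])
        elif tok[0].isdigit():
            tokens.append("NUM")
        elif tok[0].isalpha() or tok[0] == "_":
            tokens.append("ID")
        else:
            tokens.append(tok)  # operators are kept as they are
    return tokens


def similarity(source_a, source_b):
    """Similarity in [0, 1] between the common-token lists of two sources."""
    a, b = to_common_tokens(source_a), to_common_tokens(source_b)
    return SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    python_code = "def add(a, b):\n    return a + b\n"
    javascript_code = "function add(a, b) { return a + b; }"
    print(to_common_tokens(python_code))
    print(similarity(python_code, javascript_code))
```

Working on flat token lists avoids building and matching syntax trees, which is the scalability argument the abstract makes.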

Improvement of BigCloneBench Using Tree-Based Convolutional Neural Network (트리 기반 컨볼루션 신경망을 이용한 BigCloneBench 개선)

  • Park, Gunwoo;Hong, Sung-Moon;Kim, Hyunha;Doh, Kyung-Goo
    • Journal of Software Assessment and Valuation / v.15 no.1 / pp.43-53 / 2019
  • BigCloneBench has recently been used to evaluate the performance of code-clone detection tools that use machine learning. However, since BigCloneBench is not a benchmark optimized for machine learning, incorrect training data can be created from it. In this paper, we show through machine-learning experiments that Type-4 clone methods can be found in addition to the set provided by BigCloneBench. Experimental results using a Tree-Based Convolutional Neural Network show that our proposed method is effective in improving BigCloneBench's dataset.

Design of a protocol for preventing mobile agent clone from execution (이동에이전트의 클론 실행 방지하는 프로토콜 설계)

  • ;;R.Ramakrishna
    • Proceedings of the Korean Information Science Society Conference / 1998.10a / pp.311-313 / 1998
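  • A mobile agent is a program that, as an independent object, moves autonomously between computers and carries out its assigned task. Because a mobile agent is a program composed of code and data, it can easily be copied. Such a copied mobile agent is called a mobile agent clone. A clone cannot be distinguished from the original. This makes authentication of the agent impossible, causes unexpected duplicate execution of the agent, and serves as a means of attack for leaking the agent's internal information. This paper examines these problems caused by mobile agent clones and designs a protocol that detects the existence of clones online, prevents their execution, and identifies the server that created the clone.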

A Code Clustering Technique for Unifying Method Full Path of Reusable Cloned Code Sets of a Product Family (제품군의 재사용 가능한 클론 코드의 메소드 경로 통일을 위한 코드 클러스터링 방법)

  • Kim, Taeyoung;Lee, Jihyun;Kim, Eunmi
    • KIPS Transactions on Software and Data Engineering / v.12 no.1 / pp.1-18 / 2023
  • Similar software is often developed with the Clone-And-Own (CAO) approach, which copies and modifies existing artifacts. The CAO approach is considered bad practice because it makes maintenance difficult as the number of cloned products increases. Software product line engineering is a methodology that can solve this problem by developing a product family through systematic reuse. Migrating product families developed with the CAO approach to product line engineering begins with finding the clones, integrating them, and building them into reusable assets. However, cloning occurs at various levels, from directories to code lines, and the structures of the clones can change. This makes it difficult to build a product line code base simply by finding clones. Successful migration thus requires unifying the source code's file paths, class names, and method signatures. This paper proposes a clustering method that identifies sets of similar code scattered across product variants whose method full paths differ and therefore need to be unified. To show the effectiveness of the proposed method, we conducted an experiment using the Apo Games product line, which has evolved with the CAO approach. As a result, the average precision of clustering performed without preprocessing was 0.91 and the number of identified common clusters was 0, whereas our method achieved 0.98 and 15, respectively.

Enhancing the performance of code-clone detection tools using code2vec (code2vec을 이용한 유사도 감정 도구의 성능 개선)

  • Um, Taeho;Hong, Sung Moon;Yang, Joon Hyuk;Jang, Hyo Seok;Doh, Kyung-Goo
    • Journal of Software Assessment and Valuation / v.17 no.1 / pp.31-40 / 2021
  • Plagiarism refers to the act of using original material as if it were one's own without revealing the source. Plagiarism of source code causes a variety of problems, including legal disputes. Plagiarism in software projects is usually determined by measuring similarity between every pair of source files in two projects. However, blindly comparing every pair imposes a huge computational burden, which is a major reason why more accurate tools are not used. If we compare only the pairs that are likely to be clones and eliminate the pairs that cannot be clones, we can concentrate more on improving the accuracy of detection. In this paper, we propose a method of selecting highly probable candidate clone pairs by pre-classifying suspected source code using a machine-learning model called code2vec.
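
The candidate-selection step can be sketched as a cheap filter that runs before an expensive detector. The sketch below assumes each code unit already has an embedding vector (for example, one produced by a model such as code2vec); it does not call code2vec itself, and the names `candidate_pairs` and `threshold` as well as the sample vectors are hypothetical.

```python
import math
from itertools import product


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def candidate_pairs(project_a, project_b, threshold=0.8):
    """Return name pairs whose embeddings are similar enough to inspect further.

    project_a and project_b map a code-unit name to its embedding vector.
    """
    return [
        (name_a, name_b)
        for (name_a, vec_a), (name_b, vec_b)
        in product(project_a.items(), project_b.items())
        if cosine(vec_a, vec_b) >= threshold
    ]


if __name__ == "__main__":
    # Made-up two-dimensional embeddings, for illustration only.
    project_a = {"A.sort": [1.0, 0.0], "A.log": [0.0, 1.0]}
    project_b = {"B.sort": [0.9, 0.1], "B.net": [-1.0, 0.0]}
    print(candidate_pairs(project_a, project_b))
```

The vector comparison still touches every pair, but it is far cheaper than a full similarity analysis, so only the surviving pairs need to be passed to the more accurate and more expensive tool.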

Numerical Study of Turbulent Flow in a Hydrocyclone (하이드로사이클론 내의 난류유동해석)

  • Ju, Jong-Il;Choi, Young-Seok;Lee, Yong-Kab;Kim, Tak-Hyun;Kim, sangyong
    • The KSFM Journal of Fluid Machinery / v.6 no.2 s.19 / pp.34-40 / 2003
  • Numerical studies were conducted to predict the solid-liquid separation efficiency of turbulent flow in a hydrocyclone using a commercial CFD code. To validate the CFD code, several preliminary calculations were carried out to determine the influence of parameters such as grid systems, numerical schemes, and turbulence models. The numerical studies were performed on hydrocyclones with different vortex finder geometries while varying the mass flow rate, and the results were compared with experimental data. The results show that the CFD code can be used as a design tool to improve the performance of hydrocyclones.

Implementation of Open Source-based SIC/XE Computer Simulator for Educational Laboratory (오픈소스 기반의 실습용 SIC/XE 컴퓨터 시뮬레이터의 구현)

  • Kim, Ju Hyun;Kim, Hyun Ah;Moon, Bongkyo
    • Proceedings of the Korea Information Processing Society Conference / 2017.11a / pp.2-5 / 2017
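  • Existing assemblers were visually inconvenient, offered no features for user convenience, and had compatibility problems with modern computers. An open-source SIC/XE assembler simulator that emerged as a solution to these problems was cloned from GitHub, analyzed, and tested. In this paper, we analyze and fix various errors in the open-source SIC/XE assembler simulator. In addition, by visualizing the literal table, symbol table, object code, and error messages, we improve the existing SIC/XE assembler simulator and implement an educational SIC/XE assembler simulator with better usability.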

Cloning and Expression of A Bacillus licheniformis Cellulase Gene (Bacillus licheniformis WL-12의 cellulase 유전자 클로닝과 발현)

  • Yoon, Ki-Hong
    • Korean Journal of Microbiology / v.42 no.4 / pp.313-318 / 2006
  • A thermophilic bacterium producing an extracellular cellulase was isolated from soybean paste, and the isolate WL-12 was identified as Bacillus licheniformis on the basis of its 16S rRNA sequence, morphology, and biochemical properties. A gene encoding the cellulase of B. licheniformis WL-12 was cloned and its nucleotide sequence was determined. This cellulase gene, designated celA, consisted of 1,551 nucleotides, encoding a polypeptide of 517 amino acid residues. The gene product contained a catalytic domain and a cellulose-binding domain. The deduced amino acid sequence was highly homologous to those of cellulases of B. licheniformis, B. subtilis, and B. amyloliquefaciens belonging to glycosyl hydrolase family 5. When the celA gene was highly expressed using a strong B. subtilis promoter, the extracellular cellulase was produced at up to 7.0 units/ml in B. subtilis WB700.