• Title/Summary/Keyword: 소프트웨어 유사도 (software similarity)

Search Result 816, Processing Time 0.029 seconds

Measuring Similarity of Android Applications Using Method Reference Frequency and Manifest Information (메소드 참조 빈도와 매니페스트 정보를 이용한 안드로이드 애플리케이션들의 유사도 측정)

  • Kim, Gyoosik;Hamedani, Masoud Reyhani;Cho, Seong-je;Kim, Seong Baeg
    • The Journal of Korean Institute of Next Generation Computing / v.13 no.3 / pp.15-25 / 2017
  • As the value and importance of software grow, software theft and piracy are becoming a much larger problem. Tackling it requires an accurate method for detecting software theft and piracy. In particular, although software theft is relatively easy in the case of Android applications (apps), Android markets do not properly screen for illegal apps. In this paper, we propose a method to effectively measure the similarity between Android apps in order to detect software theft at the executable file level. The proposed method uses static analysis of executable Android apps to extract method reference frequency and manifest information as the main features for similarity measurement. Each app is represented as an n-dimensional vector of these features, and cosine similarity is used as the similarity measure. We demonstrate the effectiveness of the method by comparing its accuracy with that of typical source code-based similarity measurement methods. In experiments on Android apps for which both source files and executable files are available, the similarity measured at the executable file level was almost equivalent to the well-known similarity measured at the source file level.
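
The vector comparison described in this abstract can be illustrated with a minimal sketch. It assumes each app has already been reduced to a mapping from method reference to frequency; the method names and counts below are invented for illustration, and the paper's actual feature extraction and manifest features are not reproduced here.

```python
from collections import Counter
from math import sqrt

def cosine_similarity(freq_a, freq_b):
    """Cosine similarity between two sparse frequency vectors (dicts)."""
    keys = set(freq_a) | set(freq_b)
    dot = sum(freq_a.get(k, 0) * freq_b.get(k, 0) for k in keys)
    norm_a = sqrt(sum(v * v for v in freq_a.values()))
    norm_b = sqrt(sum(v * v for v in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical method reference counts for three apps (not real data).
app_a = Counter({"Activity.onCreate": 3, "String.length": 5, "Log.d": 2})
app_b = Counter({"Activity.onCreate": 3, "String.length": 4, "Log.d": 2})
app_c = Counter({"Socket.connect": 7})

print(cosine_similarity(app_a, app_b))  # close to 1: nearly identical profiles
print(cosine_similarity(app_a, app_c))  # 0.0: no shared method references
```

Because cosine similarity depends only on the direction of the vectors, two apps that call the same methods in the same proportions score near 1 even if one is much larger than the other.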

Enhancing the performance of code-clone detection tools using code2vec (code2vec을 이용한 유사도 감정 도구의 성능 개선)

  • Um, Taeho;Hong, Sung Moon;Yang, Joon Hyuk;Jang, Hyo Seok;Doh, Kyung-Goo
    • Journal of Software Assessment and Valuation / v.17 no.1 / pp.31-40 / 2021
  • Plagiarism is the act of using original material as if it were one's own without revealing the source. Plagiarism of source code causes a variety of problems, including legal disputes. Plagiarism in software projects is usually determined by measuring similarity, comparing every pair of source code units across the two projects. However, blindly comparing every pair imposes a huge computational burden, which is a major reason why more accurate tools go unused. If we compare only the pairs that are likely to be clones and eliminate the pairs that cannot be clones, we can concentrate on improving detection accuracy. In this paper, we propose a method that selects highly probable candidate clone pairs by pre-classifying suspected source code with a machine-learning model called code2vec.

A Study on the Degree of Signature Similarity between Domain Model and Object Candidate Groups (영역 모델과 객체후보군의 유사도 측정에 관한 연구)

  • 박성옥;노경주;이문근
    • Proceedings of the Korean Information Science Society Conference / 1999.10a / pp.522-524 / 1999
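  • There are several ways to convert procedural software into object-oriented software. Program conversion generally relies on the relationships among functions, variables, and data types. Using these relationships yields object candidates. The generated object candidates must then be compared with the domain model created by a domain expert to measure the similarity between the two models. In this paper, we measure the similarity between a class and an object candidate using the class signature (class name, attribute names, attribute data types, method names, method return types, and method parameter data types), and we measure the similarity of an object candidate group using the average of the measured similarities. Unlike existing approaches, we present a method that measures not only the syntactic similarity but also the semantic similarity between n classes and m object candidates, so that the best-fit object candidate group can be extracted.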

Appraisal Method for Similarity of Large File Transfer Software (대용량 파일 전송 소프트웨어의 동일성 감정 방법)

  • Chun, Byung-Tae
    • Journal of Software Assessment and Valuation / v.17 no.1 / pp.11-16 / 2021
  • The importance of software is increasing with the development of information and communication, and software copyright disputes are increasing as well. In this paper, the scope of analysis covers the source code of the submitted programs and the files needed to run them. The large-capacity file transfer solution under analysis provides additional functions such as confidentiality, integrity, user authentication, and non-repudiation through digital signatures and data encryption. We analyze three programs: program A, program B, and program C. To calculate the program similarity rate, we analyze the similarity of the package structure, package names, source file names in each package, variable names in the source files, function names, function implementation source code, and product environment variable information. We also calculate the overall similarity rate of the programs. To check the degree of agreement between package structures and package names, similarity was determined by comparing folder structures. We also analyze the extent to which the package structures and package names match and the extent to which the source file (class) names within each package match.

Design and Implementation of Computer Engineering Technical Interview Support System (컴퓨터 공학 기술 면접 지원 시스템의 설계 및 구현)

  • Dong-Hyun Lee;Seung-Min Park;Dong-Hyun Kim
    • The Journal of the Korea institute of electronic communication sciences / v.19 no.3 / pp.603-608 / 2024
  • Recently, computer engineering technical interviews have become more frequent in developer hiring, and the burden they place on interviewees has grown accordingly. However, when practicing for such interviews, it is difficult to judge on one's own whether one's answers are correct or to gauge an appropriate speaking speed. In this paper, we propose a computer engineering technical interview support system that uses similarity measurement technology. The proposed system measures the technical accuracy of the interviewee's answers through a sentence similarity evaluation procedure based on cosine similarity. It also measures the speech rate and reports it to the interviewee.

Analysis Method for Revision and Addition of the Specification to Appraisal (감정 대상 규격서의 수정 및 추가에 대한 분석 방법)

  • Chun, Byung-Tae
    • Journal of Software Assessment and Valuation / v.16 no.2 / pp.37-44 / 2020
  • As the information society develops, various cases of copyright infringement have occurred. In many disputes between companies, software similarity appraisal predominates. This paper studies a method for calculating the similarity of specifications subject to appraisal; in other words, it calculates the amount of revision and addition in the specification being assessed. The analysis method compares the tables of contents of both specifications and finds the parts that are the same or similar. A similarity weight is determined according to the degree of similarity. Weights are identified and assigned according to the expert's expertise and the degree of similarity in the specification. A part that is completely new receives a similarity weight of 1, a part that is partially modified receives 0.4, and a part that is almost the same as before receives 0.05. Applying this method, the calculated result for the specification came to 21.2 pages.
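
The weighting scheme in this abstract can be sketched as a weighted page count. The three weights (1, 0.4, 0.05) come from the abstract; the section labels, the page counts, and the assumption that the final figure is a plain sum of pages times weight are illustrative guesses, not the paper's documented procedure or data.

```python
# Weights quoted in the abstract; the category names are hypothetical labels.
WEIGHTS = {"added": 1.0, "modified": 0.4, "unchanged": 0.05}

def weighted_pages(sections):
    """Sum page counts scaled by the weight of each section's change type.

    sections: iterable of (change_type, page_count) pairs.
    """
    return sum(pages * WEIGHTS[kind] for kind, pages in sections)

# Hypothetical specification: 10 new pages, 5 partially modified, 20 nearly unchanged.
sections = [("added", 10), ("modified", 5), ("unchanged", 20)]
print(weighted_pages(sections))
```

Under these assumptions, a largely unchanged section contributes very little to the total, while newly added material counts in full.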

Applying Coarse-to-Fine Curriculum Learning Mechanism to the multi-label classification task (다중 레이블 분류 작업에서의 Coarse-to-Fine Curriculum Learning 메카니즘 적용 방안)

  • Kong, Heesan;Park, Jaehun;Kim, Kwangsu
    • Proceedings of the Korean Society of Computer Information Conference / 2022.07a / pp.29-30 / 2022
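  • Curriculum learning is a method of training a model by introducing a kind of 'curriculum', similar to the human learning process, in order to improve deep learning performance. Most studies focus on training the model progressively according to the difficulty of individual samples in the training data. The coarse-to-fine mechanism, however, argues that the similarity of the classes used in training matters more than the difficulty of the data, and it proposed learning auxiliary tasks of varying difficulty in sequence. Because this method generates auxiliary tasks by judging class similarity from a confusion matrix, it is difficult to apply to multi-label classification. This paper therefore proposes a method for generating multi-class and binary tasks in a multi-label setting, presents a way to apply the coarse-to-fine mechanism, and analyzes the results.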

Design of Document Suggestion System based on TF-IDF Algorithm for Efficient Organization of Documentation (효율적인 문서 구성을 위한 TF-IDF 알고리즘 기반 문서 제안 시스템의 설계)

  • Kim, Young-Hoon;Park, Seung-Min;Cho, Dae-Soo
    • Proceedings of the Korean Society of Computer Information Conference / 2022.07a / pp.527-528 / 2022
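  • As lifelong education becomes common in a rapidly changing environment, the amount of learning required of individuals is growing, and choosing learning methods that shorten study time and make learning efficient is becoming increasingly important. In this paper, we propose a system that analyzes a document written to organize what has been learned, suggests documents related to it, and lets the user combine them with the original document into a document bundle for study. The system analyzes documents with TF-IDF, which can yield document similarity and importance, extracts keywords, then suggests documents related to those keywords and lets the user create and browse document bundles. The system allows related documents to be viewed together when organizing study notes and, if needed, bundled for use as a tool for effective learning.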
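
The TF-IDF step described in this abstract can be sketched as follows. This is a minimal standard-library version that assumes documents are already tokenized; the function names, the sample notes, and the use of cosine similarity to pick a related document are assumptions for illustration, not the system's actual design.

```python
from collections import Counter
from math import log, sqrt

def tfidf_vectors(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: the number of documents in which each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

def top_keywords(vector, k=3):
    """Terms with the highest TF-IDF weight in one document."""
    return sorted(vector, key=vector.get, reverse=True)[:k]

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = sqrt(sum(w * w for w in u.values())) * sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

def suggest_related(index, vectors):
    """Index of the other document most similar to vectors[index]."""
    others = [i for i in range(len(vectors)) if i != index]
    return max(others, key=lambda i: cosine(vectors[index], vectors[i]))

# Hypothetical study notes, already split into tokens.
notes = [
    ["stack", "queue", "stack"],
    ["queue", "heap"],
    ["graph", "tree"],
]
vectors = tfidf_vectors(notes)
print(top_keywords(vectors[0], k=1))
print(suggest_related(0, vectors))
```

Terms that appear in every document receive a weight of zero under this formula, so the extracted keywords favor terms that distinguish one note from the others.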

Similarity Detection in Object Codes and Design of Its Tool (목적 코드에서 유사도 검출과 그 도구의 설계)

  • Yoo, Jang-Hee
    • Journal of Software Assessment and Valuation / v.16 no.2 / pp.1-8 / 2020
  • Detecting similarity for plagiarism or duplication of computer programs requires different analysis methods and tools depending on the programming language used in the implementation and the kind of code to be analyzed. In recent years, similarity appraisal of object code in embedded systems has been increasing; compared with source code, it requires considerable resources, a more complicated procedure, and more advanced skills. In this study, we describe a method for analyzing the similarity of functional units in assembly language, obtained by converting the object code with a reverse engineering approach such as reverse assembly. An instruction and operand table for comparing similarity is generated through syntax analysis of the assembly code, and a tool for detecting the similarity is designed.

A Study on the Relationship between Weighted Value and Qualitative Standard in Substantial Similarity (실질적 유사성 판단을 위한 가중치 활용과 질적 분석의 관계)

  • Kim, Si-Yeol
    • Journal of Software Assessment and Valuation / v.15 no.1 / pp.25-35 / 2019
  • In Korea, the calculation of quantitative similarity is commonly used to gauge the substantial similarity of computer programs. Substantial similarity should be assessed by considering the quantity and quality of areas that show similarity, but in practice, qualitative aspects are reflected by multiplying the weighted value in the calculation of quantitative similarity. However, such a practical method cannot be deemed adequate, considering the fundamental characteristic of the judgment on substantial similarity, which holds that the quantitative and qualitative aspects of similar areas should be considered on an equal footing. Thus, this study pointed out the issue regarding the use of weighted value and sought appropriate ways to take into account qualitative aspects when assessing the substantial similarity of computer programs.