• Title/Summary/Keyword: Plagiarism Detection

A Plagiarism Detection Technique for Java Program Using Bytecode Analysis (바이트코드 분석을 이용한 자바 프로그램 표절검사기법)

  • Ji, Jeong-Hoon;Woo, Gyun;Cho, Hwan-Gue
    • Journal of KIISE:Software and Applications / v.35 no.7 / pp.442-451 / 2008
  • Most plagiarism detection systems evaluate the similarity of source code to detect plagiarized program pairs. Using source code for plagiarism detection, however, can raise a significant source code security problem. Plagiarism detection based on target code can protect the security of the source code. In this paper, we propose a new plagiarism detection technique for Java programs that uses bytecode without referring to the source code. The bytecode-based procedure consists of two major steps. First, we generate token sequences from the Java class file by analyzing the code area of its methods. Then, we evaluate the similarity between token sequences using adaptive local alignment. The experimental results show that the distribution of source code similarities and that of bytecode similarities are very similar. Also, the correlation between the similarities of source code pairs and those of bytecode pairs is high for typical test data. The bytecode-based plagiarism detection system can be used as a preliminary verification tool before plagiarism is checked by source code comparison.
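
The second step described in this abstract, scoring the similarity of two token sequences by local alignment, might look roughly like the sketch below. It uses plain Smith-Waterman scoring, not the paper's adaptive variant, whose details the abstract does not give. The function names, the scoring weights, and the hand-written bytecode mnemonics are all assumptions made for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best Smith-Waterman local alignment score between two token lists.

    The weights are illustrative assumptions, not values from the paper.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def similarity(a, b, match=2, mismatch=-1, gap=-1):
    """Normalize the score by the best score the shorter sequence could reach."""
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 0.0
    return local_alignment_score(a, b, match, mismatch, gap) / (match * shorter)


# Hypothetical token sequences; a real system would extract them from class files.
original = ["aload_0", "iload_1", "iadd", "ireturn"]
suspect = ["nop", "aload_0", "iload_1", "iadd", "ireturn"]
print(similarity(original, suspect))
```

Normalizing by the shorter sequence means that a program copied whole into a larger one still scores high. That is a design choice of this sketch and is not taken from the paper.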

A Study on Plagiarism Detection and Document Classification Using Association Analysis (연관분석을 이용한 효과적인 표절검사 및 문서분류에 관한 연구)

  • Hwang, Insoo
    • The Journal of Information Systems / v.23 no.3 / pp.127-142 / 2014
  • Plagiarism occurs when content is copied without permission or citation, and the problem has grown rapidly with the abundance of digital resources available on the World Wide Web. An important task in plagiarism detection is measuring and identifying similar text portions between a given pair of documents. One of the main difficulties of this task is that not all similar text fragments are examples of plagiarism, since thematic coincidences also tend to produce portions of similar text. To handle this problem, this paper proposes association analysis, a data mining technique, to detect plagiarism. The method is able to detect common actions performed by plagiarists, such as word deletion, insertion, and transposition, making it possible to obtain plausible portions of plagiarized text. Experimental results employing an unsupervised document classification strategy showed that the proposed method outperformed traditionally used approaches.

Bit Signature Method for Plagiarism Detection (표절 탐지를 위한 비트 시그니처 기법)

  • Kim, Woosaeng;Kang, Kyucheol
    • Journal of Information Technology Applications and Management / v.24 no.1 / pp.1-10 / 2017
  • Recently, plagiarism has emerged as a major social issue because not only literature but also theses have become targets of plagiarism. The government even requires confirmation that the theses of high-ranking officials are free of plagiarism as a standard of their ethics. Plagiarism is not just direct copying but also paraphrasing, rewording, adapting parts, missing references, or wrong citations. This makes the problem more difficult to handle adequately. We propose a plagiarism detection scheme called a bit signature, in which each unique word of a document is represented by 0 or 1. The bit signature scheme can find similar documents by comparing their absolute and relative bit signatures. Experiments show that the bit signature scheme performs better for document copy detection than existing similar schemes.
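
The core idea stated in this abstract, one bit per unique word, can be illustrated with the rough sketch below. It assumes a fixed shared vocabulary and compares signatures by the ratio of shared set bits to all set bits. The paper's absolute and relative signature comparisons are not described in the abstract and are not reproduced here. The function names and the sample vocabulary are hypothetical.

```python
def bit_signature(text, vocabulary):
    """Encode a document as an integer whose bit i is 1 if vocabulary[i] occurs."""
    words = set(text.lower().split())
    signature = 0
    for position, word in enumerate(vocabulary):
        if word in words:
            signature |= 1 << position
    return signature


def signature_similarity(sig_a, sig_b):
    """Share of set bits the two signatures have in common (Jaccard on bits).

    This measure is an assumption of the sketch, not the paper's comparison.
    """
    union = bin(sig_a | sig_b).count("1")
    if union == 0:
        return 0.0
    return bin(sig_a & sig_b).count("1") / union


# Hypothetical vocabulary; a real system would build it from the document corpus.
vocabulary = ["copy", "paste", "text", "original"]
first = bit_signature("copy text", vocabulary)
second = bit_signature("Copy the original text", vocabulary)
print(signature_similarity(first, second))
```

Storing a signature as a single integer keeps comparison down to a few bitwise operations, which is the usual appeal of signature-based schemes.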

The Sequence Labeling Approach for Text Alignment of Plagiarism Detection

  • Kong, Leilei;Han, Zhongyuan;Qi, Haoliang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.9 / pp.4814-4832 / 2019
  • Plagiarism detection increasingly relies on text alignment. Text alignment involves extracting the plagiarized passages from a pair consisting of a suspicious document and its source document. Heuristic methods have achieved excellent performance in text alignment. However, further improvement of these methods depends mainly on the experience of experts, which limits their capacity for continuous improvement. Machine learning may be a suitable way to address this problem. Considering the positional relations and the context of text segment pairs, we formalize the text alignment task as a sequence labeling problem, improving on current methods at the model level. Specifically, this paper proposes using a probabilistic graphical model to tag the observed sequence of text segment pairs. We therefore present a sequence labeling approach for text alignment in plagiarism detection based on Conditional Random Fields. The proposed approach is evaluated on the PAN@CLEF 2012 artificial high obfuscation plagiarism corpus and the simulated paraphrase plagiarism corpus, and compared with the methods that achieved the best performance in PAN@CLEF 2012, 2013, and 2014. Experimental results demonstrate that the proposed approach significantly outperforms the state-of-the-art methods.

A Design and Implementation of the Source Code Plagiarism Detection System

  • Ahn, Byung-Ryul;Choi, Bae-Young;Kim, Moon-Hyun
    • Proceedings of the Korea Society of Information Technology Applications Conference / 2005.11a / pp.319-323 / 2005
  • As the software industry develops at a rapid pace, anyone can easily copy or plagiarize content that is becoming digitalized. To make matters worse, the growth of content that can be illegally copied and plagiarized is leading to increasing infringement and plagiarism of intellectual property. This paper puts forward a method and theory for effectively detecting plagiarism in the source code of programs written in various languages. It analyzes the advantages and disadvantages of plagiarism testing software and, in particular, presents a method that detects possible plagiarism using pattern matching to overcome those disadvantages. It also introduces a more advanced automatic detection system that overcomes the problems of the pattern matching method.

Development of A Plagiarism Detection System Using Web Search and Morpheme Analysis (인터넷 검색과 형태소분석을 이용한 표절검사시스템의 개발에 관한 연구)

  • Hwang, In-Soo
    • Journal of Information Technology Applications and Management / v.16 no.1 / pp.21-36 / 2009
  • As the World Wide Web (WWW) has become a major channel for information delivery, the data accumulated on the Internet is increasing at an incredible speed, which drives advances in information search technologies. Search engines solve the problem of information overload and help people identify relevant information. However, as search engines have become a powerful tool for finding information, the opportunities for plagiarism in e-Learning have increased significantly. In this paper, we developed an online plagiarism detection system that incorporates the functions of search engines and acts in exactly the same way as a plagiarist. The system uses morpheme analysis to improve performance and sentence-based comparison to check whether a document draws on multiple sources. Applying this system in e-Learning improved the performance of plagiarism detection.

Modern Methods of Text Analysis as an Effective Way to Combat Plagiarism

  • Myronenko, Serhii;Myronenko, Yelyzaveta
    • International Journal of Computer Science & Network Security / v.22 no.8 / pp.242-248 / 2022
  • The article analyzes modern methods for automatically comparing original and unoriginal text to detect textual plagiarism. The study covers two types of plagiarism: literal, in which plagiarists copy the text exactly without changing anything, and intelligent, which uses more sophisticated techniques that are harder to detect because the text is manipulated, for example by replacing words and symbols. Standard techniques for extrinsic detection are string-based, vector space, and semantic-based. The first, most common, and most successful target models for detecting literal plagiarism, N-gram and Vector Space, are analyzed, and their advantages and disadvantages are evaluated. The most effective target models for detecting intelligent plagiarism, particularly those that identify paraphrases by measuring the semantic similarity of short components of the text, are investigated. Models that use neural network architectures and are based on natural language sentence matching approaches, such as the Densely Interactive Inference Network (DIIN), Bilateral Multi-Perspective Matching (BiMPM), and Bidirectional Encoder Representations from Transformers (BERT) and its family of models, are considered. Progress in improving plagiarism detection systems, techniques, and related models is summarized. Relevant and urgent problems that remain unresolved in detecting intelligent plagiarism, namely the effective recognition of unoriginal ideas and of well-paraphrased text, are outlined.
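
The two literal-plagiarism models named in this abstract, N-gram and Vector Space, can be illustrated with the minimal sketch below. It assumes whitespace tokenization, word trigrams compared by Jaccard overlap, and raw term-frequency vectors compared by cosine similarity. These choices and all function names are illustrative assumptions and are not taken from the article.

```python
import math
from collections import Counter


def word_ngrams(text, n=3):
    """Set of word n-grams in a text (lowercased, split on whitespace)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def ngram_jaccard(a, b, n=3):
    """N-gram model: shared n-grams divided by all distinct n-grams."""
    grams_a, grams_b = word_ngrams(a, n), word_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def cosine_similarity(a, b):
    """Vector space model: cosine of the angle between term-frequency vectors."""
    vec_a, vec_b = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(vec_a[w] * vec_b[w] for w in vec_a.keys() & vec_b.keys())
    norm = math.sqrt(sum(c * c for c in vec_a.values())) * math.sqrt(
        sum(c * c for c in vec_b.values())
    )
    return dot / norm if norm else 0.0


source = "the quick brown fox jumps over the lazy dog"
suspect = "the quick brown fox leaps over the lazy dog"
print(ngram_jaccard(source, suspect), cosine_similarity(source, suspect))
```

A single substituted word breaks every n-gram that spans it but changes the term-frequency vector only slightly, so the n-gram score falls much further than the cosine score. This shows in miniature why string-based models handle paraphrase poorly.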

An Exploratory Study on the Current Status of Research Ethics in Higher Education and Its Improvement Methods -With a focus on DEVAC Paper Plagiarism Detection System- (대학교육에서의 연구윤리현황과 개선방안에 관한 탐색적 연구 - DEVAC 과제표절탐색 시스템을 중심으로 -)

  • Park, Su-Hong;Jung, Ju-Young
    • Journal of The Korean Association of Information Education / v.12 no.2 / pp.183-194 / 2008
  • This research was conducted from the perspective of student management and focuses on the realization of research ethics, based on a research ethics case study model. The study explored ways to improve research ethics education by applying the DEVAC System, a paper plagiarism detection system, and surveyed the current status and awareness of research ethics in college education. The primary purpose of the study was to establish a foundation for fostering awareness of research ethics in college education. To this end, first, the actual state of paper plagiarism by college students and their awareness of it were surveyed. Second, research ethics education was examined by applying the DEVAC paper plagiarism detection system. The results were as follows. First, 424 students (65.43%) who participated in the survey had experience of report plagiarism, and 49.3% of those students had plagiarized reports more than three times in a semester. In addition, 34.1% of participants responded positively to the use of a paper plagiarism detection system in college, and creative education (39.0%) was rated highest as an educational method for reinforcing research ethics. Second, the examination of paper plagiarism using the DEVAC system indicated that the system can be an alternative for preventing paper plagiarism by students. The study shows that a foundation is needed, in various respects, to raise students' awareness so that they clearly recognize that plagiarism is a criminal act.

A Survey of Plagiarism Inspection Method for Efficient Protecting of Intellectual Properties and Proposal of Art works Plagiarism Inspection (지적재산권의 효율적 보호를 위한 표절 감정 기법의 고찰 및 예술품의 위작 감정 방법의 제안)

  • 조동욱
    • Proceedings of the Korea Contents Association Conference / 2003.11a / pp.72-78 / 2003
  • This paper surveys technical methods for protecting intellectual property and proposes a method for detecting plagiarism of art works. To this end, it surveys technical methods for inspecting program source code plagiarism and analyzes the types of natural language plagiarism and the existing inspection methods. It also proposes an author verification system and plagiarism detection for ancient literature and art works, because these are important for the management of cultural properties, the protection of authors' intellectual property, and the valuation of owners' property.

Development of a System for Music Plagiarism Detection Using Melody Databases (음악 데이터베이스를 이용한 음악 표절 감지 시스템 개발)

  • Park Jeong il;Kim Sang Wook
    • Journal of Korea Multimedia Society / v.8 no.1 / pp.1-8 / 2005
  • Similar melody searching is an operation that finds melodies similar to a given query melody in a music database. In this paper, we address the development of a system that detects plagiarism based on similar melody searching. We first propose a novel similarity model that supports alignment as well as shifting. We also suggest a method for indexing the features extracted from each melody and a method for processing plagiarism detection using the index. With our plagiarism detection system, composers can easily search music databases for melodies similar to their own. Through performance evaluation via a series of experiments, we show the effectiveness of our approach. The results reveal that our approach is up to around 31 times faster than the sequential-scan-based one.
