• Title/Summary/Keyword: Sentence Plagiarism Detection

Search Result 5, Processing Time 0.015 seconds

A Study on Text Pattern Analysis Applying Discrete Fourier Transform - Focusing on Sentence Plagiarism Detection - (이산 푸리에 변환을 적용한 텍스트 패턴 분석에 관한 연구 - 표절 문장 탐색 중심으로 -)

  • Lee, Jung-Song;Park, Soon-Cheol
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.22 no.2
    • /
    • pp.43-52
    • /
    • 2017
  • Pattern Analysis is One of the Most Important Techniques in the Signal and Image Processing and Text Mining Fields. Discrete Fourier Transform (DFT) is Generally Used to Analyzing the Pattern of Signals and Images. We thought DFT could also be used on the Analysis of Text Patterns. In this Paper, DFT is Firstly Adapted in the World to the Sentence Plagiarism Detection Which Detects if Text Patterns of a Document Exist in Other Documents. We Signalize the Texts Converting Texts to ASCII Codes and Apply the Cross-Correlation Method to Detect the Simple Text Plagiarisms such as Cut-and-paste, term Relocations and etc. WordNet is using to find Similarities to Detect the Plagiarism that uses Synonyms, Translations, Summarizations and etc. The Data set, 2013 Corpus, Provided by PAN Which is the One of Well-known Workshops for Text Plagiarism is used in our Experiments. Our Method are Fourth Ranked Among the Eleven most Outstanding Plagiarism Detection Methods.

Development of A Plagiarism Detection System Using Web Search and Morpheme Analysis (인터넷 검색과 형태소분석을 이용한 표절검사시스템의 개발에 관한 연구)

  • Hwang, In-Soo
    • Journal of Information Technology Applications and Management
    • /
    • v.16 no.1
    • /
    • pp.21-36
    • /
    • 2009
  • As the World Wide Web (WWW) has become a major channel for information delivery, the data accumulated in the Internet increases at an incredible speed, and it derives the advances of information search technologies. It is the search engine that solves the problem of information overloading and helps people to identify relevant information. However, as search engines become a powerful tool for finding information, the opportunities of plagiarizing have increased significantly in e-Learning. In this paper, we developed an online plagiarism detection system for detecting plagiarized documents that incorporates the functions of search engines and acts in exactly the same way of plagiarizing. The plagiarism detection system uses morpheme analysis to improve the performance and sentence-based comparison to investigate document comes from multiple sources. As a result of applying this system in e-Learning, the performance of plagiarism detection was improved.

  • PDF

Modern Methods of Text Analysis as an Effective Way to Combat Plagiarism

  • Myronenko, Serhii;Myronenko, Yelyzaveta
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.8
    • /
    • pp.242-248
    • /
    • 2022
  • The article presents the analysis of modern methods of automatic comparison of original and unoriginal text to detect textual plagiarism. The study covers two types of plagiarism - literal, when plagiarists directly make exact copying of the text without changing anything, and intelligent, using more sophisticated techniques, which are harder to detect due to the text manipulation, like words and signs replacement. Standard techniques related to extrinsic detection are string-based, vector space and semantic-based. The first, most common and most successful target models for detecting literal plagiarism - N-gram and Vector Space are analyzed, and their advantages and disadvantages are evaluated. The most effective target models that allow detecting intelligent plagiarism, particularly identifying paraphrases by measuring the semantic similarity of short components of the text, are investigated. Models using neural network architecture and based on natural language sentence matching approaches such as Densely Interactive Inference Network (DIIN), Bilateral Multi-Perspective Matching (BiMPM) and Bidirectional Encoder Representations from Transformers (BERT) and its family of models are considered. The progress in improving plagiarism detection systems, techniques and related models is summarized. Relevant and urgent problems that remain unresolved in detecting intelligent plagiarism - effective recognition of unoriginal ideas and qualitatively paraphrased text - are outlined.

A Detection Method of Similar Sentences Considering Plagiarism Patterns of Korean Sentence (한국어 문장 표절 유형을 고려한 유사 문장 판별)

  • Ji, Hye-Sung;Joh, Joon-Hee;Lim, Heui-Seok
    • The Journal of Korean Association of Computer Education
    • /
    • v.13 no.6
    • /
    • pp.79-89
    • /
    • 2010
  • In this paper, we proposed a method to find out similar sentences from documents to detect plagiarized documents. The proposed model adapts LSA and N-gram techniques to detect every type of Korean plagiarized sentence type. To evaluate the performance of the model, we constructed experimental data using students' essays on the same theme. Students made their essay by intentionally plagiarizing some reference documents. The experimental results showed that our proposed model outperforms the conventional N-gram model, Vector model, LSA model in precision, recall, and F measures.

  • PDF

Development of Automatic Reference-Citation-Mark Attachment Support System (참고문헌 인용부호 자동부착 지원 시스템 개발)

  • Song, Kwangho;Min, Jihong;Kim, Yoo-sung
    • KIISE Transactions on Computing Practices
    • /
    • v.21 no.10
    • /
    • pp.623-630
    • /
    • 2015
  • In this paper, the design and implementation of an automatic reference-citation-mark attachment system are introduced. The system automatically attaches a citation mark to the end of a sentence in a technical document if the corresponding statement has a high similarity to another statement in the same document; simultaneously, the corresponding bibliographic data is automatically created from the cited-document information. In accordance with functional specifications, a Web-based, online service model and the development of its prototype system are proposed. The developed system can help in the elimination of unexpected plagiarism issues, and will alleviate the burdens of reference citation and reference-list creation for technical writers.