http://dx.doi.org/10.33778/kcsa.2019.19.4.133

Modified File Title Normalization Techniques for Copyright Protection  

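Hwang, Chan Woong (Department of Information Security, Hoseo University)
Ha, Ji Hee (Department of Information Security, Hoseo University)
Lee, Tea Jin (Division of Computer and Information Engineering, Hoseo University)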
Abstract
Torrent sites, P2P sites, and web hard (online storage) services are widely used because content can be downloaded easily, for free or at low cost. As a result, domestic torrent, P2P, and web hard services are very sensitive to copyright issues, and copyright protection techniques have been researched and applied to them. Among these, filtering techniques based on title and string comparison, which block enumerated cases such as file titles or combinations of keywords, are easy to bypass by changing the title or its spacing. To detect and block illegal works for copyright protection, a technique for normalizing modified file titles is therefore essential. In this paper, we compare the detection rate of searches before and after normalizing the modified file titles of illegal works. The detection rate was only 77.72% before normalization and rose to 90.23% after normalization. In the future, better handling of meaningless terms, such as common date and quality markers, is expected to yield better results.
Keywords
P2P Sites; Banned Word; Filtering; File Title Normalization; Detection;
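The before-and-after comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the banned-word list, the sample titles, and the function names `normalize_title`, `is_detected`, and `detection_rate` are all hypothetical, and the normalization shown (Unicode NFKC folding, case folding, and removal of every non-alphanumeric character) is only one plausible way to undo changes to a title and its spacing.

```python
import re
import unicodedata

# Hypothetical banned-word list; the paper's actual list is not given in the abstract.
BANNED_WORDS = ["sample movie"]


def normalize_title(title: str) -> str:
    """Fold width and case variants, then drop everything except letters and digits."""
    folded = unicodedata.normalize("NFKC", title).casefold()
    # \W matches non-word characters; "_" counts as a word character, so strip it too.
    return re.sub(r"[\W_]+", "", folded)


def is_detected(title: str, normalize: bool) -> bool:
    """Return True if any banned word occurs in the (optionally normalized) title."""
    if normalize:
        haystack = normalize_title(title)
        needles = [normalize_title(word) for word in BANNED_WORDS]
    else:
        haystack = title
        needles = BANNED_WORDS
    # Skip empty needles, which would otherwise match every title.
    return any(needle and needle in haystack for needle in needles)


def detection_rate(titles: list, normalize: bool) -> float:
    """Percentage of titles in which at least one banned word is found."""
    hits = sum(is_detected(title, normalize) for title in titles)
    return 100.0 * hits / len(titles)


# Hypothetical modified titles: separators, case changes, and full-width characters.
titles = [
    "sample movie 1080p.mkv",
    "s.a.m.p.l.e m-o-v-i-e 1080p.mkv",
    "SAMPLE_MOVIE.2019.avi",
    "ｓａｍｐｌｅ　ｍｏｖｉｅ.mp4",
]

print(detection_rate(titles, normalize=False))  # 25.0
print(detection_rate(titles, normalize=True))   # 100.0
```

Stripping every separator is deliberately aggressive and can produce false positives when a banned word appears by accident across word boundaries. The sketch also leaves date and quality markers such as `2019` and `1080p` in place, which is the kind of term the abstract identifies as future work.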
Citations & Related Records
Times Cited By KSCI: 2