http://dx.doi.org/10.3745/KTCCS.2020.9.10.231

CNN-Based Novelty Detection with Effectively Incorporating Document-Level Information  

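Jo, Seongung (School of Computer Engineering, Korea University of Technology and Education)
Oh, Heung-Seon (Department of Computer Engineering, Korea University of Technology and Education)
Im, Sanghun (School of Computer Engineering, Korea University of Technology and Education)
Kim, Seonho (Korea Institute of Science and Technology Information)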
Publication Information
KIPS Transactions on Computer and Communication Systems / v.9, no.10, 2020, pp. 231-238
Abstract
With the large number of documents appearing on the web, document-level novelty detection has become important because it reduces the effort of finding novel documents by discarding those that repeat information already seen. A recent work proposed a convolutional neural network (CNN)-based novelty detection model that achieved significant performance improvements. We observed that this model is restricted in its use of document-level information when determining novelty, while we assume that document-level information is more important. As a solution, this paper proposes two methods for effectively incorporating document-level information into a CNN-based novelty detection model. Our methods focus on constructing the feature vector of the target document to be classified by extracting relative information between the target document and the source documents given as evidence. A series of experiments on a standard benchmark collection, TAP-DLND 1.0, showed the superiority of our methods.
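The abstract does not spell out how the relative information is computed, so the following is only a minimal sketch of one common way to build such a feature vector. It assumes that each document is already encoded as a fixed-length vector, that the nearest source document is chosen by cosine similarity, and that the features are the concatenation of both vectors, their absolute difference, and their element-wise product. The function names `cosine` and `relative_features` are hypothetical, and none of this should be read as the paper's actual method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def relative_features(target, sources):
    """Hypothetical helper: features of a target document relative to its nearest source.

    Assumption (not from the paper): the result is the concatenation
    [t, s, |t - s|, t * s], where s is the source vector most similar to t.
    """
    nearest = max(sources, key=lambda s: cosine(target, s))
    diff = [abs(a - b) for a, b in zip(target, nearest)]
    prod = [a * b for a, b in zip(target, nearest)]
    return target + nearest + diff + prod


# Toy vectors standing in for document encodings (illustrative values only).
target = [1.0, 0.0, 2.0]
sources = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
features = relative_features(target, sources)
```

In a pipeline of this kind, a vector such as `features` would be passed to a classifier that labels the target document as novel or non-novel.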
Keywords
Deep Learning; CNN; Novelty Detection