http://dx.doi.org/10.5762/KAIS.2015.16.8.5565

Recognition of Korean Implicit Citation Sentences Using Machine Learning with Lexical Features  

Kang, In-Su (Computer Science and Engineering, Kyungsung University)
Publication Information
Journal of the Korea Academia-Industrial cooperation Society / v.16, no.8, 2015, pp. 5565-5570
Abstract
Implicit citation sentence recognition aims to locate, in the full text of articles, citation sentences that lack explicit citation markers. State-of-the-art approaches exploit features such as word n-grams, clue words, researchers' surnames, mentions of previous methods, and distance to the nearest explicit citation sentence, reaching a performance of over 50%. However, most previous work has been conducted on English. For Korean, a rule-based method using positive/negative clue patterns was reported to attain a performance of 42%, leaving room for improvement. This study applies machine learning with Korean lexical features to recognize implicit citation sentences in the full text of Korean literature. Different lexical feature units such as Eojeol (space-delimited word unit), morpheme, and Eumjeol (syllable) were evaluated to determine suitable lexical features for Korean implicit citation sentence recognition. In addition, the lexical features were combined with position features representing backward and forward proximity to explicit citation sentences, improving performance to over 50%.
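The feature design described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names, the feature-name strings, the bigram size, and the `max_dist` cap are all hypothetical choices made for illustration, and the toy sentences are invented. Morpheme features are omitted because they require a Korean morphological analyzer, which the abstract does not name.

```python
import unicodedata
from typing import Dict, List, Optional


def eojeol_units(sentence: str) -> List[str]:
    """Eojeol units: whitespace-delimited tokens of the sentence."""
    return sentence.split()


def eumjeol_units(sentence: str, n: int = 2) -> List[str]:
    """Eumjeol (syllable) n-grams taken inside each Eojeol.

    Assumes NFC text, where one Hangul syllable is one code point.
    Eojeols shorter than n are kept whole.
    """
    text = unicodedata.normalize("NFC", sentence)
    grams: List[str] = []
    for word in text.split():
        if len(word) < n:
            grams.append(word)
        else:
            grams.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return grams


def proximity(is_explicit: List[bool], idx: int, step: int) -> Optional[int]:
    """Distance in sentences from idx to the nearest explicit citation
    sentence, scanning backward (step=-1) or forward (step=+1).
    Returns None when no explicit citation sentence lies in that direction.
    """
    j = idx + step
    while 0 <= j < len(is_explicit):
        if is_explicit[j]:
            return abs(j - idx)
        j += step
    return None


def sentence_features(sentences: List[str], is_explicit: List[bool],
                      idx: int, max_dist: int = 3) -> Dict[str, float]:
    """Combine lexical and position features for sentence idx.

    max_dist is a hypothetical cutoff: farther explicit citations are ignored.
    """
    feats: Dict[str, float] = {}
    for unit in eojeol_units(sentences[idx]):
        feats["eojeol=" + unit] = 1.0
    for gram in eumjeol_units(sentences[idx]):
        feats["eumjeol=" + gram] = 1.0
    for name, step in (("back", -1), ("fwd", 1)):
        dist = proximity(is_explicit, idx, step)
        if dist is not None and dist <= max_dist:
            feats[f"{name}_dist={dist}"] = 1.0
    return feats


# Toy two-sentence document (invented): the first sentence carries an
# explicit citation marker, the second is a candidate implicit citation.
sentences = ["Kim(2012)은 규칙 기반 방법을 제안하였다.", "이 방법은 단서 패턴을 사용한다."]
is_explicit = [True, False]
features = sentence_features(sentences, is_explicit, idx=1)
```

The resulting sparse dictionary could be passed to any classifier that accepts binary features; restricting the dictionary to one unit type at a time is one way to compare Eojeol and Eumjeol features as the abstract describes.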
Keywords
Citation Sentence; Korean Lexical Feature; Machine Learning
Citations & Related Records
Times Cited By KSCI: 1
1 H. Nanba, N. Kando, M. Okumura, "Classification of research papers using citation links and citation types: Towards automatic review article generation", Proc. of the 11th ASIS SIG/CR Classification Research Workshop, pp.117-134, 2000.
2 A. Ritchie, S. Robertson, S. Teufel, "Comparing citation contexts for information retrieval", Proc. of the 17th ACM Conference on Information and Knowledge Management, pp.213-222, 2008. DOI: http://dx.doi.org/10.1145/1458082.1458113
3 D. Kaplan, R. Iida, T. Tokunaga, "Automatic extraction of citation contexts for research paper summarization: a coreference-chain based approach", Proc. of the 2009 Workshop on Text and Citation Analysis for Scholarly Digital Libraries, pp.88-95, 2009. DOI: http://dx.doi.org/10.3115/1699750.1699764
4 P. Sondhi, C. Zhai, "A constrained hidden Markov model approach for non-explicit citation context extraction", Proc. of the 2014 SIAM International Conference on Data Mining, pp.361-369, 2014. DOI: http://dx.doi.org/10.1137/1.9781611973440.41
5 I. Kang, "A rule-based approach to identifying citation text from Korean academic literature", Journal of the Korean Society for Information Management, 29(4), pp.43-60, 2012. DOI: http://dx.doi.org/10.3743/kosim.2012.29.4.043
6 V. Qazvinian, D. R. Radev, "Identifying non-explicit citing sentences for citation-based summarization", Proc. of the 48th Annual Meeting of the Association for Computational Linguistics, pp.555-564, 2010.
7 A. Athar, S. Teufel, "Detection of implicit citations for sentiment detection", Proc. of ACL-12 Workshop on Discovering Structure in Scholarly Discourse, pp.18-26, 2012.
8 A. Abu-Jbara, J. Ezra, D. R. Radev, "Purpose and polarity of citation: towards NLP-based bibliometrics", Proc. of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp.596-606, 2013.
9 C.-C. Chang, C.-J. Lin, "LIBSVM: a library for support vector machines", ACM Transactions on Intelligent Systems and Technology, 2(3):27:1-27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm DOI: http://dx.doi.org/10.1145/1961189.1961199