http://dx.doi.org/10.3745/KIPSTD.2011.18D.5.317

A Text-based Similarity Measure for Scientific Literature  

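Yoon, Seok-Ho (Department of Electronics, Computer and Communication Engineering, Hanyang University)
Kim, Sang-Wook (Division of Information and Communications, College of Information and Communications, Hanyang University)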
Abstract
This paper addresses the computation of similarity between papers using text-based measures. We first analyze the accuracy of similarities computed from different parts of a paper, and then propose Keyword-Extension, a method that is particularly useful when a paper's text information is incomplete. A series of experiments verifies the effectiveness of Keyword-Extension.
Keywords
Text-based Similarity Measure; Keyword-Extension; Scientific Literature
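
The abstract does not say which text-based measure the authors use. As an illustration only, the sketch below computes cosine similarity over TF-IDF vectors, a standard choice in information retrieval. The whitespace tokenizer, the smoothed IDF formula, and all function names are assumptions made for this example; it does not implement Keyword-Extension, which the abstract does not describe in detail.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption): ln((1 + n) / (1 + df)) + 1, always positive.
        vectors.append(
            {t: tf[t] * (math.log((1 + n) / (1 + df[t])) + 1.0) for t in tf}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Each string stands in for one part of a paper (title, abstract, or keywords).
papers = [
    "text based similarity measure for scientific literature",
    "link based similarity measure for scientific literature",
    "efficient data clustering method for very large databases",
]
vecs = tfidf_vectors(papers)
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

Passing only titles, only abstracts, or only keywords as the input strings is one way to compare how much each part of a paper contributes to the similarity score.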