http://dx.doi.org/10.3745/JIPS.2009.5.3.159

Automatic In-Text Keyword Tagging based on Information Retrieval  

Kim, Jin-Suk (Department of Information Technology Research, Knowledge Information Center, Korea Institute of Science & Technology Information (KISTI))
Jin, Du-Seok (Department of Information Technology Research, Knowledge Information Center, Korea Institute of Science & Technology Information (KISTI))
Kim, Kwang-Young (Department of Information Technology Research, Knowledge Information Center, Korea Institute of Science & Technology Information (KISTI))
Choe, Ho-Seop (Department of Information Technology Research, Knowledge Information Center, Korea Institute of Science & Technology Information (KISTI))
Publication Information
Journal of Information Processing Systems / v.5, no.3, 2009, pp. 159-166
Abstract
As Wikipedia shows, tagging or cross-linking the major keywords in a document collection improves both the readability of documents and responsive, adaptive navigation among related documents. In recent years, the Semantic Web has increased the importance of social tagging as a key feature of Web 2.0, and the tag cloud has emerged as its most visible form. In this paper we present an efficient method for automated in-text keyword tagging based on a large-scale controlled term collection, or keyword dictionary. The computational complexity of O(mN) incurred when a pattern matching algorithm is used can be reduced to O(m log N) when an Information Retrieval technique is adopted, where m is the length of the target document and N is the total number of candidate terms to be tagged. The results show that automatic in-text tagging with keywords filtered by Information Retrieval is about 6 to 40 times faster than the fastest pattern matching algorithm.
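The contrast between the two complexities can be illustrated with a minimal sketch. It is not the authors' implementation: a sorted list searched with binary search stands in for the Information Retrieval index, plain token comparison stands in for the Boyer-Moore-Horspool matcher, and every name in it (`tokenize`, `build_index`, `tag_naive`, `tag_indexed`) is hypothetical. The baseline tests every dictionary term at every position in the document, while the indexed version looks up each document token among the N terms in O(log N) and verifies only the few candidates that begin with that token.

```python
import bisect
import re

TOKEN = re.compile(r"\w+")

def tokenize(text):
    """Return (lowercased word, start offset, end offset) for each word."""
    return [(m.group().lower(), m.start(), m.end()) for m in TOKEN.finditer(text)]

def words_of(term):
    """Lowercased word tuple of a dictionary term."""
    return tuple(word for word, _, _ in tokenize(term))

def tag_naive(document, terms):
    """Baseline: try every term at every position, roughly O(m * N)."""
    tokens = tokenize(document)
    spans = []
    for term in terms:
        wanted = words_of(term)
        if not wanted:
            continue
        for i in range(len(tokens) - len(wanted) + 1):
            window = tuple(w for w, _, _ in tokens[i:i + len(wanted)])
            if window == wanted:
                spans.append((tokens[i][1], tokens[i + len(wanted) - 1][2], term))
    return spans

def build_index(terms):
    """Sort terms by their word tuples so the first word can be binary-searched."""
    entries = sorted((words_of(t), t) for t in terms)
    entries = [e for e in entries if e[0]]   # drop terms with no words
    keys = [e[0][0] for e in entries]        # first word of each term, sorted
    return keys, entries

def tag_indexed(document, index):
    """Filtered tagging: one O(log N) lookup per document token."""
    keys, entries = index
    tokens = tokenize(document)
    spans = []
    for i, (word, start, _) in enumerate(tokens):
        lo = bisect.bisect_left(keys, word)
        hi = bisect.bisect_right(keys, word)
        for wanted, term in entries[lo:hi]:  # only terms starting with this word
            window = tuple(w for w, _, _ in tokens[i:i + len(wanted)])
            if window == wanted:
                spans.append((start, tokens[i + len(wanted) - 1][2], term))
    return spans

if __name__ == "__main__":
    doc = "Information retrieval speeds up keyword tagging; pattern matching is slower."
    dictionary = ["information retrieval", "pattern matching", "tag cloud", "keyword"]
    for start, end, term in sorted(tag_indexed(doc, build_index(dictionary))):
        print(start, end, term, "->", doc[start:end])
```

Both functions return the same character spans, which could then be wrapped in links or tags. In this sketch the index is built once over the dictionary and reused for every document, which is where a large keyword dictionary would be expected to pay off.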
Keywords
Automatic In-Text Keyword Tagging; Information Retrieval; Pattern Matching; Boyer-Moore-Horspool Algorithm; Keyword Dictionary; Cross-Referencing; In-Text Content Link