Semantic Document-Retrieval Based on Markov Logic  

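Hwang, Kyu-Baek (School of Computing, Soongsil University)
Bong, Seong-Yong (Department of Computer Science, Soongsil University)
Ku, Hyeon-Seo (Department of Mechanical and Information Engineering, University of Seoul)
Paek, Eun-Ok (Department of Mechanical and Information Engineering, University of Seoul)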
Abstract
A simple approach to semantic document retrieval is to measure document similarity on the bag-of-words representation, e.g., the cosine similarity between two document vectors. However, such a syntactic method largely ignores the semantic similarity between documents and often produces semantically unsound search results. We address this problem by combining supervised machine learning techniques with ontology information within the Markov logic framework. Specifically, Markov logic networks are learned from similarity-tagged documents together with an ontology representing diverse relationships among words. The learned Markov logic networks, the ontology, and the training documents are then applied to the semantic document-retrieval task by inferring the similarity between a query document and each training document. In an experimental evaluation on real-world question-answering data, the proposed method outperformed the simple cosine-similarity approach in terms of retrieval accuracy.
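As a rough illustration of the baseline mentioned in the abstract, the following minimal sketch computes cosine similarity between bag-of-words vectors. The whitespace tokenizer, the function names, and the example sentences are assumptions made for illustration; they are not taken from the paper, and the paper's own preprocessing may differ.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count each term."""
    return Counter(text.lower().split())


def cosine_similarity(doc_a, doc_b):
    """Cosine of the angle between the term-count vectors of two documents."""
    a, b = bag_of_words(doc_a), bag_of_words(doc_b)
    # Only terms present in both documents contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical example documents (not from the paper's data set).
query = "how do i buy a car"
paraphrase = "where can i purchase an automobile"
unrelated = "how do i buy a house"

print(cosine_similarity(query, paraphrase))
print(cosine_similarity(query, unrelated))
```

In this toy case the paraphrase shares only one surface term with the query, so it scores lower than the sentence about a different topic that happens to reuse most of the query's words. This is the kind of semantically unsound ranking the abstract attributes to purely syntactic similarity.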
Keywords
information retrieval; semantic document-retrieval; supervised learning; ontology; Markov logic