An Equation Retrieval System Based on Weighted Sum of Heterogenous Indexing Terms  

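Shin, Jun-Soo (Department of Computer and Communications Engineering, Kangwon National University)
Kim, Hark-Soo (Department of Computer and Communications Engineering, Kangwon National University)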
Abstract
To effectively retrieve mathematical documents that include various equations, math-aware search engines are needed. In this paper, we propose an equation retrieval system that helps users search for structurally similar equations. The proposed system disassembles MathML equations into three types of heterogeneous indexing terms: operators, variables, and partial structures of equations. It then indexes each type of term independently. When a user inputs a MathML equation, the proposed system searches and ranks equations using a weighted sum of three language models, one for each type of indexing term. In experiments with 244,744 MathML equations, the proposed system showed reliable performance (a P@1 of 53% in the closed test and a P@1 of 63% in the open test).
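The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how partial structures are defined, how the language models are smoothed, or what the weights are. The sketch assumes Presentation MathML input, treats a "partial structure" as a parent tag together with its child tags, uses Jelinek-Mercer smoothed unigram models, and combines the three log-likelihoods with hypothetical weights. All function names (`extract_terms`, `log_lm`, `rank`) and parameter values are illustrative.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter

KINDS = ("op", "var", "struct")


def local(tag):
    """Strip an XML namespace prefix such as '{http://...}mi'."""
    return tag.rsplit("}", 1)[-1]


def extract_terms(mathml):
    """Split one MathML string into three bags of indexing terms."""
    root = ET.fromstring(mathml)
    terms = {kind: Counter() for kind in KINDS}
    for node in root.iter():
        name = local(node.tag)
        text = (node.text or "").strip()
        if name == "mo" and text:
            terms["op"][text] += 1      # operator term
        elif name == "mi" and text:
            terms["var"][text] += 1     # variable term
        children = [local(child.tag) for child in node]
        if children:
            # Assumption: a partial structure is a parent tag plus its child tags.
            terms["struct"][name + "(" + ",".join(children) + ")"] += 1
    return terms


def log_lm(query, doc, coll, lam):
    """Query log-likelihood under a Jelinek-Mercer smoothed unigram model."""
    doc_len = sum(doc.values())
    coll_len = sum(coll.values())
    score = 0.0
    for term, qtf in query.items():
        p_doc = doc[term] / doc_len if doc_len else 0.0
        # Add-one smoothing on the collection model keeps unseen terms finite.
        p_coll = (coll[term] + 1) / (coll_len + len(coll) + 1)
        score += qtf * math.log(lam * p_doc + (1 - lam) * p_coll)
    return score


def rank(query_mathml, corpus, weights=(0.3, 0.2, 0.5), lam=0.7):
    """Return corpus indices ordered by the weighted sum of three model scores.

    The weights and lam are hypothetical values, not taken from the paper.
    """
    docs = [extract_terms(m) for m in corpus]
    coll = {k: sum((d[k] for d in docs), Counter()) for k in KINDS}
    query = extract_terms(query_mathml)
    scored = []
    for i, doc in enumerate(docs):
        total = sum(
            w * log_lm(query[k], doc[k], coll[k], lam)
            for w, k in zip(weights, KINDS)
        )
        scored.append((total, i))
    return [i for _, i in sorted(scored, key=lambda pair: -pair[0])]
```

Calling `rank(query, corpus)` with a MathML query string and a list of MathML strings returns the corpus indices from best to worst match. Keeping the three term types in separate bags mirrors the independent indexing described above, so each weight can be tuned without rebuilding the other two models.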
Keywords
Equation retrieval; Heterogenous indexing term; MathML;