
Ranking Quality Evaluation of PageRank Variations  

Pham, Minh-Duc (Dept. of Computer Science, Korea Advanced Institute of Science and Technology)
Heo, Jun-Seok (Dept. of Computer Science, Korea Advanced Institute of Science and Technology)
Lee, Jeong-Hoon (Dept. of Computer Science, Korea Advanced Institute of Science and Technology)
Whang, Kyu-Young (Dept. of Computer Science, Korea Advanced Institute of Science and Technology)
Abstract
The PageRank algorithm is an important component for ranking Web pages in Google and other search engines. Although many improvements to the original PageRank algorithm have been proposed, it is unclear which variations, alone or in combination, provide the "best" ranked results. In this paper, we evaluate the ranking quality of the well-known variations of the original PageRank algorithm and their combinations. To do so, we first classify the variations into link-based approaches, which exploit the link structure of the Web, and knowledge-based approaches, which exploit the semantics of the Web. We then propose algorithms that combine ranking algorithms from these two approaches, and we implement both the variations and their combinations. For our evaluation, we perform extensive experiments on a real data set of one million Web pages. Through these experiments, we identify the algorithms, among both the variations and their combinations, that provide the best ranked results.
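For readers unfamiliar with the baseline that the evaluated variations modify, the following is a minimal sketch of the original PageRank computation by power iteration. It is the standard textbook formulation, not the implementation used in the paper. The function name `pagerank`, the toy graph, the damping factor of 0.85, and the uniform handling of dangling pages are all illustrative assumptions.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=100):
    """Textbook PageRank by power iteration (illustrative sketch).

    links: dict mapping each page to the list of pages it links to.
    damping, tol, max_iter: assumed defaults, not values from the paper.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(max_iter):
        # Rank held by dangling pages (no outlinks) is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n
                    for p in pages}
        # Each page passes its rank to its link targets in equal shares.
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                new_rank[q] += damping * rank[p] / len(outs)
        # Stop once the total change between iterations is negligible.
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical four-page Web graph used only for illustration.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this sketch, a link-based variation would typically change how rank flows along links, while a knowledge-based variation would typically change the uniform `(1.0 - damping) / n` term. How the paper combines the two is not specified in the abstract, so no combination is shown here.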
Keywords
information retrieval; PageRank; combination-based algorithms; search quality; performance evaluation