1 |
M. Henzinger, "Finding Near-Duplicate Web Pages: A Large-Scale Evaluation of Algorithms," In Proc. ACM Int'l. Conf. on Information Retrieval, SIGIR, pp.284-291, 2006.
|
2 |
A. Broder et al., "Min-Wise Independent Permutations," Journal of Computer and System Sciences, vol.60, no.3, pp.630-659, 2000.
|
3 |
A. Broder, "Identifying and Filtering Near-Duplicate Documents," In Proc. Int'l. Symp. on Combinatorial Pattern Matching, CPM, pp.1-10, 2000.
|
4 |
A. Broder, "Identifying and Filtering Near-Duplicate Documents," In Proc. Int'l. Symp. on Combinatorial Pattern Matching, CPM, pp.1-10, 2000.
|
5 |
N. Beckmann et al., "The R*-tree: An Efficient and Robust Access Method for Points and Rectangles," In Proc. ACM Int'l. Conf. on Management of Data, SIGMOD, pp.322-331, 1990.
|
6 |
M. Rabin, "Fingerprinting by Random Polynomials," Technical Report TR-15-81, Harvard University, 1981.
|
7 |
A. Broder et al., "Syntactic Clustering of the Web," In Proc. Int'l. World Wide Web Conference, WWW, pp.391-404, 1997.
|
8 |
SK Communications, http://www.egloos.com.
|
9 |
J. W. Kim, K. S. Candan, and J. Tatemura, "Efficient Overlap and Content Reuse Detection in Blogs and Online News Articles," In Proc. Int'l. World Wide Web Conference, WWW, pp.81-90, 2009.
|