1 |
Konstantin Shvachko, Hairong Kuang, Sanjay Radia, Robert Chansler "The hadoop distributed file system," Mass Storage Systems and Technologies (MSST), pp.1-10, 2010.
|
2 |
Jeffrey Dean and Sanjay Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, Vol.51, Issue.1, pp.107-113, 2010.
DOI
|
3 |
Surajit Chaudhuri, Venkatesh Ganti, and Raghav Kaushik, "A primitive operator for similarity joins in data cleaning," Data Engineering, p.5, 2006.
|
4 |
A. Metwally, D. Agrawal, and A. El Abbadi, "DETECTIVES: DETEcting Coalition hiT Inflation attacks in adVertising nEtworks Streams," Proceedings of the 16th WWW International Conference on World Wide Web, pp.241-250, 2007.
|
5 |
A. Z. Broder, S. C. Glassman, M. S. Manasse, and G. Zweig, "Syntactic clustering of the web," Computer Networks, pp.1157-1166, 1997.
|
6 |
T. C. Hoad and J. Zobel, "Methods for identifying versioned and plagiarized documents," JASIST, Vol.54, Issue.3, pp.203-215, 2003.
DOI
|
7 |
Yasin N. Silva and Jason M. Reed, "Exploiting MapReduce-based similarity joins," Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, pp.693-696, 2012.
|
8 |
Ahmed Metwally and Christos Faloutsos, "V-smart-join: A scalable mapreduce framework for all-pair similarity joins of multisets and vectors," Proceedings of the VLDB Endowment, Vol.5, No.8, pp.704-715, 2012.
DOI
|
9 |
Alper Okcan and Mirek Riedewald, "Processing theta-joins using MapReduce," Proceedings of the 2011 ACM SIGMOD International Conference on Management of data ACM, pp.949-960, 2011.
|
10 |
http://chorochronos.datastories.org/
|