http://dx.doi.org/10.5626/JCSE.2013.7.3.147

Deep Web and MapReduce  

Tao, Yufei (Division of Web Science and Technology, Korea Advanced Institute of Science and Technology)
Publication Information
Journal of Computing Science and Engineering / v.7, no.3, 2013, pp. 147-158
Abstract
This invited paper introduces results on Web science and technology obtained during work with the Korea Advanced Institute of Science and Technology. In the first part, we discuss algorithms for exploring the deep Web, that is, the collection of Web pages that cannot be reached by conventional Web crawlers. In the second part, we discuss sorting algorithms on the MapReduce system, which has become a dominant paradigm for massively parallel computing.
Keywords
Web; Big data; MapReduce; Parallel computing; Algorithm; Theory
References
1 C. Sheng, N. Zhang, Y. Tao, and X. Jin, "Optimal algorithms for crawling a hidden database in the Web," Proceedings of the VLDB Endowment, vol. 5, no. 11, pp. 1112-1123, 2012.
2 J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," in Proceedings of the 6th Symposium on Operating Systems Design & Implementation, San Francisco, CA, 2004, pp. 137-150.
3 Y. Kwon, M. Balazinska, B. Howe, and J. Rolia, "SkewTune: mitigating skew in MapReduce applications," in Proceedings of the ACM SIGMOD International Conference on Management of Data, Scottsdale, AZ, 2012, pp. 25-36.
4 O. O'Malley, "Terabyte sort on Apache Hadoop," Yahoo, Sunnyvale, CA, Technical report, 2008.
5 Y. Tao, W. Lin, and X. Xiao, "Minimal MapReduce algorithms," in Proceedings of the ACM SIGMOD International Conference on Management of Data, New York, NY, 2013, pp. 529-540.
6 R. Vernica, A. Balmin, K. S. Beyer, and V. Ercegovac, "Adaptive MapReduce using situation-aware mappers," in Proceedings of the 15th International Conference on Extending Database Technology, Berlin, Germany, 2012, pp. 420-431.