Performance Impact of Large File Transfer on Web Proxy Caching: A Case Study in a High Bandwidth Campus Network Environment

  • Kim, Hyun-Chul (School of Computer Science and Engineering, Seoul National University)
  • Lee, Dong-Man (Department of Computer Science, Korea Advanced Institute of Science and Technology)
  • Chon, Kil-Nam (Graduate School of Media and Governance, Keio University)
  • Jang, Beak-Cheol (Department of Computer Science, North Carolina State University)
  • Kwon, Tae-Kyoung (School of Computer Science and Engineering, Seoul National University)
  • Choi, Yang-Hee (School of Computer Science and Engineering, Seoul National University)
  • Published: 2010.02.28

Abstract

Because large objects consume substantial resources, web proxy caching faces a fundamental trade-off between performance (i.e., hit ratio and latency) and overhead (i.e., resource usage) when caching and relaying large objects to users. This paper investigates how, and to what extent, the current dedicated-server-based web proxy caching scheme is affected by large file transfers in a high-bandwidth campus network environment. We use a series of trace-based performance analyses and profile various resource components of our experimental Squid proxy cache server. Large file transfers often overwhelm our cache server, saturating its network bandwidth and creating a bottleneck in the web network. Because of requests for large objects, the response times for concurrently requested small objects increase by a factor as high as a few million in the worst cases. We argue that this cache bandwidth bottleneck stems from a fundamental limitation of the current centralized web proxy caching model, which scales poorly when only a limited amount of dedicated resources is available. This is a serious threat to the viability of the current web proxy caching model, particularly in a high-bandwidth access network, since it leads to sporadic disconnections of the downstream access network from the global web network. We propose a peer-to-peer cooperative web caching scheme to address the cache bandwidth bottleneck problem and show that it caches and delivers large objects efficiently and cost-effectively, without generating significant overhead for participating peers.
