
Comparison of Sampling Techniques for Passive Internet Measurement: An Inspection using An Empirical Study  

Kim, Jung-Hyun (Dept. of Electronics and Computer Engineering, Hanyang University)
Won, You-Jip (Dept. of Electronics and Computer Engineering, Hanyang University)
Ahn, Soo-Han (Dept. of Statistics, University of Seoul)
Abstract
The Internet is now part of everyday life, which makes characterizing Internet traffic an important research topic. However, Internet traffic is difficult to handle because captured traces are usually very large, and this is a serious obstacle to traffic analysis. Many researchers therefore use sampling techniques to reduce the volume of traffic data. In this paper, we compare several well-known sampling techniques and propose an efficient sampling scheme. We examine systematic sampling, simple random sampling, and stratified sampling at sampling intensities of 1/10, 1/100, and 1/1000. Our observations focus on traffic volume, entropy analysis, and packet size analysis. Both simple random sampling and count-based systematic sampling are suitable for the general case, whereas time-based systematic sampling yields relatively poor results. Stratified sampling on transport layer protocols (e.g., TCP and UDP) shows superior results. Our analysis suggests that efficient sampling techniques satisfactorily preserve the variation of the traffic stream over time. Entropy analysis is robust to the various sampling techniques and is well suited to detecting anomalous traffic. We found that a drop in traffic volume caused by a bottleneck can lead to incorrect results in entropy analysis. We also found that the packet size distribution tolerates all of the packet sampling techniques and intensities examined.
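The sampling schemes and the entropy measure named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the packet representation (a dict with hypothetical `"ts"` and `"proto"` keys), and the exact definition of time-based sampling (first packet in each fixed-length time slot) are assumptions made only for illustration. The parameter `n` stands for the inverse of the sampling intensity, e.g. `n = 10` for 1/10.

```python
import math
import random
from collections import Counter, defaultdict


def systematic_count(packets, n):
    """Count-based systematic sampling: keep every n-th packet."""
    return packets[::n]


def systematic_time(packets, interval):
    """Time-based systematic sampling (assumed variant): keep the first
    packet observed in each time slot of length `interval` seconds.
    Packets are assumed to be sorted by their "ts" timestamp."""
    sampled, last_slot = [], None
    for pkt in packets:
        slot = int(pkt["ts"] // interval)
        if slot != last_slot:
            sampled.append(pkt)
            last_slot = slot
    return sampled


def simple_random(packets, n, rng):
    """Simple random sampling: draw len(packets) // n packets
    uniformly at random, without replacement."""
    return rng.sample(packets, len(packets) // n)


def stratified_by_protocol(packets, n, rng):
    """Stratified sampling: group packets by transport protocol, then
    draw a simple random sample of about 1/n from each group."""
    groups = defaultdict(list)
    for pkt in packets:
        groups[pkt["proto"]].append(pkt)
    sampled = []
    for members in groups.values():
        k = max(1, len(members) // n)  # keep at least one packet per stratum
        sampled.extend(rng.sample(members, k))
    return sampled


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())
```

A typical use under these assumptions would be to compute `entropy` over a header field (for example the protocol or a port number) for both the full trace and each sampled trace, and compare the two values across sampling intensities.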
Keywords
Internet Measurement; Sampling Technique; Entropy Analysis; Anomaly Detection