Wireless Internet Service Classification using Data Mining

데이터 마이닝을 이용한 무선 인터넷 서비스 분류기법

  • Published : 2009.06.15

Abstract

Accurately classifying the different services that run on various wireless networks and numerous platforms is a challenging task for service operators. This work focuses on the design and implementation of a classifier that accurately classifies applications in traffic captured from a WiBro network. Instead of the commonly used flow, the notion of a session is introduced as the unit of classification. Based on the session information of the given traffic, two classification algorithms are presented: Classification and Regression Tree (CART) and Support Vector Machine (SVM). Both algorithms classify accurately and effectively, with misclassification rates of 0.85% and 0.94%, respectively. This work shows that the CART-based classifier is easier to implement and that its results are easier to interpret.

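Today, numerous applications run over wireless networks built on a variety of platforms, and classifying them accurately is important for service operators. This study aims to classify various applications in traffic data generated arbitrarily on a commercial WiBro network. In developing the classifier, experiments were conducted with the session as the unit of analysis instead of the conventional flow-based approach. Two classification techniques were applied to this unit: Classification and Regression Tree (CART) and Support Vector Machine (SVM). When classification was attempted on the generated variables, both classifiers performed well, with an error of 0.85% for CART and 0.94% for SVM. However, building the classification system with CART was shown to be advantageous, because the classifier is easier to implement and its results are easier to interpret.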
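
To make the idea of a tree-based classifier over session features concrete, the following is a minimal sketch of the splitting step that CART is built on: choosing the single feature threshold that minimizes weighted Gini impurity, then scoring the result by misclassification rate. It is not the authors' implementation. The session features (mean packet size, session duration), the application labels, and all function names are hypothetical and chosen only for illustration; a full CART would apply this split recursively and prune the tree.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best binary split."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({row[f] for row in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            t = (low + high) / 2
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best


def fit_stump(rows, labels):
    """Fit a one-level tree: the best split plus the majority label on each side."""
    f, t, _ = best_split(rows, labels)
    left = [y for row, y in zip(rows, labels) if row[f] <= t]
    right = [y for row, y in zip(rows, labels) if row[f] > t]
    left_label = Counter(left).most_common(1)[0][0]
    right_label = Counter(right).most_common(1)[0][0]
    return f, t, left_label, right_label


def predict(stump, row):
    """Classify one session with a fitted stump."""
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label


def misclassification_rate(y_true, y_pred):
    """Fraction of samples whose predicted label differs from the true label."""
    wrong = sum(1 for a, b in zip(y_true, y_pred) if a != b)
    return wrong / len(y_true)


# Hypothetical per-session features: (mean packet size in bytes, duration in seconds).
sessions = [(1200, 30.0), (1100, 45.0), (1300, 60.0),
            (200, 5.0), (150, 40.0), (180, 50.0)]
apps = ["streaming", "streaming", "streaming",
        "messenger", "messenger", "messenger"]

stump = fit_stump(sessions, apps)
predictions = [predict(stump, s) for s in sessions]
train_error = misclassification_rate(apps, predictions)
```

On this toy data the split falls on the first feature, which separates the two hypothetical applications cleanly. The fitted rule is a single readable threshold, which illustrates the kind of interpretability the abstract attributes to CART.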
