
Traffic Classification Using Machine Learning Algorithms in Practical Network Monitoring Environments  

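Jung, Kwang-Bon (Department of Computer Science and Engineering, Pohang University of Science and Technology)
Choi, Mi-Jung (Department of Computer Science and Engineering, Pohang University of Science and Technology)
Kim, Myung-Sup (Department of Computer and Information Science, Korea University)
Won, Young-J. (Department of Computer Science and Engineering, Pohang University of Science and Technology)
Hong, James W. (Department of Computer Science and Engineering, Pohang University of Science and Technology)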
Abstract
Traffic classification methodology is shifting from payload-based and port-based techniques to machine learning (ML) based techniques in order to cope with the dynamically changing characteristics of applications. However, current research on ML-based traffic classification is conducted in offline environments. Specifically, most existing work reports classification results using cross validation as the test method, and reports those results in terms of traffic flows. Such results are not useful for practical network traffic monitoring environments. This paper compares classification results obtained with cross validation against those obtained with split validation as the test method. It also compares flow-based classification results with byte-based ones. We classify network traffic using various feature sets and ML algorithms such as J48, REPTree, RBFNetwork, multilayer perceptron, BayesNet, and NaiveBayes. From these experiments, we identify the best feature sets and the best ML algorithm for classifying traffic under split validation.
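To illustrate why flow-based and byte-based results can diverge, the following is a minimal Python sketch and not the authors' implementation. The function names, the `Flow` tuple layout, and the sample flows are all hypothetical; the numbers are invented for illustration and are not results from the paper.

```python
from typing import List, Tuple

# Hypothetical record layout: (true_label, predicted_label, byte_count).
Flow = Tuple[str, str, int]


def flow_accuracy(flows: List[Flow]) -> float:
    """Fraction of flows whose predicted label matches the true label."""
    correct = sum(1 for true, pred, _ in flows if true == pred)
    return correct / len(flows)


def byte_accuracy(flows: List[Flow]) -> float:
    """Fraction of all bytes that belong to correctly classified flows."""
    total = sum(size for _, _, size in flows)
    correct = sum(size for true, pred, size in flows if true == pred)
    return correct / total


# Invented sample: three small flows classified correctly and one large
# flow misclassified.
flows = [
    ("web", "web", 1_000),
    ("web", "web", 2_000),
    ("dns", "dns", 500),
    ("p2p", "web", 96_500),
]

print(flow_accuracy(flows))  # 0.75
print(byte_accuracy(flows))  # 0.035
```

In this invented sample, three of four flows are labeled correctly, yet the single misclassified flow carries most of the bytes, so the byte-based figure is far lower than the flow-based one.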
Keywords
Traffic classification; algorithm