http://dx.doi.org/10.9708/jksci.2014.19.3.045

P2P Traffic Classification using Advanced Heuristic Rules and Analysis of Decision Tree Algorithms  

Ye, Wujian (Dept. of Computer, Dankook University)
Cho, Kyungsan (Dept. of Software Science, Dankook University)
Abstract
In this paper, an improved two-step P2P traffic classification scheme is proposed to overcome the limitations of existing methods. The first step is a signature-based classifier at the packet level. The second step consists of pattern heuristic rules and a statistics-based classifier at the flow level. The pattern heuristic rules improve accuracy and reduce the amount of traffic that the statistics-based classifier has to classify. Based on an analysis of different decision tree algorithms, the statistics-based classifier is implemented with REPTree. In addition, an ensemble algorithm is used to improve the performance of the statistics-based classifier. Verification with real datasets shows that our hybrid scheme provides higher accuracy and lower overhead than other existing schemes.
Keywords
P2P traffic classification; hybrid scheme; signature-based; heuristic rules; decision tree;
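
The two-step flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not list the signatures, heuristic rules, or flow features, so all of these are hypothetical: the two payload prefixes, the "endpoint already seen in a signature-identified P2P flow" rule, the Flow fields, and the thresholds. A majority vote over simple threshold rules stands in for the REPTree-based ensemble.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)

# Illustrative payload prefixes only; the paper's signature set is not given here.
SIGNATURES = {
    b"\x13BitTorrent protocol": "P2P",
    b"GNUTELLA CONNECT/": "P2P",
}


@dataclass
class Flow:
    """Hypothetical flow record; field names are assumptions."""
    src: Endpoint
    dst: Endpoint
    first_payload: bytes
    mean_pkt_size: float
    duration_s: float


def match_signature(payload: bytes) -> Optional[str]:
    """Step 1: packet-level signature match on the payload prefix."""
    for prefix, label in SIGNATURES.items():
        if payload.startswith(prefix):
            return label
    return None


# Stand-in for the statistics-based classifier: three threshold rules
# with made-up thresholds, combined by majority vote.
RULES: List[Callable[[Flow], bool]] = [
    lambda f: f.mean_pkt_size > 800,
    lambda f: f.duration_s > 30,
    lambda f: f.src[1] > 1024 and f.dst[1] > 1024,
]


def statistical_vote(flow: Flow) -> str:
    votes = sum(1 for rule in RULES if rule(flow))
    return "P2P" if votes * 2 > len(RULES) else "non-P2P"


def classify(flows: List[Flow]) -> List[Tuple[str, str]]:
    """Return (label, stage) per flow, trying the cheapest stage first."""
    known_p2p: Set[Endpoint] = set()  # endpoints confirmed by signatures
    results = []
    for flow in flows:
        label = match_signature(flow.first_payload)
        if label is not None:
            known_p2p.update((flow.src, flow.dst))
            results.append((label, "signature"))
        elif flow.src in known_p2p or flow.dst in known_p2p:
            # Step 2a: hypothetical heuristic rule at the flow level.
            results.append(("P2P", "heuristic"))
        else:
            # Step 2b: only the remaining flows reach the statistical stage.
            results.append((statistical_vote(flow), "statistical"))
    return results
```

Ordering the stages this way reflects the abstract's claim that the heuristic rules reduce the traffic handed to the statistics-based classifier: a flow reaches the statistical stage only when neither a signature nor a heuristic rule has labelled it.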