Ⅰ. Introduction
A typical modern network security environment looks like this: a firewall installed at the network boundary checks network connections. If the port of a connection is not allowed by the network administrator, the firewall blocks connections targeting that port. Inside the network, an Intrusion Detection System (IDS) is often used to detect possible attacks by checking the payloads of network packets. If a packet payload contains a signature matching the IDS rules, an alert is raised or an action is taken.
Web browsing is a common activity within a local network and is normally allowed by network administrators. HTTP is the protocol widely used for web browsing. Hypertext Transfer Protocol Secure (HTTPS) is also used for browsing web pages, but few web pages need HTTPS in comparison with HTTP. Therefore, HTTP is normally allowed by the firewall. The IDS may also check the payload of HTTP traffic to see if there is something abnormal.
This security environment, however, can be easily broken by several techniques. For example, file sharing applications (eMule, eDonkey, etc.) disguise their traffic as HTTP traffic in order to pass the examination of the firewall and IDS. Tunneling is another big threat to this security environment. DNS tunnel[1] and HTTP tunnel[2][3] are two typical tunnel techniques that exploit the characteristics of this security environment. By encapsulating application data into protocols allowed by the security policies, applications using tunnel techniques can work properly even if they are actually forbidden by the security policies.
However, HTTP tunnels can be recognized using statistical methods because regular HTTP traffic has different statistical characteristics from tunneled traffic. Suppose a user is chatting using tunneled instant messaging. The chatting behavior, which consists of sending short messages and receiving short messages, is quite different from web browsing, which consists of sending short messages and receiving long messages.
Since applications behave differently, we propose an HTTP tunnel detection method based on statistical mechanisms. The proposed method is site independent, which means it needs to be trained only once. Once trained, it can be applied to other sites without further training. In addition, the proposed method achieves high accuracy, about 99% in most cases.
The rest of this paper is organized as follows. Related work is discussed in Section 2. In Section 3 we describe the datasets we used. In Section 4 we describe how SVM works and how we set our classification parameters. The detailed classification process is presented in Section 5, and classification results are discussed in Section 6. Finally, we conclude in Section 7.
Ⅱ. Related Work
Historically, port based IP traffic classification has been in use for a very long time. It relies on the port numbers registered with the Internet Assigned Numbers Authority (IANA)[4]. IP traffic is classified into different applications by port number, e.g. port 80 for HTTP. This classification method is simple but quite unreliable, because not every application has a registered port and not every application uses its registered port. For example, peer-to-peer applications may use port 80 to transmit data. Madhukar and Williamson[5] observed that port based classification was unable to identify 30-70% of the Internet traffic they investigated. This method cannot detect HTTP tunnels, since HTTP tunnels use the HTTP port as the transmission port.
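The weakness of port based classification for tunnels can be illustrated with a minimal sketch. The port table and the flow records below are hypothetical and only serve the illustration; a real classifier would load the IANA registry.

```python
# Hypothetical excerpt of a port table (assumption, not the full IANA registry).
WELL_KNOWN_PORTS = {23: "Telnet", 25: "SMTP", 80: "HTTP", 110: "POP3"}


def classify_by_port(dst_port):
    """Return the application registered for dst_port, or 'unknown'."""
    return WELL_KNOWN_PORTS.get(dst_port, "unknown")


# Regular web browsing and SMTP tunneled over HTTP both target port 80,
# so a port based classifier gives them the same label.
flows = [
    {"description": "web browsing", "dst_port": 80},
    {"description": "SMTP inside an HTTP tunnel", "dst_port": 80},
    {"description": "direct SMTP", "dst_port": 25},
]
labels = [classify_by_port(flow["dst_port"]) for flow in flows]
for flow, label in zip(flows, labels):
    print(f"{flow['description']}: {label}")
```

The tunneled SMTP flow is labeled as HTTP, which is exactly the case a port based classifier cannot separate.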
To overcome the flaws of port based traffic classification, payload based IP traffic classification was proposed. It inspects the payload of traffic to see whether the examined payload matches an entry in its own signature database. Many IDSs, such as Bro[6][7] and Snort[8], use this approach. Several research works also used this approach[9][10][11][12]. Since payload based IP traffic classification depends on its signature database, its biggest disadvantage is that the database must be kept up to date; otherwise its classification ability decreases. Another problem is that it must inspect the traffic payload, which intrudes on users' privacy. In some areas, inspecting payload is against the law. The third problem is that it needs high computational power, since it performs deep inspection of a large volume of network traffic. This classification method could detect some HTTP tunnels carrying plain text, such as Telnet, but its ability to detect encrypted HTTP tunnels, such as instant messaging, is doubtful.
Statistical traffic classification is an alternative to payload based traffic classification. This approach relies on statistical properties of network traffic. It is believed that applications differ in network traffic features such as flow duration, packet size, inter-arrival time, etc. By analyzing these features statistically, different applications can be classified. This approach needs a training phase and a classification phase. In the training phase, training samples are fed to the classification algorithm in order to establish the classification model. As soon as the classification model is established, classification can be performed. Paxson[13] noticed the relationship between the class of traffic and its observed statistical properties. Roughan et al.[14] proposed to map different network applications to predetermined QoS traffic classes using the Nearest Neighbors (NN), Linear Discriminant Analysis (LDA) and Quadratic Discriminant Analysis (QDA) algorithms. Erman et al.[15] presented a semisupervised traffic classification method which combines supervised and unsupervised approaches. In 2007 Erman et al.[16] proposed an approach to identify Web and peer-to-peer traffic in the network core using K-means. Their research is mainly about how to distinguish HTTP traffic from peer-to-peer traffic and does not target HTTP tunnels. S. Kaoprakhon and V. Visoottiviseth[17] proposed a combination of signature based and behavior based approaches to distinguish between normal HTTP traffic and audio/video traffic. They obtained good experimental results, but their work does not aim at HTTP tunnels either.
M. Crotti et al.[18] presented a statistical fingerprint approach to distinguish between HTTP and HTTP tunnels. However, their approach faces the problem of site dependence: if the method is moved to another place, it must be trained again in order to adapt to the local environment. Their fingerprint method works as follows.
They extracted packet size, packet order and inter-arrival time as classification features to build a vector of Probability Density Functions (PDFs). These PDFs are called Protocol Fingerprints. Elements of a PDF are $(s_i, \Delta t_i)$ pairs, where $s_i$ is the size of the i-th packet and $\Delta t_i$ represents the inter-arrival time between the i-th packet and the preceding one.
During the training phase, these PDFs are established. In the classification phase, they computed the probability of each packet in a session falling in the popular area of the fingerprints. If every packet falls in the popular region of its PDF, the session is considered normal. Otherwise it is considered an HTTP tunnel session.
But this method is restricted by location. Their other works on tunnel detection[19][20] face the same problem.
Ⅲ. Data Collection
In order to show that our method for HTTP tunnel detection is site independent, we collected data from six different places. The collection environment is shown in [Figure 1]. There are two tunnel servers in our experiments, one in each country, China and Korea. We used computers with public IP addresses as tunnel servers. We tried to simulate a real tunnel environment. Suppose a user wants to tunnel his mail data in HTTP from his company. He would likely set up his home computer as the tunnel server, which is normally not far from his company. In other words, his tunnel server is in his own country. Thus, we set up two tunnel servers and let each country's tunnel clients connect to the local tunnel server.
[Figure 1] Data Collection Environment
There are six tunnel clients. Three are in China: Henan Province (CHHEN), Sichuan Province (CHSIC) and Jiangsu Province (CHJIA). The other three are in Korea: Incheon City (KOINC), Seoul City (KOSEO) and Suwon City (KOSUW). We let friends in these six locations run the HTTP tunnel client program on their computers. The tunnel clients connect to the tunnel servers, which connect to the real mail servers and telnet servers when receiving data requests from the tunnel clients.
During one week, our friends in the six locations accessed HTTP web pages, sent and received mail using different mail accounts, and obtained telnet service from different telnet servers through tunneled HTTP. At the same time, the open source network capture tool Wireshark[21] was running on their computers to capture HTTP data as well as tunneled HTTP data.
In order to guarantee that we collected real HTTP data and real tunneled HTTP data, we let our friends do the experiments manually. For example, to collect HTTP data, they first started Wireshark, set the capture filter to 'port 80', and then started a web browser to access web pages. Moreover, we used different ports for different tunneled services, e.g. port 2500 for the SMTP tunnel and port 11000 for the POP tunnel, so traffic captured on these ports is pure tunneled HTTP data.
These data provide our training datasets and test datasets, as shown in [Table 1] and [Table 2]. Our training datasets are from Incheon, Korea (KOINC) and contain 292 MB of HTTP data, with 439,022 packets and 15,818 sessions. Tunneled SMTP data amount to 168 MB, with 479,808 packets and 1,831 sessions. Tunneled POP and tunneled Telnet data amount to 52 MB (153,049 packets, 871 sessions) and 12 MB (91,211 packets, 93 sessions) respectively.
[Table 1] Training Datasets
[Table 2] Test Datasets
Two terms should be explained here: packet and session. By packet we mean a TCP packet consisting of the Ethernet part, the IP part, the TCP part and possibly a payload. A session is a transmission unit that starts with the TCP three-way handshake and ends with FIN or RST packets. In our experiments, we measured how many sessions were correctly recognized.
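The session definition above can be sketched as follows. This is a simplified illustration, not the tooling used in the experiments: the packet dictionaries, the addresses and the helper names are hypothetical, and a session is treated as finished at the first FIN or RST.

```python
def session_key(pkt):
    """Direction-independent key: the two (ip, port) endpoints, sorted."""
    a = (pkt["src"], pkt["sport"])
    b = (pkt["dst"], pkt["dport"])
    return tuple(sorted((a, b)))


def split_sessions(packets):
    """Group packets into sessions opened by a SYN and closed by FIN or RST.

    Simplification: the session ends at the first FIN or RST, and packets
    that do not belong to an open session are ignored.
    """
    open_sessions = {}
    finished = []
    for pkt in packets:
        key = session_key(pkt)
        flags = pkt["flags"]
        if "SYN" in flags and "ACK" not in flags:
            open_sessions[key] = [pkt]  # first step of the handshake
        elif key in open_sessions:
            open_sessions[key].append(pkt)
            if "FIN" in flags or "RST" in flags:
                finished.append(open_sessions.pop(key))
    return finished


def make_pkt(src, sport, dst, dport, flags):
    return {"src": src, "sport": sport, "dst": dst, "dport": dport, "flags": flags}


CLIENT, SERVER = "10.0.0.2", "192.0.2.10"  # example addresses
trace = [
    make_pkt(CLIENT, 40000, SERVER, 80, {"SYN"}),
    make_pkt(SERVER, 80, CLIENT, 40000, {"SYN", "ACK"}),
    make_pkt(CLIENT, 40000, SERVER, 80, {"ACK"}),
    make_pkt(CLIENT, 40000, SERVER, 80, {"PSH", "ACK"}),
    make_pkt(SERVER, 80, CLIENT, 40000, {"PSH", "ACK"}),
    make_pkt(CLIENT, 40000, SERVER, 80, {"FIN", "ACK"}),
    make_pkt(CLIENT, 40001, SERVER, 80, {"SYN"}),  # never closed, so not reported
]
sessions = split_sessions(trace)
print(len(sessions), len(sessions[0]))
```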
Ⅳ. Support Vector Machines
Support Vector Machines (SVM) are a set of related supervised learning methods which analyze data and recognize patterns, used for statistical classification and regression analysis. Since an SVM is a classifier, given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that predicts whether a new example falls into one category or the other. Intuitively, an SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on[22].
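Given a training set of instance-label pairs $(x_i, y_i)$, $i = 1, \ldots, l$, where $x_i \in R^n$ and $y \in \{1, -1\}^l$,
SVMs require the solution of the following optimization problem:
$$\min_{w, b, \xi} \quad \frac{1}{2} w^T w + C \sum_{i=1}^{l} \xi_i$$
subject to
$$y_i \left( w^T \phi(x_i) + b \right) \ge 1 - \xi_i, \quad \xi_i \ge 0.$$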
Here training vectors $x_i$ are mapped into a higher (maybe infinite) dimensional space by the function $\phi$. SVM finds a linear separating hyperplane with the maximal margin in this higher dimensional space. $C > 0$ is the penalty parameter of the error term. Furthermore, $K(x_i, x_j) \equiv \phi(x_i)^T \phi(x_j)$ is called the kernel function. There are four basic kernels, as below:
• Linear: $K(x_i, x_j) = x_i^T x_j$
• Polynomial: $K(x_i, x_j) = (\gamma x_i^T x_j + r)^d$, $\gamma > 0$
• Radial Basis Function (RBF): $K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2)$, $\gamma > 0$
• Sigmoid: $K(x_i, x_j) = \tanh(\gamma x_i^T x_j + r)$
Here, $\gamma$, $r$ and $d$ are kernel parameters. In our experiment, we used the Radial Basis Function (RBF) as the kernel because of its good performance shown in different applications. When using the RBF kernel, the two parameters $C$ and $\gamma$ must be carefully chosen, as SVM classification accuracy depends on these two parameters.
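As a small worked illustration of the RBF kernel, the following sketch evaluates $\exp(-\gamma \|x - z\|^2)$ for two feature vectors. The function name and the sample vectors are chosen for this example only.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel value exp(-gamma * ||x - z||^2) for equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


# Identical vectors give the maximum similarity of 1.0;
# the value decays toward 0 as the vectors move apart.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5)
apart = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5)  # exp(-0.5 * 2)
print(same, apart)
```

A larger $\gamma$ makes the kernel value fall off faster with distance, which is one reason the parameter has to be tuned together with $C$.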
We used libsvm[23] to find the best $(C, \gamma)$ values, (128, 0.5), with a prediction accuracy of 99.9283% on the training dataset KOINC, as shown in [Figure 2].
Therefore, in the following experiments we also used the parameters $C = 128$ and $\gamma = 0.5$.
[Figure 2] Kernel Function Parameters Selection
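The parameter selection can be sketched as an exhaustive search over an exponentially spaced grid. This sketch assumes the search resembled the grid search tool shipped with libsvm, whose default ranges are $C = 2^{-5}, 2^{-3}, \ldots, 2^{15}$ and $\gamma = 2^{3}, 2^{1}, \ldots, 2^{-15}$; the paper does not state the ranges actually used. The `toy_score` function is a stand-in for cross-validation accuracy, constructed only so that the example runs. It is not measured data.

```python
import math


def exponent_grid(begin, end, step):
    """Return [2**begin, 2**(begin + step), ..., 2**end], end inclusive."""
    values = []
    exponent = begin
    while (step > 0 and exponent <= end) or (step < 0 and exponent >= end):
        values.append(2.0 ** exponent)
        exponent += step
    return values


def grid_search(score, c_values, gamma_values):
    """Return (best_score, C, gamma) for a caller-supplied score(C, gamma)."""
    best = None
    for c in c_values:
        for gamma in gamma_values:
            s = score(c, gamma)
            if best is None or s > best[0]:
                best = (s, c, gamma)
    return best


c_grid = exponent_grid(-5, 15, 2)
gamma_grid = exponent_grid(3, -15, -2)


def toy_score(c, gamma):
    """Placeholder score (assumption): peaks at C = 2**7, gamma = 2**-1."""
    return -abs(math.log2(c) - 7) - abs(math.log2(gamma) + 1)


best_score, best_c, best_gamma = grid_search(toy_score, c_grid, gamma_grid)
print(best_c, best_gamma)
```

The reported pair (128, 0.5) equals $(2^{7}, 2^{-1})$, which lies on this default grid. In a real run, `toy_score` would be replaced by the cross-validation accuracy of an SVM trained with the given parameters.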
Ⅴ. Classification of HTTP Tunnels
5.1 HTTP Tunnel Mechanism
HTTP tunneling is a technique in which other protocols are wrapped in HTTP in order to circumvent local security policies.
Some networks define strict access policies to enhance local network security, such as blocking port 1863 to limit instant chat or blocking port 25 to limit mail sending. In most networks, however, web browsing is allowed, which means port 80 is not blocked by the security policy. HTTP tunnels were invented in order to use other applications that are blocked by local security policies.
Data of other applications can be wrapped in HTTP so that they are disguised as normal HTTP data. In this way, the data can pass the examination of security devices, which are usually located at the boundary of the network. The HTTP tunnel technique normally involves two parts, a tunnel client and a tunnel server. The tunnel client wraps application data into HTTP data and then sends the data to the tunnel server. The tunnel server unwraps the received HTTP data into normal application data and then sends the data to the real destination. The tunnel server acts as a proxy server between the application and the real destination. When the tunnel server receives data from the real destination, it does the same thing as the tunnel client.
[Figure 3] shows an example of GNU HTTP tunnel[2] for SMTP.
[Figure 3] GNU HTTP Tunnel for SMTP
The following instructions might establish such a tunnel:
htc.exe -F 25 tunnel_server:80
hts.exe -F smtp_server:25 80
The first instruction sets the listening port (port 25) of the tunnel client and its forwarding address (tunnel_server:80). Any data received on port 25 will be forwarded to tunnel_server:80. The second instruction tells the tunnel server to listen on port 80 and forward data to smtp_server:25.
The tunnel client htc listens on TCP port 25, which is used to communicate with the SMTP client. Any data from the SMTP client is encapsulated into HTTP data by the tunnel client and then sent to the tunnel server, i.e. tunnel_server:80.
Since the encapsulated data look much like regular HTTP data at first glance, with HTTP 'GET/POST' requests, HTTP responses, and port 80, the firewall or other gateway security devices let them pass according to the policies.
When the tunnel server hts receives the encapsulated data, it performs decapsulation to get the original SMTP data. Then it communicates with the real SMTP server. Any data from the real SMTP server is wrapped into HTTP data and then sent back through the tunnel between the tunnel client and the tunnel server.
After the tunnel client gets data from the tunnel, it decapsulates the data and sends them to the SMTP client.
Usually the tunnel client htc runs on the local user's machine, and the tunnel server hts runs on a machine outside the local network, for example a computer with a public IP address at home.
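The encapsulation and decapsulation steps can be illustrated with a minimal sketch. It is not the wire format of GNU httptunnel: the request path, the headers and both function names are assumptions made for this example, which only shows how application data can ride inside an HTTP message body.

```python
def wrap_in_http(app_data: bytes, host: str, path: str = "/tunnel") -> bytes:
    """Tunnel client side: put raw application data into an HTTP POST body."""
    header = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/octet-stream\r\n"
        f"Content-Length: {len(app_data)}\r\n"
        "\r\n"
    ).encode("ascii")
    return header + app_data


def unwrap_from_http(message: bytes) -> bytes:
    """Tunnel server side: recover the application data from the HTTP body."""
    head, _, body = message.partition(b"\r\n\r\n")
    length = None
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return body[:length]


smtp_command = b"MAIL FROM:<alice@example.com>\r\n"
http_message = wrap_in_http(smtp_command, "tunnel_server")
recovered = unwrap_from_http(http_message)
print(http_message.decode("ascii"))
```

To a device that only checks the port and the request line, `http_message` looks like an ordinary POST, while the tunnel server recovers the original SMTP command unchanged.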
5.2 Classification Features Selection
Our classification features are chosen based on the observation of [Figure 4], which shows the difference between a typical HTTP session and a Telnet session. In [Figure 4] we omit control packets, such as SYN, ACK, FIN and RST packets, because they are related to TCP transmission rather than to the application. In our method, we only consider packets with payload.
[Figure 4] Typical Sessions of HTTP and Telnet
In a typical HTTP session, the client first sends an HTTP 'GET' or 'POST' request packet, usually with a large payload of more than 100 bytes and with the PUSH flag set. According to the TCP RFC[24], the PUSH flag of a packet indicates that the receiver must not wait for more data from the sender and should process the buffered data immediately. After receiving the client's request, the server responds with the requested HTTP data. Normally several large packets with PUSH flags are sent back to the client.
In a typical Telnet session, the server first sends welcome information to the client, then the client authenticates himself with a user name and password. Every character the client types is sent back to the client so that it can be displayed on the client's screen. After authenticating successfully, the client sends commands to the server and gets responses. Normally the response packets from the server are larger than the command packets from the client. Most of the packets in a Telnet session have PUSH flags.
From the above analysis we can see that an HTTP session is quite different from a Telnet session in the number of PUSH packets, the packet size and the number of packets.
A similar situation holds for SMTP and POP sessions. We can also demonstrate this point through 2D scatter plots ([Figure 5] and [Figure 6]). These graphs are drawn from the KOINC training data.
[Figure 5] Number of PUSH Packets from Client to Server
[Figure 6(a)] Minimal Packet Length from Client to Server
[Figure 6(b)] Packet Numbers from Client to Server
From [Figure 5] we can see that the PUSH numbers of normal HTTP sessions are scattered in the range (1, 30). For instance, one HTTP session has 20 PUSH packets, while another HTTP session may have just 2 PUSH packets. Overall, these PUSH numbers are mostly confined to the range of 1 to 30.
POP data are mainly scattered in (30, 60) and SMTP data in (60, 80), while Telnet data are spread between 140 and 311.
[Figure 6(a)] and [Figure 6(b)] show the scatter plots of the minimal packet length and the packet numbers respectively. From these figures we can also see that HTTP differs from HTTP tunnels in packet length and packet numbers.
Therefore, the number of PUSH packets, the packet length and the packet numbers in both the client-to-server and server-to-client directions are selected as our classification features. [Table 3] shows the 16 features in total that we used in our method.
[Table 3] Classification Features
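A minimal sketch of per-session feature extraction is given below. Since the contents of Table 3 are not reproduced here, the sketch computes only a hypothetical subset of the features named in the text: the number of PUSH packets, the number of packets and the minimal packet length in each direction. The packet fields (`from_client`, `length`, `payload_len`, `push`) and the sample values are assumptions made for this example.

```python
def direction_features(packets):
    """PUSH count, packet count and minimal packet length for one direction."""
    lengths = [p["length"] for p in packets]
    return {
        "push_count": sum(1 for p in packets if p["push"]),
        "packet_count": len(packets),
        "min_length": min(lengths) if lengths else 0,
    }


def session_features(session):
    """Features of one session, using only packets that carry payload."""
    payload_packets = [p for p in session if p["payload_len"] > 0]
    c2s = [p for p in payload_packets if p["from_client"]]
    s2c = [p for p in payload_packets if not p["from_client"]]
    features = {}
    for prefix, packets in (("c2s", c2s), ("s2c", s2c)):
        for name, value in direction_features(packets).items():
            features[f"{prefix}_{name}"] = value
    return features


# Hypothetical HTTP-like session: one request, three response packets, one pure ACK.
http_session = [
    {"from_client": True, "length": 474, "payload_len": 420, "push": True},
    {"from_client": False, "length": 1514, "payload_len": 1460, "push": False},
    {"from_client": False, "length": 1514, "payload_len": 1460, "push": False},
    {"from_client": False, "length": 854, "payload_len": 800, "push": True},
    {"from_client": True, "length": 54, "payload_len": 0, "push": False},
]
features = session_features(http_session)
print(features)
```

The pure ACK at the end carries no payload and is therefore ignored, in line with the rule that only packets with payload are considered.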
5.3 Classifier Accuracy Comparison
Besides SVM, we also tried three other classifiers, ZeroR, Naive Bayes and AdaBoost, on the test datasets of Incheon City, Korea (KOINC).
ZeroR, or the 0-R classifier, is a rule classifier. In a rule classifier, different rules are applied to different attributes, and an output is chosen based on these rules. The ZeroR classifier looks at the target attribute and always outputs the value that is most commonly found in that column[25]. It predicts the majority class (if nominal) or the average value (if numeric)[26].
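The behavior of ZeroR for nominal classes can be sketched in a few lines. The label lists below are hypothetical and unrelated to the datasets of this paper; they only show why the accuracy of ZeroR is bounded by the share of the majority class.

```python
from collections import Counter


def zero_r_fit(train_labels):
    """Return the most common label; this single value is the whole model."""
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(predictions, truth):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for p, t in zip(predictions, truth) if p == t)
    return correct / len(truth)


train_labels = ["http"] * 5 + ["smtp"] * 3 + ["pop"] * 2  # hypothetical
test_labels = ["http", "smtp", "telnet", "http"]          # hypothetical

majority = zero_r_fit(train_labels)
predictions = [majority] * len(test_labels)  # features are ignored entirely
score = accuracy(predictions, test_labels)
print(majority, score)
```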
Naive Bayes is based on applying Bayes' theorem with strong independence assumptions. It assumes that the presence (or absence) of a particular feature of a class is unrelated to the presence (or absence) of any other feature. This classification method analyzes the relationship between the instances of each class and each attribute to obtain a conditional probability for the relationship between the attributes and the class. An advantage of the Naive Bayes classifier is that it requires only a small amount of training data to estimate the parameters (means and variances of the variables) necessary for classification. More information can be found in [27].
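The idea can be sketched with a Gaussian Naive Bayes classifier, which estimates one mean and one variance per feature and class. This is an illustrative implementation under the assumption of normally distributed features, not the classifier used in the experiments, and the two-feature samples are made up for the example.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(samples, labels, var_floor=1e-9):
    """Estimate log prior, per-feature means and variances for each class."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    total = len(samples)
    model = {}
    for y, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor  # floor avoids zero variance
            for col, m in zip(columns, means)
        ]
        model[y] = (math.log(n / total), means, variances)
    return model


def predict_gaussian_nb(model, x):
    """Return the class with the highest log posterior for feature vector x."""
    def log_posterior(y):
        log_prior, means, variances = model[y]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
        return log_prior + log_likelihood
    return max(model, key=log_posterior)


# Hypothetical samples: [number of PUSH packets, number of packets].
train_x = [[2, 5], [4, 8], [3, 6], [150, 300], [200, 310], [180, 290]]
train_y = ["http", "http", "http", "telnet", "telnet", "telnet"]
model = fit_gaussian_nb(train_x, train_y)
first = predict_gaussian_nb(model, [3, 7])
second = predict_gaussian_nb(model, [170, 305])
print(first, second)
```

The per-feature sum in `log_posterior` is where the independence assumption enters: each feature contributes its own term, without any interaction between features.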
AdaBoost, or Adaptive Boosting, is a meta-algorithm which can be used in conjunction with many other learning algorithms to improve their performance. It incrementally constructs a complex classifier by combining possibly hundreds of simple classifiers using a voting scheme. AdaBoost is simple to implement and is known to work well on very large sets of features by selecting the features required for good classification[28]. Detailed information can be found in [29].
The experimental results are shown in [Table 4]. We found that ZeroR has the poorest performance, with an accuracy of only 28.99%, while SVM achieves an accuracy of 97.97%, the best among all of them. In addition, SVM and Naive Bayes have the shortest training times. Compared with the gain in accuracy, the 0.03 second difference in training time between SVM and Naive Bayes can be ignored. Therefore, SVM was chosen as the classifier for our method.
[Table 4] Classifier Accuracy and Training Time
5.4 Training Size
We also conducted an experiment on how many samples SVM needs to reach a high accuracy. [Figure 7] shows the results.
From [Figure 7] we can see that the SVM algorithm achieves acceptable results (about 96%) with 10 training samples, i.e. 10 sessions from the training datasets, and very promising results (almost 100%) with 100 training samples. Thus, we chose 100 training samples in the following experiments.
[Figure 7] Accuracy of Different Training Size
Ⅵ. Experimental Results
We tested six datasets from China and Korea and obtained the results shown in [Figure 8]. Most of them are quite accurate, about 99%; only one is below 95%. The training dataset is from Incheon, Korea, but the test datasets are from different locations in Korea and China. We also tried using KOSEO and CHSIC as the training dataset and tested the five other datasets; similar results were obtained.
Therefore we conclude that our proposed method can detect HTTP tunnels without location restrictions, which means it is site independent. Once trained, the proposed method can work on other sites with high accuracy.
We also compared our experimental results with M. Crotti et al.'s Protocol Fingerprints method (FP)[18] in [Figure 9] and [Figure 10].
[Figure 8] Experimental Results
[Figure 9] Comparison Results: HTTP and SMTP
[Figure 10] Comparison Results: POP and Telnet
[Figure 9] shows the comparison between our HTTP accuracy and the FP accuracy, as well as the SMTP comparison. From this figure we can see that on dataset KOINC both methods perform quite well, achieving an accuracy of about 100%. This is because the training dataset and the classification dataset are from the same place, Incheon, Korea. This is consistent with the experimental results of the FP method[18].
When we tested both methods with datasets from different places, differences appeared, as illustrated in [Figure 9]. For both HTTP and SMTP, the FP method achieves lower accuracy than our method on the datasets from the five other places. FP accuracy on the five datasets other than KOINC is around 92%, while our accuracy is around 98%. This confirms that the FP method is location restricted, while our method is not restricted to a single location.
The same situation occurred when we tested POP and Telnet using both methods. The comparison results are shown in [Figure 10]. When trained and tested with datasets from the same place, both methods get good results. When tested with different datasets, our method is superior to the FP method, except for one point for Telnet in KOSUW.
Ⅶ. Conclusion
We proposed an HTTP tunnel detection method based on statistical mechanisms. We conducted experiments in which our method was trained with datasets from one location and tested with datasets from six different locations in two nations. In comparison with the existing methods, the experimental results showed that our proposed method is site independent. It needs to be trained only once and can be applied to other networks without further training. In addition, the experimental results showed that our proposed method achieves high accuracy in HTTP tunnel detection. Furthermore, since it needs very few training samples, i.e. 100, the training time is quite short. This gives our proposed method another advantage in deployment. Currently we have tested our proposed method offline. Online HTTP tunnel detection is left as future work.
References
- Julius Plenz, "DNS tunnel," http://www.dnstunnel.de/, Aug. 2010.
- Lars Brinkhoff, "GNU httptunnel," http://www.nocrew.org/software/httptunnel.html, Aug. 2010.
- Richard Mills, "The Linux Academy HTTP Tunnel," http://the-linux-academy.co.uk/downloads.htm, Aug. 2010.
- Internet Assigned Numbers Authority, http://www.iana.org/assignments/port-numbers, Aug. 2010.
- A. Madhukar and C. Williamson, "A longitudinal study of P2P traffic classification," 14th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, pp. 179-188. Sep. 2006.
- Lawrence Berkeley National Laboratory, "Bro Intrusion Detection System", http://www.bro-ids.org/, Aug. 2010.
- V. Paxson, "Bro: A system for detecting network intruders in real-time," Computer Networks, no. 31(23-24), pp. 2435-2463, Dec. 1999. https://doi.org/10.1016/S1389-1286(99)00112-7
- Sourcefire, "Snort IDS/IPS," http://www.snort.org/, Aug. 2010.
- T. Choi, C. Kim, S. Yoon, J. Park, B. Lee, H. Kim, and H. Chung, "Content-aware internet application traffic measurement and analysis," Proceedings of the 2004 IEEE/IFIP Network Operations and Management Symposium (NOMS'04), vol. 1, pp. 511-524, Apr. 2004.
- S. Sen, O. Spatscheck, and D. Wang, "Accurate, scalable in-network identification of p2p traffic using application signatures," Proceedings of the 13th international conference on World Wide Web, pp. 512-521. May 2004.
- P. Haffner, S. Sen, O. Spatscheck, and D. Wang, "ACAS: Automated construction of application signatures," SIGCOMM MineNet Workshop, pp. 107-202, Aug. 2005.
- A. Moore, and K. Papagiannaki, "Toward the accurate identification of network applications," Proceedings of 6th passive active measurement workshop (PAM), vol. 3431, pp. 41-54, Apr. 2005.
- V. Paxson, "Empirically derived analytic models of wide-area TCP connections," IEEE/ACM Transactions on Networking, vol. 2, no. 4, pp. 316-336, Aug. 1994. https://doi.org/10.1109/90.330413
- M. Roughan, S. Sen, O. Spatscheck, and N. Duffield, "Class-of-service mapping for QoS: A statistical signature-based approach to IP traffic classification," Proceedings of ACM/SIGCOMM Internet Measurement Conference (IMC) 2004, Taormina, Sicily, Italy, pp. 135-148, Oct. 2004.
- J. Erman, A. Mahanti, M. Arlitt, I. Cohen, and C. Williamson, "Semisupervised network traffic classification," ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS) Performance Evaluation Review, vol. 35, no. 1, pp. 369-370, Jun. 2007.
- J. Erman, A. Mahanti, M. Arlitt, and C. Williamson, "Identifying and Discriminating Between Web and Peer-to-Peer traffic in the Network Core," Proceedings of the 16th International World Wide Web Conference (WWW), Banff, Canada, pp. 883-892, May 2007.
- Samruay Kaoprakhon, and Vasaka Visoottiviseth, "Classification of audio and video traffic over HTTP protocol," Proceedings of the 9th international conference on Communications and information technologies, Incheon, Korea, pp. 1534-1539, Sep. 2009.
- M. Crotti, M. Dusi, F. Gringoli, and L. Salgarelli, "Detecting HTTP tunnels with statistical mechanisms," Proceedings of the 42nd IEEE International Conference on Communications (ICC 2007), Glasgow, Scotland, pp. 6162-6168, Jun. 2007.
- M. Crotti, M. Dusi, F. Gringoli, and L. Salgarelli, "Traffic classification through simple statistical fingerprinting," ACM SIGCOMM Computer Communication Review, vol. 37, issue 1, pp. 7-16, Jan. 2007.
- M. Dusi, M. Crotti, F. Gringoli, and L. Salgarelli, "Tunnel Hunter: Detecting application-layer tunnels with statistical fingerprinting," Computer Networks: The International Journal of Computer and Telecommunications Networking, vol. 53, issue 1, pp. 81-97, Jan. 2009.
- Wireshark Foundation, "Wireshark", http://www.wireshark.org/, Aug. 2010.
- C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, vol. 2, pp. 121-167, Jun. 1998. https://doi.org/10.1023/A:1009715923555
- C.C. Chang and C.J. Lin, "LIBSVM: a library for support vector machines," http://www.csie.ntu.edu.tw/~cjlin/libsvm/, Aug. 2010.
- Jon Postel, "Transmission Control Protocol," RFC793, Sep. 1981.
- Robert Russo, "Bayesian and Neural Networks for Motion Picture Recommendation," Ph.D thesis, Boston College. May 2006.
- I.H. Witten, and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, 1st Ed., Morgan Kaufmann, San Francisco, CA, ISBN: 9781558605527, Oct. 2005.
- Irina Rish, "An empirical study of the naive Bayes classifier," IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence, pp. 41-46, Jan. 2001.
- Riyad Alshammari, and A. Nur Zincir-Heywood, "Machine learning based encrypted traffic classification: identifying SSH and skype," Proceedings of the Second IEEE international conference on Computational intelligence for security and defense applications, Ottawa, Ontario, Canada, pp. 1-8, Jul. 2009.
- Yoav Freund, and Robert E. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting," Proceedings of the Second European Conference on Computational Learning Theory, pp. 23-37, Mar. 1995.