[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.7472/jksii.2017.18.1.65

A Detection Model using Labeling based on Inference and Unsupervised Learning Method

Hong, Sung-Sam (Department of Computer Engineering, Gachon University)
Kim, Dong-Wook (Department of Computer Engineering, Gachon University)
Kim, Byungik (Department of Security R&D Team 1, Korea Internet & Security Agency)
Han, Myung-Mook (Department of Computer Engineering, Gachon University)

Publication Information

Journal of Internet Computing and Services / v.18, no.1, 2017, pp. 65-75

Abstract

A detection model finds results for a specific purpose using artificial intelligence, data mining, and intelligent algorithms. In cyber security, such models are typically used to detect intrusions, malware, cyber incidents, and attacks. Large amounts of unlabeled data are collected in real environments, security data among them. Because most of these data have no defined class labels, it is difficult to know the type of each record. A label determination process is therefore required for accurate detection and analysis. In this paper, we propose KDFL (K-means and D-S Fusion based Labeling), a method that uses D-S (Dempster-Shafer) inference and the k-means (unsupervised) algorithm to decide the labels of data records by fusion, together with a detection model architecture that uses the proposed labeling method. The proposed method showed a better detection rate, accuracy, and F1-measure than other methods. It also showed an improved error rate, which verifies the good performance of the proposed method.
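The abstract does not spell out how KDFL fuses its evidence, so the following is only a minimal sketch of the standard Dempster rule of combination that D-S inference rests on, not the authors' implementation. The two-label frame (`normal`, `attack`), the source names `cluster_evidence` and `second_evidence`, and all mass values are hypothetical and chosen purely for illustration.

```python
from itertools import product


def combine(m1, m2):
    """Fuse two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset of hypotheses to a mass in [0, 1],
    and the masses of one function sum to 1.
    """
    fused = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence supports the intersection of both sets.
            fused[inter] = fused.get(inter, 0.0) + wa * wb
        else:
            # Disjoint sets contribute to the conflict mass.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Normalize by the non-conflicting mass.
    return {h: w / (1.0 - conflict) for h, w in fused.items()}


NORMAL = frozenset({"normal"})
ATTACK = frozenset({"attack"})
EITHER = NORMAL | ATTACK  # mass left on "unknown"

# Hypothetical masses: one source could be derived from a k-means cluster
# assignment, the other from any second evidence source.
cluster_evidence = {ATTACK: 0.6, NORMAL: 0.1, EITHER: 0.3}
second_evidence = {ATTACK: 0.5, NORMAL: 0.2, EITHER: 0.3}

fused = combine(cluster_evidence, second_evidence)

# Pick the single-label hypothesis with the largest fused mass.
label = max((h for h in fused if len(h) == 1), key=fused.get)
print(sorted(label), round(fused[label], 3))
```

In this sketch a record receives the label whose singleton hypothesis holds the most fused mass. How the masses themselves would be derived from clustering output is left open here, since the abstract gives no detail on that step.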

Keywords

Labeling; Detection Model based on classification; Data Mining; Inference; Supervised/Unsupervised Learning; Security;

