http://dx.doi.org/10.7472/jksii.2017.18.1.65

A Detection Model using Labeling based on Inference and Unsupervised Learning Method  

Hong, Sung-Sam (Department of Computer Engineering, Gachon University)
Kim, Dong-Wook (Department of Computer Engineering, Gachon University)
Kim, Byungik (Department of Security R&D Team 1, Korea Internet & Security Agency)
Han, Myung-Mook (Department of Computer Engineering, Gachon University)
Publication Information
Journal of Internet Computing and Services / v.18, no.1, 2017, pp. 65-75
Abstract
A detection model uses artificial intelligence, data mining, and other intelligent algorithms to find results for a specific purpose. In cyber security, such models are typically used to detect intrusions, malware, cyber incidents, and attacks. Data collected in real environments, such as security data, is largely unlabeled. Because most records have no class label, it is difficult to know what type of data they represent, so a label determination process is required for accurate detection and analysis. In this paper, we propose KDFL (K-means and D-S Fusion based Labeling), a method that fuses D-S inference with the unsupervised k-means algorithm to decide the label of each data record, together with a detection model architecture that uses the proposed labeling method. The proposed method showed a better detection rate, accuracy, and F1-measure than other methods. It also showed an improved error rate, which supports the performance of the proposed method.
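The abstract does not spell out how the fusion step works, so the following is only a minimal sketch of the general idea, assuming "D-S" refers to Dempster-Shafer evidence theory and that fusion uses Dempster's rule of combination. The two evidence sources, their mass values, and the names `combine` and `decide_label` are hypothetical illustrations, not the authors' KDFL implementation.

```python
from itertools import product


def combine(m1, m2):
    """Fuse two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset of hypotheses to a belief mass,
    and the masses of one function sum to 1.
    """
    fused = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of both sets.
            fused[common] = fused.get(common, 0.0) + wa * wb
        else:
            # Disjoint sets contradict each other; track the conflict mass.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Normalize by the non-conflicting mass so the result sums to 1.
    return {h: w / (1.0 - conflict) for h, w in fused.items()}


def decide_label(mass):
    """Pick the single-class hypothesis with the largest fused mass."""
    singletons = {h: w for h, w in mass.items() if len(h) == 1}
    best = max(singletons, key=singletons.get)
    return next(iter(best))


NORMAL = frozenset({"normal"})
ATTACK = frozenset({"attack"})
EITHER = NORMAL | ATTACK  # mass on the whole frame expresses uncertainty

# Hypothetical evidence for one unlabeled record: one mass function derived
# from its k-means cluster, one from a second (assumed) evidence source.
m_cluster = {ATTACK: 0.6, NORMAL: 0.1, EITHER: 0.3}
m_other = {ATTACK: 0.5, NORMAL: 0.2, EITHER: 0.3}

fused = combine(m_cluster, m_other)
label = decide_label(fused)
print(label, fused)
```

In this sketch, the mass left on `EITHER` lets each source express uncertainty rather than forcing a hard vote, which is the usual motivation for evidence-theoretic fusion over simple majority labeling.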
Keywords
Labeling; Detection Model based on classification; Data Mining; Inference; Supervised/Unsupervised Learning; Security;