• Title/Summary/Keyword: UNSW-NB 15 Dataset

Malicious Traffic Classification Using Mitre ATT&CK and Machine Learning Based on UNSW-NB15 Dataset (마이터 어택과 머신러닝을 이용한 UNSW-NB15 데이터셋 기반 유해 트래픽 분류)

  • Yoon, Dong Hyun;Koo, Ja Hwan;Won, Dong Ho
    • KIPS Transactions on Software and Data Engineering / v.12 no.2 / pp.99-110 / 2023
  • This study proposes classifying malicious network traffic using the MITRE ATT&CK cyber threat framework and machine learning to address the real-time traffic detection problems faced by current security monitoring systems. We mapped the labels of the UNSW-NB15 network traffic dataset to the MITRE ATT&CK framework and generated the final dataset through rare class processing. After training several boosting-based ensemble models on this final dataset, we evaluated how well they classify network traffic using various performance metrics. Based on the F1 score, XGBoost without rare class processing performed best in the multi-class traffic setting. Machine learning ensemble models built on MITRE ATT&CK label conversion and oversampling differ from existing studies, but they have limitations due to (1) the inability to map existing dataset labels to MITRE ATT&CK labels perfectly and (2) the presence of excessively sparse classes. Nevertheless, CatBoost with B-SMOTE achieved a classification accuracy of 0.9526, which suggests it could be used to automatically detect normal and abnormal network traffic.
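The abstract ranks models by F1 score rather than accuracy, which matters when some classes are rare. The sketch below is an illustration only: it assumes macro-averaged F1 (the abstract does not say which averaging was used), and the labels `normal` and `rare` are hypothetical, not classes from the paper.

```python
from collections import Counter

def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)

# Hypothetical toy labels: a model that never predicts the rare class.
y_true = ["normal"] * 8 + ["rare"] * 2
y_pred = ["normal"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)                  # 0.8
print(macro_f1(y_true, y_pred))  # about 0.444
```

In this toy case accuracy looks acceptable while macro F1 exposes the missed rare class, which is one reason a multi-class traffic study might prefer F1.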

Classification Performance Improvement of UNSW-NB15 Dataset Based on Feature Selection (특징선택 기법에 기반한 UNSW-NB15 데이터셋의 분류 성능 개선)

  • Lee, Dae-Bum;Seo, Jae-Hyun
    • Journal of the Korea Convergence Society / v.10 no.5 / pp.35-42 / 2019
  • With the spread of the Internet and various wearable devices, Internet technology has made it more convenient to obtain information and do business. However, as the Internet is used in more areas, the attack surface exposed to attackers is growing, and attempts to intrude into networks for unfair advantage, such as cyber terrorism, are also increasing. In this paper, we propose a feature selection method that improves classification performance when classifying abnormal behavior in network traffic. The UNSW-NB15 dataset has a class imbalance problem in which rare classes have relatively few instances compared to other classes, and an undersampling method is used to address it. We use the SVM, k-NN, and decision tree algorithms and extract the feature subset with superior detection accuracy and RMSE through training and verification. The subset achieved recall values above 98% in the wrapper-based experiments, and DT_PSO showed the best performance.

Machine Learning Based Hybrid Approach to Detect Intrusion in Cyber Communication

  • Neha Pathak;Bobby Sharma
    • International Journal of Computer Science & Network Security / v.23 no.11 / pp.190-194 / 2023
  • Given the importance of communication, data delivery, and data access across governmental, business, and individual sectors, it is essential to identify faults and flaws in cyber communication. Cyber security is needed to protect personal, governmental, and business data from misuse by numerous advanced attacks. Information security provides extensive protection to both the host machine and the network. Learning methods are used to analyze and prevent various attacks. Machine learning, a branch of Artificial Intelligence, offers promising techniques for detecting cyber-attacks. In the proposed methodology, the Decision Tree (DT), a supervised learning model, is combined with different cross-validation methods to measure the accuracy and execution time of identifying cyber-attacks in UNSW-NB15, a very recent dataset of network traffic covering different network attack activities. It is a hybrid method in which different attribute selection measures of the DT model, including the Gini Index and Entropy, are implemented separately to identify the most accurate procedure for detecting intrusions with respect to execution time. The DT variants implemented are DT using the Gini Index, DT using the train-split method, and DT using information entropy, each with subdivisions using K-Fold validation and Stratified K-Fold validation.
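The two impurity measures and the stratified splitting named in the abstract can be sketched in a few lines. This is a minimal illustration using the standard textbook definitions, not the authors' implementation; the function names and the round-robin fold assignment are assumptions made for the example, and the label lists are assumed to be non-empty.

```python
import math
from collections import Counter, defaultdict

def gini(labels):
    """Gini index of a non-empty label list: 1 - sum(p_i^2)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits of a non-empty label list: -sum(p_i * log2 p_i)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def stratified_folds(labels, k):
    """Assign sample indices to k folds, dealing each class out round-robin
    so every fold keeps roughly the overall class proportions."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for j, i in enumerate(indices):
            folds[j % k].append(i)
    return folds

# Hypothetical toy labels: 4 attack flows and 8 normal flows.
labels = ["attack"] * 4 + ["normal"] * 8
print(gini(["attack", "attack", "normal", "normal"]))     # 0.5
print(entropy(["attack", "attack", "normal", "normal"]))  # 1.0
for fold in stratified_folds(labels, k=4):
    print(Counter(labels[i] for i in fold))  # 1 attack and 2 normal per fold
```

Plain K-Fold would split by position alone, so a fold could contain no attack samples at all; stratification avoids that on imbalanced data.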

Intrusion Detection System based on Packet Payload Analysis using Transformer

  • Woo-Seung Park;Gun-Nam Kim;Soo-Jin Lee
    • Journal of the Korea Society of Computer and Information / v.28 no.11 / pp.81-87 / 2023
  • Intrusion detection systems that learn from the metadata of network packets have been proposed recently. However, these approaches require time to analyze packets and generate metadata for model training, and further time to pre-process the metadata before training. In addition, models trained on specific metadata cannot detect intrusions directly from the original packets flowing into the network. To address this problem, this paper proposes a natural language processing-based intrusion detection system that detects intrusions by learning each packet payload as a single sentence, without an additional conversion process. To verify the performance of our approach, we used the UNSW-NB15 dataset and Transformer models. First, the PCAP files of the dataset were labeled, and then two Transformer models (BERT and DistilBERT) were trained directly on the payloads in sentence form to analyze detection performance. The experimental results showed binary classification accuracies of 99.03% and 99.05%, respectively, which is similar or superior to the detection performance of techniques proposed in previous studies. Multi-class classification showed better performance with 86.63% and 86.36%, respectively.

Enhancing cloud computing security: A hybrid machine learning approach for detecting malicious nano-structures behavior

  • Xu Guo;T.T. Murmy
    • Advances in nano research / v.15 no.6 / pp.513-520 / 2023
  • The exponential proliferation of cutting-edge computing technologies has spurred organizations to outsource their data and computational needs. In the realm of cloud-based computing environments, ensuring robust security, encompassing principles such as confidentiality, availability, and integrity, stands as an overarching imperative. Elevating security measures beyond conventional strategies hinges on a profound comprehension of malware's multifaceted behavioral landscape. This paper presents an innovative paradigm aimed at empowering cloud service providers to adeptly model user behaviors. Our approach harnesses the power of a Particle Swarm Optimization-based Probabilistic Neural Network (PSO-PNN) for detection and recognition processes. Within the initial recognition module, user behaviors are translated into a comprehensible format, and the identification of malicious nano-structures behaviors is orchestrated through a multi-layer neural network. Leveraging the UNSW-NB15 dataset, we meticulously validate our approach, effectively characterizing diverse manifestations of malicious nano-structures behaviors exhibited by users. The experimental results unequivocally underscore the promise of our method in fortifying security monitoring and the discernment of malicious nano-structures behaviors.

Improving prediction performance of network traffic using dense sampling technique (밀집 샘플링 기법을 이용한 네트워크 트래픽 예측 성능 향상)

  • Jin-Seon Lee;Il-Seok Oh
    • Smart Media Journal / v.13 no.6 / pp.24-34 / 2024
  • If the future can be predicted from network traffic data, which is a time series, benefits such as efficient resource allocation, prevention of malicious attacks, and energy savings can be achieved. Many models based on statistical and deep learning techniques have been proposed, and most of these studies have focused on improving model structures and learning algorithms. Another way to improve a model's prediction performance is to obtain good-quality data. To that end, this paper applies a dense sampling technique, which augments time series data, to network traffic prediction and analyzes the resulting performance improvement. The dataset is UNSW-NB15, which is widely used for network traffic analysis. Performance is measured using RMSE, MAE, and MAPE. To make the measurement more objective, each experiment is run independently 10 times, and the performance of the existing sparse sampling and of dense sampling is compared using box plots. Across different window sizes and horizon factors, dense sampling consistently performed better.
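The abstract does not define dense sampling precisely. The sketch below assumes one common reading: sparse sampling cuts the series into non-overlapping (input, target) pairs, while dense sampling slides the same window one step at a time and so yields many more training samples from the same data. The function `make_windows`, its parameters, and the reading of "horizon" as the number of steps predicted are assumptions for illustration. The three error metrics follow their standard definitions.

```python
import math

def make_windows(series, window, horizon, stride):
    """Slice a series into (input, target) pairs.

    input  = `window` consecutive values
    target = the `horizon` values that follow
    stride = step between the starts of successive samples
    """
    samples = []
    last_start = len(series) - window - horizon
    for start in range(0, last_start + 1, stride):
        x = series[start:start + window]
        y = series[start + window:start + window + horizon]
        samples.append((x, y))
    return samples

def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error; assumes no actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

series = list(range(1, 13))  # hypothetical traffic counts, 12 time steps
window, horizon = 3, 1

sparse = make_windows(series, window, horizon, stride=window + horizon)
dense = make_windows(series, window, horizon, stride=1)
print(len(sparse), len(dense))  # 3 9
print(dense[0])                 # ([1, 2, 3], [4])
```

Under this reading the dense variant triples the sample count on the toy series without collecting any new traffic, which is the sense in which it acts as data augmentation.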

Experimental Comparison of Network Intrusion Detection Models Solving Imbalanced Data Problem (데이터의 불균형성을 제거한 네트워크 침입 탐지 모델 비교 분석)

  • Lee, Jong-Hwa;Bang, Jiwon;Kim, Jong-Wouk;Choi, Mi-Jung
    • KNOM Review / v.23 no.2 / pp.18-28 / 2020
  • With the development of the virtual community, the benefits that IT provides to people in fields such as healthcare, industry, communication, and culture are increasing, and quality of life is improving. At the same time, various malicious attacks target this developed network environment. Firewalls and intrusion detection systems exist to detect these attacks in advance, but they are limited in detecting malicious attacks that evolve day by day. To solve this problem, intrusion detection research using machine learning is being actively conducted, but false positives and false negatives occur due to imbalance in the training dataset. In this paper, a Random Oversampling method is used to address the imbalance of the UNSW-NB15 dataset used for network intrusion detection. Through experiments, we compared and analyzed the accuracy, precision, recall, F1-score, training and prediction time, and hardware resource consumption of the models. Building on this study, we plan to study more efficient network intrusion detection models using other methods and high-performance models that can solve the imbalanced data problem.
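Random oversampling, as named in the abstract, can be sketched as follows. This is a generic illustration of the technique, not the authors' code: the function name, the choice to balance every class up to the size of the largest class, and the toy rows are all assumptions.

```python
import random
from collections import Counter, defaultdict

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, y in zip(rows, labels):
        by_class[y].append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for y, members in by_class.items():
        # Sample with replacement from the existing rows of this class.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(y)
    return out_rows, out_labels

# Hypothetical toy data: 8 normal flows and 2 attack flows.
rows = [[i] for i in range(10)]
labels = ["normal"] * 8 + ["attack"] * 2

new_rows, new_labels = random_oversample(rows, labels)
print(Counter(new_labels))  # both classes now have 8 rows
```

Because the added rows are exact copies, oversampling is normally applied to the training split only, so that duplicated samples do not leak into the evaluation data.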