Utilising Sampling Techniques to Enhance Network Intrusion Detection System (NIDS) Performance in Imbalanced Data Scenarios

Dicko, Idrissa Mahamoudou; Rageb, Sabry Said Sabry; Kindzeka, Shalanyuy Nabil

URI: http://hdl.handle.net/123456789/2374

Date: 2024-06-19

Abstract:

In contemporary cybersecurity research, the fusion of sampling techniques with state-of-the-art Machine Learning (ML) and Deep Learning (DL) models has emerged as a pivotal area of exploration, aimed at enhancing the efficacy of Intrusion Detection Systems (IDS). This thesis delves into the intersection of sampling methodologies and advanced learning algorithms to address the challenges posed by the class imbalance prevalent in network intrusion datasets. Class imbalance, a common issue in IDS datasets, often leads to suboptimal performance because models tend to be biased towards the majority class, compromising their ability to detect instances of the minority class, which typically represents intrusions. The proposed research harnesses sampling techniques, encompassing oversampling, undersampling, and hybrid approaches, to rectify this imbalance and create a more representative learning environment. Sampling methods are strategically combined with ML models such as Support Vector Machines (SVM), Random Forests, and k-Nearest Neighbors, along with DL models such as Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks. This combination seeks to leverage the capabilities of these models to uncover complex structures and connections within the data. Central to our research is the introduction of Feature Relevance and Adaptive Oversampling (FAAO), a novel approach that combines feature relevance assessment with adaptive oversampling to address class imbalance. FAAO evaluates the importance of different features to identify those most influential in distinguishing between classes, ensuring that the oversampling process focuses on the most relevant features and improves the quality of the synthetic samples. Our primary objectives include exploring various sampling techniques, including FAAO, and their impact on the performance of ML- and DL-based IDS models. 
We will implement and assess the effectiveness of these techniques in mitigating class imbalance, thereby enhancing the models' overall detection accuracy, sensitivity, and specificity. Furthermore, this study seeks to provide insights into the optimal pairing of sampling techniques with specific ML and DL architectures, while paying equal attention to feature relevance, considering the inherent characteristics of intrusion detection datasets. The findings are anticipated to provide valuable guidelines for practitioners and researchers seeking to deploy robust and adaptive IDS solutions in real-world scenarios. By outlining the collaborative relationship between feature relevance, sampling techniques, and advanced learning models, our work endeavors to pave the way for more adaptive, resilient, and accurate intrusion detection mechanisms, ultimately fortifying the cybersecurity landscape against evolv
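
As an illustration of the class-imbalance problem and the basic resampling ideas the abstract refers to, the following is a minimal sketch in plain Python. It is not the thesis's implementation and it is not FAAO: it shows only generic random oversampling and random undersampling, plus the sensitivity and specificity metrics. The function names, the toy 95/5 split, and the labels "normal" and "attack" are assumptions made for this example.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Group feature rows by their class label."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        out_rows.extend(rng.sample(members, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


def sensitivity_specificity(y_true, y_pred, positive="attack"):
    """Return (sensitivity, specificity), treating `positive` as the intrusion class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Toy, made-up dataset: 95 "normal" flows and 5 "attack" flows.
rows = [[i] for i in range(100)]
labels = ["normal"] * 95 + ["attack"] * 5

# A model that always predicts the majority class is 95% accurate here,
# yet its sensitivity (intrusion detection rate) is zero.
majority_only = ["normal"] * len(labels)
print(sensitivity_specificity(labels, majority_only))

over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)
print(Counter(over_labels))
print(Counter(under_labels))
```

In practice, resampling of this kind is usually applied to the training split only, so that the evaluation data keeps the original class distribution.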

Description:

Supervised by Mr. Faisal Hussain, Assistant Professor, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur, Bangladesh. This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2024.
