The 0-day attack is a cyber-attack based on vulnerabilities that have not yet been published. The detection of anomalous traffic generated by such attacks is vital, as it can represent a critical problem, both in a technical and economic sense, for a smart enterprise as for any system largely dependent on technology. To predict this kind of attack, one solution can be to use unsupervised machine learning approaches, as they guarantee the detection of anomalies regardless of their prior knowledge. It is also essential to identify the anomalous and unknown behaviors that occur within a network in near real-time. Three different approaches have been proposed and benchmarked in exactly the same condition: Deep Autoencoding with GMM and Isolation Forest, Deep Autoencoder with Isolation Forest, and Memory Augmented Deep Autoencoder with Isolation Forest. These approaches are thus the result of combining different unsupervised algorithms. The results show that the addition of the Isolation Forest improves the accuracy values and increases the inference time, although this increase does not represent a relevant problematic factor. This paper also explains the features that the various models consider most important for classifying an event as an attack using the explainable artificial intelligence methodology called Shapley Additive Explanations (SHAP). Experiments were conducted on KDD99, NSL-KDD, and CIC-IDS2017 datasets.

Combining Unsupervised Approaches for Near Real-Time Network Traffic Anomaly Detection

Vincenzo Dentamaro;Stefano Galantucci
;
Donato Impedovo;Giuseppe Pirlo
2022-01-01

Abstract

The 0-day attack is a cyber-attack based on vulnerabilities that have not yet been published. The detection of anomalous traffic generated by such attacks is vital, as it can represent a critical problem, both in a technical and economic sense, for a smart enterprise as for any system largely dependent on technology. To predict this kind of attack, one solution can be to use unsupervised machine learning approaches, as they guarantee the detection of anomalies regardless of their prior knowledge. It is also essential to identify the anomalous and unknown behaviors that occur within a network in near real-time. Three different approaches have been proposed and benchmarked in exactly the same condition: Deep Autoencoding with GMM and Isolation Forest, Deep Autoencoder with Isolation Forest, and Memory Augmented Deep Autoencoder with Isolation Forest. These approaches are thus the result of combining different unsupervised algorithms. The results show that the addition of the Isolation Forest improves the accuracy values and increases the inference time, although this increase does not represent a relevant problematic factor. This paper also explains the features that the various models consider most important for classifying an event as an attack using the explainable artificial intelligence methodology called Shapley Additive Explanations (SHAP). Experiments were conducted on KDD99, NSL-KDD, and CIC-IDS2017 datasets.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11586/410831
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 17
  • ???jsp.display-item.citation.isi??? 14
social impact