Improving Cyber-Threat Detection by Moving the Boundary Around the Normal Samples
Andresini G.; Appice A.; Caforio F. P.; Malerba D.
2021-01-01
Abstract
Recent research recognises deep learning as an important approach in cybersecurity. Deep learning makes it possible to learn accurate threat detection models in a variety of scenarios, but it often over-fits the training data. In this paper, we propose a supervised machine learning method for cyber-threat detection that modifies the training set to reduce over-fitting when training a deep neural network. The method re-positions the decision boundary that separates the normal training samples from the threats. In particular, it re-assigns the normal training samples that lie close to the boundary to the opposite class, and then trains a competitive deep neural network on the modified training set. In this way, it learns a classification model that can detect unseen threats whose behaviour resembles that of normal samples. Experiments on three benchmark datasets support the effectiveness of the proposed method and yield encouraging results, including in comparison with several prominent competitors.
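
The re-assignment step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how closeness to the boundary is measured or how many samples are re-assigned, so everything specific here is an assumption made for illustration: the function name `relabel_near_boundary`, the use of a threat probability from some initial model as the score, the 0.5 boundary, and the `fraction` parameter are all hypothetical and are not taken from the paper. Training the deep neural network on the modified labels is left out.

```python
from typing import List, Sequence

NORMAL, THREAT = 0, 1  # hypothetical label encoding


def relabel_near_boundary(
    threat_probs: Sequence[float],
    labels: Sequence[int],
    fraction: float = 0.1,
    boundary: float = 0.5,
) -> List[int]:
    """Return a copy of `labels` in which the normal samples nearest the
    decision boundary are re-assigned to the threat class.

    Assumption: `threat_probs[i]` is a threat probability for sample i
    produced by some initial model; nearness is |probability - boundary|.
    """
    if len(threat_probs) != len(labels):
        raise ValueError("threat_probs and labels must have the same length")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")

    # Only normal samples are candidates for re-assignment.
    normal_idx = [i for i, y in enumerate(labels) if y == NORMAL]
    # Rank them by how close their score lies to the boundary.
    normal_idx.sort(key=lambda i: abs(threat_probs[i] - boundary))
    n_flip = int(len(normal_idx) * fraction)

    new_labels = list(labels)  # leave the caller's labels untouched
    for i in normal_idx[:n_flip]:
        new_labels[i] = THREAT
    return new_labels


# Toy data: four normal samples and two threats.
probs = [0.05, 0.45, 0.10, 0.40, 0.90, 0.95]
labels = [NORMAL, NORMAL, NORMAL, NORMAL, THREAT, THREAT]
modified = relabel_near_boundary(probs, labels, fraction=0.5)
print(modified)  # [0, 1, 0, 1, 1, 1]
```

In the toy data, the two normal samples scored 0.45 and 0.40 are the ones nearest the assumed boundary, so they are moved to the threat class. A classifier trained on `modified` would then place its boundary more tightly around the remaining normal samples, which is the effect the abstract describes.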