
An XAI-based adversarial training approach for cyber-threat detection

Andresini G.; Appice A.; Malerba D.
2022-01-01

Abstract

adversarial samples. In addition, eXplainable Artificial Intelligence (XAI) has recently been investigated to improve the interpretability and explainability of black-box artificial systems such as deep neural models. In this study, we propose a methodology that combines adversarial training and XAI to increase the accuracy of deep neural models trained for cyber-threat detection. In particular, we use the FGSM technique to generate the adversarial samples for the adversarial training stage, and SHAP to produce local explanations of the decisions made during that stage. These local explanations are subsequently used to produce a new feature set that describes the effect of the original cyber-data characteristics on the classification of the examples processed during the adversarial training stage. Leveraging this XAI-based information, we apply a transfer-learning strategy, namely fine-tuning, to improve the predictive accuracy of the deep neural model. Experiments conducted on two benchmark cybersecurity datasets demonstrate the effectiveness of the proposed methodology in the multi-class classification of cyber-data.
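As a rough illustration of the pipeline outlined in the abstract, the sketch below chains FGSM-based adversarial sample generation, SHAP local explanations, and a final fine-tuning pass. It is a minimal sketch assuming a PyTorch classifier and the shap library; all names and hyper-parameters (fgsm_samples, epsilon, background, the use of DeepExplainer) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import shap

def fgsm_samples(model, x, y, epsilon=0.1):
    """Generate FGSM adversarial samples: x_adv = x + epsilon * sign(grad_x loss)."""
    loss_fn = nn.CrossEntropyLoss()
    x = x.clone().detach().requires_grad_(True)
    loss_fn(model(x), y).backward()
    return (x + epsilon * x.grad.sign()).detach()

def adversarial_training_step(model, optimizer, x, y, epsilon=0.1):
    """One adversarial-training step: fit the model on clean and FGSM samples."""
    loss_fn = nn.CrossEntropyLoss()
    x_adv = fgsm_samples(model, x, y, epsilon)
    optimizer.zero_grad()
    loss = loss_fn(model(x), y) + loss_fn(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()

def shap_feature_set(model, background, x):
    """SHAP local explanations of the adversarially trained model.

    The attributions describe the effect of each original feature on the
    model's decisions and serve as the new, XAI-based feature set.
    `background` is a small reference subset of the training data; the exact
    shape of the returned attributions depends on the shap version and the
    number of classes.
    """
    explainer = shap.DeepExplainer(model, background)
    return explainer.shap_values(x)

def fine_tune(model, x_xai, y, epochs=5, lr=1e-4):
    """Transfer-learning step (illustrative): continue training the pretrained
    model on the XAI-based feature representation at a low learning rate.
    `x_xai` is a tensor built from the SHAP attributions, assumed here to have
    the same dimensionality as the original input."""
    loss_fn = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(x_xai), y)
        loss.backward()
        optimizer.step()
    return model
```

In this reading, the SHAP attributions act as a bridge between the adversarial training stage and the fine-tuning stage; how the attributions are aggregated into the new feature set is a design choice of the paper and is only approximated here.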
2022
ISBN: 978-1-6654-6297-6

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11586/429193

Citations: Scopus 2