MAGICIAN: Malware classification Approach through Generation Image using a Conditional and wassersteIn generative Adversarial Network variants
Galantucci S.; Pirlo G.; Sarcinella L.
2025-01-01
Abstract
One of the main challenges of cybersecurity is the detection and classification of malware to prevent damage to the systems of both companies and private users. Identifying the specific type of malware is critical to taking targeted action. This study proposes a classification approach that generates synthetic images of malware using Conditional Generative Adversarial Networks (cGAN) and Wasserstein Generative Adversarial Networks (WGAN). On the Malimg dataset, which consists of 25 malware classes, the ResNet50 model achieves an overall accuracy of 91.4% and an F1-score of 90.8% for synthetic images generated with WGAN. Resizing and resampling were employed as preprocessing strategies to obtain images of size 48 × 48; resampling proved more effective. The proposed methodology thus allows malware to be classified quickly and efficiently, and it also allows unbalanced datasets to be enriched to aid classification performance.
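The abstract does not define the two preprocessing strategies, so the following is only a minimal sketch of one plausible reading: "resizing" as a spatial nearest-neighbour downscale of a byte-plot image, and "resampling" as picking evenly spaced bytes from the raw stream before reshaping. The function names, the 128-pixel row width, and the sample data are all hypothetical and are not taken from the paper.

```python
import math

def bytes_to_image(data, width):
    """Lay raw bytes out row by row as a grayscale image (one byte per pixel)."""
    height = math.ceil(len(data) / width)
    padded = data + bytes(height * width - len(data))  # zero-pad the last row
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]

def resize_nearest(image, size=48):
    """Assumed 'resizing': nearest-neighbour downscale of the 2-D image."""
    h, w = len(image), len(image[0])
    return [[image[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]

def resample_stream(data, size=48):
    """Assumed 'resampling': take size*size evenly spaced bytes, then reshape."""
    if not data:
        raise ValueError("empty input")
    n = size * size
    picked = [data[i * len(data) // n] for i in range(n)]
    return [picked[r * size:(r + 1) * size] for r in range(size)]

# Stand-in for a malware binary (hypothetical data, 16384 bytes).
sample = bytes(range(256)) * 64
resized = resize_nearest(bytes_to_image(sample, width=128))
resampled = resample_stream(sample)
print(len(resized), len(resized[0]), len(resampled), len(resampled[0]))
```

Both paths yield a 48 × 48 grid of values in the range 0 to 255, which is the input size the abstract reports; how the authors actually implemented either strategy would need to be checked against the full paper.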


