An Interpretable Adaptive Multiscale Attention Deep Neural Network for Tabular Data

Vincenzo Dentamaro (Conceptualization); Paolo Giglio; Donato Impedovo; Giuseppe Pirlo
2024-01-01

Abstract

Deep learning (DL) has proven to be a valuable tool for analyzing signals such as sounds and images, thanks to its ability to extract relevant patterns automatically and to its end-to-end training. When applied to structured tabular data, however, DL has shown performance limitations compared with shallow learning techniques. This work presents a novel technique for tabular data: the adaptive multiscale attention deep neural network architecture (also called excited attention). By exploiting parallel multilevel feature weighting, adaptive multiscale attention learns feature attention and thereby achieves high F1-scores on seven classification tasks (on small, medium, large, and very large datasets) and low mean absolute errors on four regression tasks of different sizes. In addition, adaptive multiscale attention provides four levels of explainability (i.e., comprehension of its learning process and therefore of its outcomes): 1) it calculates attention weights to determine which layers are most important for given classes; 2) it shows each feature's attention across all instances; 3) it exposes the learned feature attention for each class, so that feature attention and behavior can be explored for specific classes; and 4) it finds nonlinear correlations between co-behaving features, which can be used to reduce dataset dimensionality and improve interpretability. These interpretability levels, in turn, make adaptive multiscale attention a useful tool for feature ranking and feature selection.
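
The idea of parallel multilevel feature weighting followed by feature ranking can be illustrated with a small sketch. This is not the architecture from the paper: it assumes squeeze-and-excitation-style bottleneck gates of several widths applied in parallel to one tabular row, and it assumes their outputs are combined by simple averaging. The weights are random and untrained, and all names (`make_gate`, `gate_weights`, `multiscale_attention`, `rank_features`) are hypothetical, chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_gate(n_features, hidden, rng):
    """Random, untrained weights for one gate: n_features -> hidden -> n_features."""
    w1 = [[rng.gauss(0, 0.5) for _ in range(n_features)] for _ in range(hidden)]
    w2 = [[rng.gauss(0, 0.5) for _ in range(hidden)] for _ in range(n_features)]
    return w1, w2


def gate_weights(x, gate):
    """One attention weight in [0, 1] per feature, from a single bottleneck gate."""
    w1, w2 = gate
    # Bottleneck layer with ReLU activation.
    h = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in w1]
    # Expansion layer with sigmoid, giving one weight per input feature.
    return [sigmoid(sum(w * v for w, v in zip(row, h))) for row in w2]


def multiscale_attention(x, gates):
    """Run all gates in parallel and average their weights (an assumed combination rule)."""
    per_scale = [gate_weights(x, g) for g in gates]
    attn = [sum(s[j] for s in per_scale) / len(per_scale) for j in range(len(x))]
    weighted = [a * v for a, v in zip(attn, x)]  # re-weighted features
    return attn, weighted


def rank_features(rows, gates):
    """Rank feature indices by their mean attention across all instances."""
    n = len(rows[0])
    totals = [0.0] * n
    for x in rows:
        attn, _ = multiscale_attention(x, gates)
        for j in range(n):
            totals[j] += attn[j]
    mean_attn = [t / len(rows) for t in totals]
    order = sorted(range(n), key=lambda j: mean_attn[j], reverse=True)
    return order, mean_attn


rng = random.Random(0)
n_features = 6
# Three parallel gates with different bottleneck widths (the "scales" in this sketch).
gates = [make_gate(n_features, hidden, rng) for hidden in (1, 2, 4)]
# Synthetic data standing in for a tabular dataset.
rows = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(50)]

attn, weighted = multiscale_attention(rows[0], gates)
order, mean_attn = rank_features(rows, gates)
print("attention for first row:", [round(a, 3) for a in attn])
print("features ranked by mean attention:", order)
```

In a trained model the gate weights would be learned jointly with the predictor. Under that assumption, the per-instance weights in `attn` loosely correspond to the per-feature attention the abstract describes, and averaging them over instances is one simple way to obtain a feature ranking.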

Use this identifier to cite or link to this document: https://hdl.handle.net/11586/477060