GraphCLIP: Image-graph contrastive learning for multimodal artwork classification
Scaringi, Raffaele; Vessio, Gennaro; Castellano, Giovanna
2025-01-01
Abstract
We present GraphCLIP, a novel contrastive learning framework for multimodal artwork classification that integrates visual and contextual information to improve predictive accuracy and interpretability. Traditional computer vision methods often fall short in visual arts, where context is crucial. GraphCLIP leverages image data and a Knowledge Graph to extract features from both perspectives. Evaluated on the ArtGraph dataset, with over 100,000 artworks in 32 styles and 18 genres, GraphCLIP outperforms existing models in single-task (up to +8% in F1-score) and multi-task settings (up to +6%), demonstrating robustness even with unseen classes. Additionally, visual and contextual qualitative explanations enhance model transparency. The versatility of GraphCLIP extends beyond art classification: its methodology can be adapted to other domains where integrating diverse data types is essential. (The code is publicly available at: https://github.com/CILAB-ArtGraph/graphclip.git.)
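To make the image-graph contrastive idea concrete, the following is a minimal sketch of a CLIP-style symmetric contrastive loss between paired image and graph embeddings. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the loss, so the symmetric cross-entropy form, the function names, and the temperature value of 0.07 are all hypothetical choices made here. Embeddings are assumed to be non-zero vectors already produced by some image encoder and some graph encoder.

```python
import math


def l2_normalize(vector):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct column for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(image_emb, graph_emb, temperature=0.07):
    """Hypothetical CLIP-style loss: row i of image_emb is paired with row i of graph_emb.

    The temperature is an illustrative assumption, not a value from the paper.
    """
    images = [l2_normalize(v) for v in image_emb]
    graphs = [l2_normalize(v) for v in graph_emb]
    # Pairwise cosine similarities, scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(img, gra)) / temperature for gra in graphs]
        for img in images
    ]
    # Transpose to score graph-to-image retrieval as well.
    logits_t = [list(column) for column in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))


# Toy batch of two artworks with 2-dimensional embeddings.
aligned_loss = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                          [[1.0, 0.0], [0.0, 1.0]])
shuffled_loss = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                           [[0.0, 1.0], [1.0, 0.0]])
print(aligned_loss < shuffled_loss)
```

In this toy batch, the loss is close to zero when each image embedding matches its own graph embedding and large when the pairs are swapped, which is the behavior a contrastive objective of this kind is meant to encourage.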