Recognizing the Style, Genre, and Emotion of a Work of Art Through Visual and Knowledge Graph Embeddings

Castellano, Giovanna; Scaringi, Raffaele; Vessio, Gennaro
2023-01-01

Abstract

Recognizing attributes of unknown artworks relies on more than visual information: prior knowledge and emotional context can play a crucial role. Building an AI system that mimics this perception requires a multi-modal model integrating computer vision and contextual factors. In this paper, we propose a new model that uses vision transformers and graph attention networks to learn the visual and contextual features of new artworks and predict their style, genre, and emotion. Contextual features are acquired from an extended version of our ArtGraph knowledge graph, enriched with emotion information from the ArtEmis dataset. Our inductive end-to-end multi-task architecture enables real-time execution and resilience to changes in the graph. Combining computer vision and knowledge graphs could facilitate a deeper understanding of the fine arts, bridging the gap between computer science and the humanities. The new version of the graph is available at https://doi.org/10.5281/zenodo.8172374, and the code is available at https://github.com/CILAB-ArtGraph/multi-modal-end-to-end-art-classifier.
Year: 2023
ISBN: 978-3-031-47545-0
ISBN: 978-3-031-47546-7
Use this identifier to cite or link to this document: https://hdl.handle.net/11586/451500