Exploring the Synergy Between Vision-Language Pretraining and ChatGPT for Artwork Captioning: A Preliminary Study

Giovanna Castellano; Raffaele Scaringi; Gennaro Vessio
2024-01-01

Abstract

While AI techniques have enabled automated analysis and interpretation of visual content, generating meaningful captions for artworks presents unique challenges. These include understanding artistic intent, historical context, and complex visual elements. Despite recent developments in multi-modal techniques, there are still gaps in generating complete and accurate captions. This paper contributes by introducing a new dataset for artwork captioning generated using prompt engineering techniques and ChatGPT. We refined the captions with CLIPScore to filter out noise; then, we fine-tuned GIT-Base, resulting in visually accurate captions that surpass the ground truth. Enrichment of descriptions with predicted metadata improves their informativeness. Artwork captioning has implications for art appreciation, inclusivity, education, and cultural exchange, particularly for people with visual impairments or limited knowledge of art.
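
The CLIPScore filtering step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definition of CLIPScore, a rescaled cosine similarity between CLIP image and text embeddings clipped at zero with weight 2.5. The embeddings are taken as precomputed vectors, and the function names, the toy vectors, and the threshold value are hypothetical; the record does not state the cut-off or the implementation the authors used.

```python
import math
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def clip_score(image_emb: Sequence[float], text_emb: Sequence[float], w: float = 2.5) -> float:
    """CLIPScore as w * max(cos(image, text), 0), with w = 2.5 by convention."""
    return w * max(cosine(image_emb, text_emb), 0.0)


def filter_captions(
    candidates: List[Tuple[str, Sequence[float], Sequence[float]]],
    threshold: float,
) -> List[str]:
    """Keep captions whose score against their image meets the threshold.

    Each candidate is (caption, image_embedding, text_embedding).
    The threshold is an assumption made for illustration only.
    """
    return [
        caption
        for caption, image_emb, text_emb in candidates
        if clip_score(image_emb, text_emb) >= threshold
    ]


# Toy embeddings standing in for real CLIP outputs (hypothetical values).
image = [1.0, 0.0]
candidates = [
    ("a portrait of a woman in a blue dress", image, [0.8, 0.6]),
    ("a generic sentence unrelated to the painting", image, [0.0, 1.0]),
]
kept = filter_captions(candidates, threshold=1.5)
```

In practice the embeddings would come from a CLIP image encoder and text encoder, and the threshold would be tuned on the generated captions rather than fixed in advance.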
Year: 2024
ISBN: 978-3-031-51025-0; 978-3-031-51026-7

Use this identifier to cite or link to this document: https://hdl.handle.net/11586/457344