WikiFragments
Nicola Fanelli; Gennaro Vessio; Giovanna Castellano
2026-01-01
Abstract
WikiFragments is a multimodal dataset built from English Wikipedia. It pairs cleaned text paragraphs with related images (infobox and thumbnail images) taken from the same page. Each paragraph-image pair forms a multimodal fragment, an atomic knowledge unit suited to information retrieval and multimodal research.


