Many European film archives are involved in the digitization of 20th century historical paper documents. In the context of the IST project COLLATE three of them were interested in the semi-automatic annotation of censorship cards and their subsequent retrieval on the basis of both annotations and content. Processing censorship cards, which is the main subject of this paper, leads to a number of challenges for many document image analysis (DIA) systems. Problems arise due to the low layout quality and standard of such material, which introduces a considerable amount of noise in its description. The layout quality is often negatively affected by the presence of stamps, signatures, ink specks, manual annotations and so on that overlap those layout components involved in the understanding or annotation processes. In order to effectively reduce the presence and the effect of noise, we propose an improved version of the knowledge-based DIA system WISDOM++ allowing it to take full advantage of the use of colour information in all processing steps: namely, image segmentation, layout analysis, document image classification and understanding. Experiments have been conducted on a corpus of multi-format documents concerning rare historic film censorships provided by the three film archives involved in the COLLATE project.

Using color information to understand censorship cards of film archives

CECI, MICHELANGELO;MALERBA, Donato;
2007-01-01

Abstract

Many European film archives are involved in the digitization of 20th century historical paper documents. In the context of the IST project COLLATE three of them were interested in the semi-automatic annotation of censorship cards and their subsequent retrieval on the basis of both annotations and content. Processing censorship cards, which is the main subject of this paper, leads to a number of challenges for many document image analysis (DIA) systems. Problems arise due to the low layout quality and standard of such material, which introduces a considerable amount of noise in its description. The layout quality is often negatively affected by the presence of stamps, signatures, ink specks, manual annotations and so on that overlap those layout components involved in the understanding or annotation processes. In order to effectively reduce the presence and the effect of noise, we propose an improved version of the knowledge-based DIA system WISDOM++ allowing it to take full advantage of the use of colour information in all processing steps: namely, image segmentation, layout analysis, document image classification and understanding. Experiments have been conducted on a corpus of multi-format documents concerning rare historic film censorships provided by the three film archives involved in the COLLATE project.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11586/127633
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 3
  • ???jsp.display-item.citation.isi??? 3
social impact