Machine Learning methods for automatically processing historical documents: from paper acquisition to XML transformation

ESPOSITO, Floriana; MALERBA, Donato; SEMERARO, Giovanni; FERILLI, Stefano; BASILE, Teresa Maria; CECI, Michelangelo; DI MAURO, Nicola
2004

Abstract

One of the aims of the EU project COLLATE (IST-1999-20882, Collaboratory for annotation, indexing and retrieval of digitized historical archive material) is to design and implement a Web-based collaboratory for archives, scientists and end-users working with digitized cultural material. Since the originals of such material are often unique and scattered across various archives, making them widely accessible poses severe problems. One solution is to develop intelligent document processing tools that automatically transform printed documents into a web-accessible form such as XML. Here, we propose the use of a document processing system, WISDOM++, which relies heavily on machine learning techniques to perform this task, and we report promising results obtained in preliminary experiments.

Use this identifier to cite or link to this document: https://hdl.handle.net/11586/98839