An Innovative Multifunction System for Text Recognition of Digital Resources Reproducing Ancient Handwritten and Hand-Printed Artifacts

Barbuti, Nicola
2018-01-01

Abstract

The paper outlines ICRPad, a multifunction system for recognizing text in digital images that reproduce pages of ancient handwritten or hand-printed artifacts. The system was developed to propose an innovative approach to searching and retrieving information in historical digital libraries and archives. The approach applies to data humanities the fourth knowledge paradigm that underlies data science: algorithms are used to derive new research hypotheses from models inferred directly from large digital libraries. The system has two modules, ICR++ and ICR M-Evo (Multi-Evolution). ICR++ recognizes graphs or words through a training process based on segmentation of Regions Of Interest (ROI). M-Evo uses a graphic matching algorithm based on a shape contour recognition feature and requires no segmentation. The system was tested on case studies involving digital libraries that reproduce ancient artifacts. Experimental results showed both high accuracy of ICRPad in recognizing text and some interesting developments in the approach to digital humanities research through the application of the fourth knowledge paradigm.
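
This record does not describe the M-Evo algorithm itself, so the following is only a generic illustration of what matching shapes by their contours can look like. It is a minimal sketch, assuming a contour is already available as an ordered list of (x, y) points; the centroid-distance signature and every function name here are hypothetical choices made for illustration, not the method used in ICRPad.

```python
import math

def centroid_distance_signature(contour, samples=32):
    """Scale-normalized distances from the centroid to sampled contour points."""
    n = len(contour)
    cx = sum(x for x, _ in contour) / n
    cy = sum(y for _, y in contour) / n
    # Pick points by index so contours with different point counts are comparable.
    picked = [contour[(i * n) // samples] for i in range(samples)]
    dists = [math.hypot(x - cx, y - cy) for x, y in picked]
    mean = sum(dists) / samples
    if mean == 0:
        raise ValueError("degenerate contour: all points coincide")
    # Dividing by the mean distance removes the effect of overall size.
    return [d / mean for d in dists]

def signature_distance(sig_a, sig_b):
    """Smallest mean absolute difference over all cyclic shifts of sig_b.

    Trying every shift makes the score independent of where each contour starts.
    """
    m = len(sig_a)
    best = float("inf")
    for shift in range(m):
        total = sum(abs(sig_a[i] - sig_b[(i + shift) % m]) for i in range(m))
        best = min(best, total / m)
    return best

def square_contour(side, points_per_edge=16, origin=(0.0, 0.0)):
    """Toy contour: evenly spaced points around an axis-aligned square."""
    ox, oy = origin
    step = side / points_per_edge
    pts = []
    pts += [(ox + i * step, oy) for i in range(points_per_edge)]
    pts += [(ox + side, oy + i * step) for i in range(points_per_edge)]
    pts += [(ox + side - i * step, oy + side) for i in range(points_per_edge)]
    pts += [(ox, oy + side - i * step) for i in range(points_per_edge)]
    return pts

def circle_contour(radius, points=64):
    """Toy contour: evenly spaced points on a circle centred at the origin."""
    return [(radius * math.cos(2 * math.pi * i / points),
             radius * math.sin(2 * math.pi * i / points))
            for i in range(points)]

small_square = centroid_distance_signature(square_contour(10.0))
large_square = centroid_distance_signature(square_contour(25.0, origin=(3.0, 7.0)))
circle = centroid_distance_signature(circle_contour(5.0))

# Same shape at a different size and position scores close to zero;
# a different shape scores noticeably higher.
print(signature_distance(small_square, large_square))
print(signature_distance(small_square, circle))
```

In this toy setting a candidate shape would be accepted as a match when its score against a reference falls under a chosen threshold. Extracting contours from a page image is outside the scope of the sketch.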
Year: 2018
ISBN: 9781450364515
Use this identifier to cite or link to this document: https://hdl.handle.net/11586/255441