Detecting the reading order among the layout components of a document's page is fundamental to ensure effectiveness or even applicability of subsequent content extraction steps. While in single-column documents the reading flow can be straightforwardly determined, in more complex documents the task may become very hard. This paper proposes an automatic strategy for identifying the correct reading order of a document page's components based on abstract argumentation. The technique is unsupervised, and works on any kind of document based only on general assumptions about how humans behave when reading documents. Experimental results show that it is effective in more complex cases, and requires less background knowledge, than previous solutions that have been proposed in the literature.

Abstract argumentation for reading order detection

FERILLI, Stefano;REDAVID, DOMENICO;ESPOSITO, Floriana
2014-01-01

Abstract

Detecting the reading order among the layout components of a document's page is fundamental to ensure effectiveness or even applicability of subsequent content extraction steps. While in single-column documents the reading flow can be straightforwardly determined, in more complex documents the task may become very hard. This paper proposes an automatic strategy for identifying the correct reading order of a document page's components based on abstract argumentation. The technique is unsupervised, and works on any kind of document based only on general assumptions about how humans behave when reading documents. Experimental results show that it is effective in more complex cases, and requires less background knowledge, than previous solutions that have been proposed in the literature.
2014
978-1-4503-2949-1
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11586/37262
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 4
  • ???jsp.display-item.citation.isi??? ND
social impact