A novel approach to automatic classification of digitized office documents, based on the inductive generalization of their layout style, is presented. It is supported by the observation that for a number of printed documents it is possible to find a set of relevant and invariant layout features. These are geometrical characteristics automatically detected through a segmentation and layout analysis process. The learning step, in which significant examples of document classes are used to train the classification system, involves the novel idea of integrating parametric (numerical) and conceptual (symbolic) learning methods.

An experimental page layout recognition system for office document automatic classification: an integrated approach for inductive generalization

ESPOSITO, Floriana;MALERBA, Donato;SEMERARO, Giovanni;
1990-01-01

Abstract

A novel approach to automatic classification of digitized office documents, based on the inductive generalization of their layout style, is presented. It is supported by the observation that for a number of printed documents it is possible to find a set of relevant and invariant layout features. These are geometrical characteristics automatically detected through a segmentation and layout analysis process. The learning step, in which significant examples of document classes are used to train the classification system, involves the novel idea of integrating parametric (numerical) and conceptual (symbolic) learning methods.
1990
0-8186-2062-5
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11586/116231
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 38
  • ???jsp.display-item.citation.isi??? ND
social impact