This work presents the application of INTHELEX, an incremental first-order logic learning system, to learn rules for the automatic identification of a wide range of significant document classes and their related components. Specifically, the material includes multi-format cultural heritage documents concerning European films from the 1920s and 1930s, provided by the EU project COLLATE. Incrementality plays a key role when the set of documents is continuously augmented. To ensure that there is no performance loss with respect to classical one-step systems, a comparison with Progol was carried out. Experimental results show that the proposed approach is a viable solution, in terms of both its performance and its effectiveness in the document processing domain.
Incremental Induction of Classification Rules for Cultural Heritage Documents
FERILLI, Stefano;BASILE, TERESA MARIA;DI MAURO, NICOLA;ESPOSITO, Floriana
2004-01-01