Mining Information Extraction Models for HmtDB annotation

MALERBA, Donato; ATTIMONELLI, Marcella
2006

Abstract

Advances in genome sequencing techniques have given rise to an overwhelming increase in the literature on discovered genes, proteins, and their roles in biological processes. However, the biomedical literature remains a largely unexploited source of biological information. Information Extraction (IE) techniques are necessary to map this information into structured representations that allow facts relating domain-relevant entities to be recognized automatically. In this paper, we present a framework that supports biologists in the task of automatically extracting information from texts. The framework integrates a data mining module that discovers extraction rules from a set of manually labelled texts. The resulting extraction models are then applied automatically to unseen texts. We report an application to a real-world dataset composed of publications selected to support biologists in the annotation of the HmtDB database.
ISBN: 0-7695-2702-7

Use this identifier to cite or link to this document: http://hdl.handle.net/11586/137048