Induction of terminological cluster trees: Preliminaries, model, method and perspectives

Rizzo, Giuseppe; D'Amato, Claudia; Fanizzi, Nicola; Esposito, Floriana

2016-01-01

Abstract

In this paper, we tackle the problem of clustering individual resources in the context of the Web of Data, which is characterized by a huge amount of data published in a standard data model with well-defined semantics based on Web ontologies. Clustering methods offer an effective way to support many complex related activities, such as ontology construction, debugging and evolution, while taking into account the inherent incompleteness of the representation. Web ontologies already encode a hierarchical organization of the resources through the subsumption hierarchy of the classes, which may be expressed explicitly, with proper subsumption axioms, or must be detected indirectly, by reasoning on the available axioms that define the classes (classification). However, such classes are frequently sparsely populated, as the hierarchy often reflects a view of the knowledge engineer that predates the actual introduction of assertions involving the individual resources. As a result, very general classes are often loosely populated, and the same may happen to specific subclasses, making it more difficult to check the types of a resource (instance checking), even through reasoning services. Among the large number of algorithms proposed in the Machine Learning literature, we propose a clustering method that is able to organize groups of resources hierarchically. Specifically, we introduce a conceptual clustering approach that combines a distance measure between individuals in a knowledge base with a divide-and-conquer strategy, intended to elicit ex post the underlying hierarchy based on the actual distribution of the instances.
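The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it names: a distance between individuals driving a divide-and-conquer construction of a cluster tree. Everything in it is an assumption made for illustration, not the authors' method: the toy knowledge base, the encoding of membership as +1 (known member), -1 (known non-member) and 0 (unknown, reflecting incompleteness), the integer disagreement score, the medoid-based split criterion, and the names `distance`, `medoid` and `build_tree`.

```python
from typing import Dict, List

# Hypothetical encoding: for each individual, the membership w.r.t. a few
# candidate concepts: +1 known member, -1 known non-member, 0 unknown.
Projection = Dict[str, int]


def distance(a: Projection, b: Projection, concepts: List[str]) -> int:
    """Unnormalised disagreement score between two individuals.

    A contradiction (+1 vs -1) costs 2; a known vs unknown value costs 1.
    """
    score = 0
    for c in concepts:
        if a[c] != b[c]:
            score += 1 if 0 in (a[c], b[c]) else 2
    return score


def medoid(group: List[str], data: Dict[str, Projection], concepts: List[str]) -> str:
    """Individual with the smallest total distance to the rest of its group."""
    return min(
        group,
        key=lambda i: sum(distance(data[i], data[j], concepts) for j in group),
    )


def build_tree(individuals, data, concepts, min_size=2):
    """Recursively split individuals on the concept that best separates them."""
    if len(individuals) <= min_size or not concepts:
        return {"leaf": sorted(individuals)}
    best = None
    for c in concepts:
        inside = [i for i in individuals if data[i][c] == 1]
        outside = [i for i in individuals if data[i][c] != 1]
        if not inside or not outside:
            continue  # this concept does not split the current group
        # Separation = distance between the medoids of the two candidate groups.
        sep = distance(
            data[medoid(inside, data, concepts)],
            data[medoid(outside, data, concepts)],
            concepts,
        )
        if best is None or sep > best[0]:
            best = (sep, c, inside, outside)
    if best is None:
        return {"leaf": sorted(individuals)}
    _, concept, inside, outside = best
    remaining = [c for c in concepts if c != concept]
    return {
        "concept": concept,
        "in": build_tree(inside, data, remaining, min_size),
        "out": build_tree(outside, data, remaining, min_size),
    }


# Toy data, invented for this example only.
data = {
    "a": {"Person": 1, "Student": 1, "Course": -1},
    "b": {"Person": 1, "Student": 0, "Course": -1},
    "c": {"Person": 1, "Student": -1, "Course": -1},
    "d": {"Person": -1, "Student": -1, "Course": 1},
    "e": {"Person": -1, "Student": -1, "Course": 1},
}
tree = build_tree(list(data), data, ["Person", "Student", "Course"])
print(tree)
```

In this sketch each inner node is labelled with the concept used for the split, so the resulting tree can be read as a hierarchy induced from the actual distribution of the instances rather than from the predefined class hierarchy. A realistic setting would obtain the membership values from a reasoner and would need a more careful distance and stopping criterion.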

Year

2016

Use this identifier to cite or link to this document: https://hdl.handle.net/11586/185980
