Similarity-based clustering for user profiling

Giovanna Castellano; Anna Maria Fanelli; Corrado Mencar
2007

Abstract

User profiling is a fundamental task in Web personalization. Fuzzy clustering is a valid approach for deriving user profiles, since it captures similar user interests from the Web usage data available in log files. Fuzzy clustering often assumes that the data lie in a Euclidean space; however, clustering based on Euclidean distance can yield user representations that fail to capture the semantic information contained in the original Web usage data. In this paper, we propose a different way of expressing similarity between Web users, based on evaluating the similarity between fuzzy sets. The proposed measure is employed in a relational fuzzy clustering algorithm to discover clusters embedded in the Web usage data and to derive profiles that model actual user preferences. We report an application example on usage data extracted from the log files of a sample Web site, and a comparison with the results obtained using the cosine measure demonstrates the effectiveness of the proposed similarity measure.
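
To make the contrast between a fuzzy-set similarity and the cosine measure concrete, here is a minimal Python sketch. The abstract does not state the paper's formula, so the Jaccard-style ratio used below (sum of minima over sum of maxima) is an assumed stand-in and not necessarily the authors' measure. The function names, the sample membership vectors, and the `1 - similarity` conversion into a dissimilarity relation are likewise illustrative assumptions.

```python
import math


def fuzzy_jaccard(a, b):
    """Jaccard-style similarity between two fuzzy sets.

    Each user is modeled (by assumption) as a fuzzy set over Web pages:
    a[i] in [0, 1] is the degree of interest of the user in page i.
    """
    if len(a) != len(b):
        raise ValueError("membership vectors must have the same length")
    intersection = sum(min(x, y) for x, y in zip(a, b))
    union = sum(max(x, y) for x, y in zip(a, b))
    # Two empty fuzzy sets are treated as identical.
    return intersection / union if union > 0 else 1.0


def cosine(a, b):
    """Cosine measure between the same vectors, for comparison."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def dissimilarity_relation(users, similarity):
    """Pairwise dissimilarity matrix, the usual input of a relational
    clustering algorithm (assumed here to be 1 - similarity)."""
    return [[1.0 - similarity(u, v) for v in users] for u in users]


if __name__ == "__main__":
    # Hypothetical interest degrees of three users over three pages.
    users = [
        [1.0, 0.5, 0.0],
        [0.5, 0.5, 0.0],
        [0.0, 0.0, 1.0],
    ]
    for name, sim in (("fuzzy Jaccard", fuzzy_jaccard), ("cosine", cosine)):
        print(name)
        for row in dissimilarity_relation(users, sim):
            print("  ", [round(d, 3) for d in row])
```

A relational clustering algorithm works only on the pairwise matrix, never on the coordinates themselves, which is why the similarity measure can be swapped without touching the rest of the pipeline.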
ISBN: 978-076953028-4

Use this identifier to cite or link to this document: http://hdl.handle.net/11586/139251

Citations
  • Scopus 32
  • ISI 16