Variational Bayes estimation of hierarchical Dirichlet-multinomial mixtures for text clustering

Massimo Bilancia; Fabio Manca; Gianvito Pio
2023-01-01

Abstract

In this paper, we formulate a hierarchical Bayesian version of the Mixture of Unigrams model for text clustering and approach its posterior inference through variational inference. We compute the explicit expression of the variational objective function for our hierarchical model under a mean-field approximation. We then derive the update equations of a suitable algorithm based on coordinate ascent to find local maxima of the variational target, and estimate the model parameters through the optimized variational hyperparameters. The advantages of variational algorithms over traditional Markov Chain Monte Carlo methods based on iterative posterior sampling are also discussed in detail.
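
The coordinate-ascent scheme summarised above can be illustrated with a minimal sketch. It assumes a standard conjugate specification: symmetric Dirichlet priors, with hypothetical hyperparameters `alpha` and `eta`, on the mixing weights and on each cluster's word distribution, together with a fully factorised (mean-field) variational family. The function and variable names are illustrative, and the updates are the textbook mean-field ones for this kind of model, not necessarily the exact equations derived in the paper.

```python
import math
import random


def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:  # psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def cavi_mixture_of_unigrams(counts, n_clusters, alpha=1.0, eta=0.1,
                             n_iter=50, seed=0):
    """Mean-field coordinate ascent for a Dirichlet-multinomial mixture.

    counts: list of D lists of length V (word counts per document).
    alpha, eta: assumed symmetric Dirichlet prior hyperparameters.
    Returns (phi, gamma, lam): responsibilities and Dirichlet parameters.
    """
    rng = random.Random(seed)
    n_docs, vocab = len(counts), len(counts[0])
    # Random initial responsibilities phi[d][k], each row summing to one.
    phi = []
    for _ in range(n_docs):
        row = [rng.random() + 1e-3 for _ in range(n_clusters)]
        total = sum(row)
        phi.append([r / total for r in row])
    for _ in range(n_iter):
        # Update q(pi) = Dirichlet(gamma) and q(beta_k) = Dirichlet(lam[k]).
        gamma = [alpha + sum(phi[d][k] for d in range(n_docs))
                 for k in range(n_clusters)]
        lam = [[eta + sum(phi[d][k] * counts[d][v] for d in range(n_docs))
                for v in range(vocab)]
               for k in range(n_clusters)]
        # Expected log parameters under the current Dirichlet factors.
        gamma_total = sum(gamma)
        e_log_pi = [digamma(g) - digamma(gamma_total) for g in gamma]
        e_log_beta = []
        for row in lam:
            row_total = sum(row)
            e_log_beta.append([digamma(l) - digamma(row_total) for l in row])
        # Update q(z_d) = Categorical(phi[d]), normalised via log-sum-exp.
        for d in range(n_docs):
            scores = [e_log_pi[k] + sum(counts[d][v] * e_log_beta[k][v]
                                        for v in range(vocab))
                      for k in range(n_clusters)]
            top = max(scores)
            weights = [math.exp(s - top) for s in scores]
            total = sum(weights)
            phi[d] = [w / total for w in weights]
    return phi, gamma, lam
```

Under these assumptions, point estimates can be read off the optimized variational hyperparameters as Dirichlet posterior means, for example `gamma[k] / sum(gamma)` for the mixing weight of cluster `k` and `lam[k][v] / sum(lam[k])` for the probability of word `v` in cluster `k`. A fixed iteration count stands in for a proper convergence check on the variational objective, which is omitted here for brevity.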

Files in this item:

s00180-023-01350-8.pdf (not available)
Description: Journal article
Type: Publisher's version
License: Publisher's copyright
Size: 2.15 MB
Format: Adobe PDF

s00180-023-01350-8-2.pdf (open access)
Description: Journal article
Type: Pre-print
License: Creative Commons
Size: 3.91 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11586/429445
Citations
  • PubMed Central: not available
  • Scopus: 4
  • Web of Science (ISI): 2