Uncertainty Estimation of Raters’ Performance and Ground Truth Through a Bayesian Extension of STAPLE
Davide Cazzorla;Corrado Mencar
2024-01-01
Abstract
We tackle the problem of merging information from several imperfect raters who label items when no ground truth is available. STAPLE, an algorithm based on Expectation-Maximization, addresses this problem by estimating the ground truth and assessing the quality of the raters. However, STAPLE returns point estimates, which hide the uncertainty about the true values. In this paper, we introduce a fully Bayesian extension of STAPLE that provides posterior distributions of the ground truth and of the raters’ performance. Sampling relies on the Gibbs method to keep estimation fast even on large data. Experimental results show that the Bayesian extension uncovers some potential bias in the original STAPLE and offers a representation of uncertainty that helps the decision maker assess the reliability of the estimates.
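To make the idea of replacing point estimates with posterior samples concrete, the following is a minimal sketch of a Gibbs sampler for a binary-label rater model. It is not the authors' implementation: the abstract does not specify the model, so the binary setting, the Beta priors, the majority-vote initialisation, and every name used here (`gibbs_staple`, `D`, `a`, `b`, sensitivity `p`, specificity `q`, prevalence `pi`) are assumptions made for illustration only.

```python
import numpy as np


def gibbs_staple(D, n_iter=500, burn_in=100, a=2.0, b=1.0, seed=0):
    """Illustrative Gibbs sampler for binary labels from several raters.

    D is an (items x raters) 0/1 matrix. Assumed model (not taken from the
    paper): hidden truth T_i ~ Bernoulli(pi); rater j reports 1 with
    probability p_j when T_i = 1 and reports 0 with probability q_j when
    T_i = 0; p_j and q_j have Beta(a, b) priors and pi has a Beta(1, 1) prior.
    """
    rng = np.random.default_rng(seed)
    D = np.asarray(D, dtype=float)
    n_items, n_raters = D.shape

    # Majority-vote start; together with a > b this keeps the chain away
    # from the mirrored solution in which all labels are flipped.
    T = (D.mean(axis=1) >= 0.5).astype(float)

    keep = n_iter - burn_in
    T_sum = np.zeros(n_items)
    p_draws = np.empty((keep, n_raters))
    q_draws = np.empty((keep, n_raters))

    for it in range(n_iter):
        n1 = T.sum()
        n0 = n_items - n1

        # Conjugate Beta updates for sensitivity and specificity.
        hits = T @ D                    # reports of 1 on items with T = 1
        p = rng.beta(a + hits, b + n1 - hits)
        rejections = (1 - T) @ (1 - D)  # reports of 0 on items with T = 0
        q = rng.beta(a + rejections, b + n0 - rejections)

        # Conjugate Beta update for the prevalence of label 1.
        pi = rng.beta(1 + n1, 1 + n0)

        # Posterior log-odds of T_i = 1 given the current p, q and pi.
        log_odds = (np.log(pi) - np.log1p(-pi)
                    + D @ (np.log(p) - np.log1p(-q))
                    + (1 - D) @ (np.log1p(-p) - np.log(q)))
        prob1 = 0.5 * (1.0 + np.tanh(0.5 * log_odds))  # stable logistic
        T = (rng.random(n_items) < prob1).astype(float)

        if it >= burn_in:
            k = it - burn_in
            T_sum += T
            p_draws[k] = p
            q_draws[k] = q

    # Posterior probability that each item is 1, plus the retained draws.
    return T_sum / keep, p_draws, q_draws
```

Under these assumptions, the first returned array approximates the posterior probability of each item's true label, while quantiles of `p_draws` and `q_draws` (for example via `np.quantile`) give credible intervals for each rater's performance, which is the kind of uncertainty a single EM point estimate does not expose.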


