The influence of frame length on speaker identification performance
IMPEDOVO, DONATO
2007-01-01
Abstract
Speaker recognition/identification remains a challenge in the implementation of security applications. Performance usually degrades for high-pitched speakers, and also whenever a speaker's average pitch varies significantly between enrolment and testing. This paper presents a study of how the frame length used to extract features from the speech signal affects speaker identification performance. Tests were carried out on a text-dependent database. Results show that a combination of different frame sizes between the training and recognition phases can mitigate this degradation: a reduction of between 40% and 65% in false rejections was generally observed. © 2007 IEEE.
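To make the notion of frame length concrete, the following is a minimal sketch of how a speech signal might be split into fixed-length analysis frames, and of how many pitch periods fall inside one frame for a given average pitch. The abstract does not state the frame lengths, hop size, sampling rate, or feature type used in the study, so the values below (20 ms and 30 ms frames, a 10 ms hop, 16 kHz sampling) and the function names `frame_signal` and `pitch_periods_per_frame` are illustrative assumptions, not the paper's method.

```python
import math


def frame_signal(samples, sample_rate, frame_ms, hop_ms):
    """Split a sequence of samples into overlapping fixed-length frames.

    A trailing partial frame is dropped. All parameter values used in the
    example below are illustrative assumptions, not taken from the paper.
    """
    frame_len = int(round(sample_rate * frame_ms / 1000))
    hop_len = int(round(sample_rate * hop_ms / 1000))
    if frame_len <= 0 or hop_len <= 0:
        raise ValueError("frame and hop lengths must be positive")
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frames.append(samples[start:start + frame_len])
    return frames


def pitch_periods_per_frame(frame_ms, pitch_hz):
    """Number of pitch periods covered by one frame (frame duration x F0)."""
    return frame_ms / 1000 * pitch_hz


# Hypothetical input: one second of a 250 Hz tone sampled at 16 kHz,
# standing in for voiced speech from a high-pitched speaker.
SAMPLE_RATE = 16000
signal = [math.sin(2 * math.pi * 250 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]

# Assumed frame lengths for enrolment and testing; the hop stays at 10 ms.
train_frames = frame_signal(signal, SAMPLE_RATE, frame_ms=30, hop_ms=10)
test_frames = frame_signal(signal, SAMPLE_RATE, frame_ms=20, hop_ms=10)

print(len(train_frames), len(train_frames[0]))
print(len(test_frames), len(test_frames[0]))
print(pitch_periods_per_frame(30, 250), pitch_periods_per_frame(30, 100))
```

With a fixed frame duration, a frame spans more pitch periods as the average pitch rises, which is one way to see why the choice of frame length interacts with speaker pitch. The sketch only illustrates framing; it does not reproduce the feature extraction or the identification system evaluated in the paper.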