A collaborative AI dataset creation for speech therapies

Artificial Intelligence (AI) and Human-Computer Interaction are converging in modern systems, leading to a slow but steadily increasing synergy between the two fields. Text prediction and voice recognition are the best-known applications of AI techniques. Common examples include virtual keyboards that suggest the next word to be used in a text, systems that recognise people’s sentences, voice commands in voice assistants such as Google Home and Alexa, and many others. However, things get more difficult in contexts that require a specific kind of recognition outside the “usual” models, one that is the exception rather than the rule, as in the recognition of right and wrong phonemes in speech therapy. These difficulties often stem from the AI model’s limited ability to generalise, caused by the small datasets used for training. In this position paper, we address this issue and discuss the role that the culture of participation might play in supporting dataset creation for speech therapy. Our aim is to investigate how AI models’ accuracy can improve by combining people’s support in creating the phoneme samples with the validation of those samples by the speech therapist.

Barletta V. S.;Cassano F.;Pagano A.;Piccinno A.

2022-01-01

Year

2022

Use this identifier to cite or link to this document: https://hdl.handle.net/11586/407013
