A collaborative AI dataset creation for speech therapies
Barletta V. S.; Cassano F.; Pagano A.; Piccinno A.
2022-01-01
Abstract
Artificial Intelligence (AI) and Human-Computer Interaction are growing ever closer in modern systems, leading to a slow but steadily increasing synergy between the two fields. Text prediction and voice recognition are the best-known applications of AI techniques. Common examples include virtual keyboards that suggest the next word to be used in a text, systems that recognise spoken sentences, voice commands in voice assistants such as Google Home and Alexa, and many others. However, things become more difficult in contexts where the required recognition falls outside the "usual" models and is the exception rather than the rule, as in the recognition of correct and incorrect phonemes in speech therapy. These difficulties often stem from the limited generalisation ability of AI models trained on small datasets. In this position paper, we address this issue and discuss the role that the culture of participation might play in supporting dataset creation for speech therapy. Our aim is to investigate how AI models' accuracy can be improved by combining people's contributions to the creation of phoneme samples with the validation of those samples by the speech therapist.