Exploring the Expressive Power of Large Language Models in Neuro-Fuzzy System Explainability: A Study on EEG-Based Seizure Detection
Gabriella Casalino; Giovanna Castellano; Daniele Margherita; Alberto Gaetano Valerio; Gennaro Vessio; Gianluca Zaza
2025-01-01
Abstract
In this work, we focus on integrating LLMs into a neuro-symbolic framework to enhance the quality of explanations associated with IF-THEN rules generated by neuro-fuzzy inference systems. To address the challenge posed by the lack of a reference ground truth in explanation tasks, we propose a quantitative evaluation based on linguistic and semantic quality metrics, aiming to assess the clarity, coherence, and relevance of the generated text. We systematically compare a selection of LLMs varying in size and architectural family, and investigate the impact of different prompting strategies, including zero-shot, persona-based, and fact-checking approaches, on the resulting explanations. The proposed framework is applied to a real-world case study on EEG-based seizure detection, illustrating its potential in high-stakes medical contexts where transparency and reliability are critical. The findings show that quantitative metrics alone are insufficient to capture the true quality of explanations, highlighting the critical role of both model selection and prompt design in generating effective, trustworthy, and human-aligned explanations.
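
The abstract refers to three prompting strategies (zero-shot, persona-based, and fact-checking) applied to IF-THEN rules produced by a neuro-fuzzy inference system. The following Python sketch is purely illustrative of how such prompts could be assembled for a hypothetical EEG seizure rule; the example rule, prompt wording, and function names are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only (not the paper's code): building prompts for the
# three strategies named in the abstract from a hypothetical fuzzy rule.
from textwrap import dedent

# Hypothetical IF-THEN rule, as a neuro-fuzzy classifier for EEG-based
# seizure detection might express it.
rule = ("IF spectral_power_delta IS high AND line_length IS high "
        "THEN seizure_risk IS high")

def zero_shot_prompt(rule: str) -> str:
    """Plain instruction with no added context or role."""
    return f"Explain the following fuzzy rule in clear, clinically meaningful language:\n{rule}"

def persona_prompt(rule: str) -> str:
    """Same task, but the model is asked to adopt an expert persona."""
    return dedent(f"""\
        You are a clinical neurophysiologist explaining decision rules to colleagues.
        Explain the following fuzzy rule in clear, clinically meaningful language:
        {rule}""")

def fact_checking_prompt(rule: str, draft_explanation: str) -> str:
    """Second pass: ask the model to verify a draft explanation against the rule."""
    return dedent(f"""\
        Rule: {rule}
        Draft explanation: {draft_explanation}
        Check the draft against the rule. Correct any statement that is not
        supported by the rule and return the revised explanation.""")

if __name__ == "__main__":
    print(zero_shot_prompt(rule))
    print(persona_prompt(rule))
    print(fact_checking_prompt(rule, "High delta-band power together with a high "
                                     "line-length value indicates elevated seizure risk."))
```

In a pipeline like the one described, the text returned by the LLM for each strategy would then be scored with linguistic and semantic quality metrics and compared across models and prompts.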


