Raman spectroscopy shows great potential as a diagnostic tool for thyroid cancer due to its ability to detect biochemical changes during cancer development. This technique is particularly valuable because it is non-invasive and label/dye-free. Compared to molecular tests, Raman spectroscopy analyses can more effectively discriminate malignant features, thus reducing unnecessary surgeries. However, one major hurdle to using Raman spectroscopy as a diagnostic tool is the identification of significant patterns and peaks. In this study, we propose a Machine Learning procedure to discriminate healthy/benign versus malignant nodules that produces interpretable results. We collect Raman spectra obtained from histological samples, select a set of peaks with a data-driven and label independent approach and train the algorithms with the relative prominence of the peaks in the selected set. The performance of the considered models, quantified by area under the Receiver Operating Characteristic curve, exceeds 0.9. To enhance the interpretability of the results, we employ eXplainable Artificial Intelligence and compute the contribution of each feature to the prediction of each sample.
An eXplainable Artificial Intelligence analysis of Raman spectra for thyroid cancer diagnosis
Bellantuono, Loredana;Tommasi, Raffaele;Pantaleo, Ester;Amoroso, Nicola;Monaco, Alfonso;Tangaro, Sabina;Bellotti, Roberto
2023-01-01
Abstract
Raman spectroscopy shows great potential as a diagnostic tool for thyroid cancer due to its ability to detect biochemical changes during cancer development. This technique is particularly valuable because it is non-invasive and label/dye-free. Compared to molecular tests, Raman spectroscopy analyses can more effectively discriminate malignant features, thus reducing unnecessary surgeries. However, one major hurdle to using Raman spectroscopy as a diagnostic tool is the identification of significant patterns and peaks. In this study, we propose a Machine Learning procedure to discriminate healthy/benign versus malignant nodules that produces interpretable results. We collect Raman spectra obtained from histological samples, select a set of peaks with a data-driven and label independent approach and train the algorithms with the relative prominence of the peaks in the selected set. The performance of the considered models, quantified by area under the Receiver Operating Characteristic curve, exceeds 0.9. To enhance the interpretability of the results, we employ eXplainable Artificial Intelligence and compute the contribution of each feature to the prediction of each sample.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.