[Submitted on 2 Dec 2024]
Unveiling Interpretability in Self-Supervised Speech Representations for Parkinson’s Diagnosis, by David Gimeno-Gómez, Catarina Botelho, Anna Pompili, Alberto Abad, and Carlos-D. Martínez-Hinarejos
Abstract: Recent works in pathological speech analysis have increasingly relied on powerful self-supervised speech representations, leading to promising results. However, the complex, black-box nature of these embeddings and the limited research on their interpretability significantly restrict their adoption for clinical diagnosis. To address this gap, we propose a novel, interpretable framework specifically designed to support Parkinson’s Disease (PD) diagnosis. Through the design of simple yet effective cross-attention mechanisms for both embedding- and temporal-level analysis, the proposed framework offers interpretability from two distinct but complementary perspectives. Experimental findings across five well-established speech benchmarks for PD detection demonstrate the framework’s capability to identify meaningful speech patterns within self-supervised representations for a wide range of assessment tasks. Fine-grained temporal analyses further underscore its potential to enhance the interpretability of deep-learning pathological speech models, paving the way for the development of more transparent, trustworthy, and clinically applicable computer-assisted diagnosis systems in this domain. Moreover, in terms of classification accuracy, our method achieves results competitive with state-of-the-art approaches, while also demonstrating robustness in cross-lingual scenarios when applied to spontaneous speech production.
Submission history
From: David Gimeno-Gómez
[v1]
Mon, 2 Dec 2024 22:23:43 UTC (1,832 KB)