Tackling Cognitive Impairment Detection from Speech: A submission to the PROCESS Challenge

Catarina Botelho, David Gimeno-Gómez, Francisco Teixeira, John Mendonça, Patrícia Pereira, Diogo A. P. Nunes, Thomas Rolland, Anna Pompili, Rubén Solera-Ureña, Maria Ponte, David Martins de Matos, Carlos-D. Martínez-Hinarejos, Isabel Trancoso, Alberto Abad

TL;DR

This work presents a submission to the PROCESS Challenge 2024, which aims to detect cognitive decline from spontaneous speech across three elicitation tasks. It combines knowledge-based acoustic and textual features, LLM-derived macrolinguistic descriptors, pause-based biomarkers, and multiple neural representations. These feature sets are paired with diverse classifiers, and a late-fusion ensemble exploits their complementary information. The two best-performing ensembles (each built from six single systems) improve dementia-class performance by integrating Longformer CTD representations, ECAPA-TDNN/TRILLsson embeddings, and pause and macrolinguistic descriptor features. The study highlights the value of multimodal, task-diverse representations for early dementia detection, while acknowledging dataset-imposed limitations that motivate validation on larger, demographically richer corpora.

Abstract

This work describes our group's submission to the PROCESS Challenge 2024, whose goal is to assess cognitive decline through spontaneous speech, using three guided clinical tasks. This joint effort followed a holistic approach, encompassing knowledge-based acoustic and text-based feature sets, LLM-based macrolinguistic descriptors, pause-based acoustic biomarkers, and multiple neural representations (e.g., Longformer, ECAPA-TDNN, and TRILLsson embeddings). Combining these feature sets with different classifiers resulted in a large pool of models, from which we selected those that provided the best balance between train, development, and individual class performance. Our results show that our best-performing systems correspond to combinations of models that are complementary to each other, relying on acoustic and textual information from all three clinical tasks.
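
As a rough illustration of how single systems can be combined by late fusion, the sketch below averages the per-class probabilities produced by several systems and picks the highest-scoring class. The fusion rule (an unweighted mean), the class labels, the function names, and the toy scores are all assumptions made for this example; the summary above does not specify which fusion rule the submission uses.

```python
from typing import List, Sequence

# Hypothetical class labels, used only for illustration.
CLASSES = ["HC", "MCI", "Dementia"]


def late_fusion(system_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class probabilities from several single systems.

    Assumption: an unweighted mean over systems. Other rules (weighted
    averaging, majority voting) are equally plausible late-fusion choices.
    """
    if not system_probs:
        raise ValueError("need at least one system")
    n_classes = len(system_probs[0])
    if any(len(p) != n_classes for p in system_probs):
        raise ValueError("all systems must score the same classes")
    n_systems = len(system_probs)
    return [sum(p[c] for p in system_probs) / n_systems for c in range(n_classes)]


def predict(system_probs: Sequence[Sequence[float]]) -> str:
    """Return the label with the highest fused probability."""
    fused = late_fusion(system_probs)
    best = max(range(len(fused)), key=fused.__getitem__)
    return CLASSES[best]


# Toy posteriors from three hypothetical single systems for one speaker.
example = [
    [0.6, 0.3, 0.1],  # e.g. a system trained on acoustic features
    [0.2, 0.3, 0.5],  # e.g. a system trained on text features
    [0.1, 0.3, 0.6],  # e.g. a system trained on pause features
]

if __name__ == "__main__":
    print(late_fusion(example))
    print(predict(example))
```

In this toy case the first system favours the first class, but the two other systems outweigh it once the scores are averaged, which is the kind of complementarity an ensemble is meant to exploit.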

Paper Structure

This paper contains 11 sections, 5 figures, and 4 tables.

Figures (5)

  • Figure 1: Overall schema of our proposed method.
  • Figure 2: Distribution of the MMSE scores per diagnostic class.
  • Figure 3: Single system performance, in terms of macro F1 (a) and F1 on the dementia class (b).
  • Figure 4: Model ensemble performance, in terms of macro F1.
  • Figure 5: Frequency of single classifiers appearing in the top best-performing ensemble model combinations, categorized by the type of features and tasks used during their training.