Integrating Pause Information with Word Embeddings in Language Models for Alzheimer's Disease Detection from Spontaneous Speech

Yu Pu; Wei-Qiang Zhang

TL;DR

This work tackles early Alzheimer's disease (AD) detection from spontaneous speech by using pauses as temporal cues. It introduces a pause-embedding mechanism that encodes word durations and inter-word pauses, and integrates these embeddings into a BERT-based language model. A two-stage training regime pretrains the pause representations on GigaSpeech and then fine-tunes on ADReSSo for the detection task, achieving an accuracy of 83.1% on the ADReSSo test set. The results demonstrate that pause information is a valuable non-invasive biomarker for AD and suggest broader applicability to neurodegenerative disease detection.
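As a rough illustration of the temporal cues mentioned above, the sketch below derives word durations and inter-word pauses from word-level timestamps. The `TimedWord` type, the `durations_and_pauses` helper, and the sample timings are hypothetical and are not taken from the paper; the sketch only assumes that a speech recognizer supplies a start and end time for each word.

```python
from typing import List, NamedTuple, Tuple


class TimedWord(NamedTuple):
    """Hypothetical container for one recognized word and its timestamps."""
    word: str
    start: float  # seconds
    end: float    # seconds


def durations_and_pauses(words: List[TimedWord]) -> Tuple[List[float], List[float]]:
    """Return per-word durations and the pause that follows each word.

    The pause after the last word is set to 0.0 (an assumption of this sketch).
    Overlapping timestamps are clamped so that a pause is never negative.
    """
    durations = [w.end - w.start for w in words]
    pauses = [max(0.0, nxt.start - cur.end) for cur, nxt in zip(words, words[1:])]
    if words:
        pauses.append(0.0)
    return durations, pauses


# Made-up timings for illustration only.
sample = [
    TimedWord("the", 0.0, 0.2),
    TimedWord("boy", 0.5, 0.9),
    TimedWord("is", 2.0, 2.1),
]
durations, pauses = durations_and_pauses(sample)
```

Here the long gap between "boy" and "is" shows up as a pause of about 1.1 seconds, which is the kind of signal the pause embeddings are meant to carry.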

Abstract

Alzheimer's disease (AD) is a progressive neurodegenerative disorder characterized by cognitive decline and memory loss. Early detection of AD is crucial for effective intervention and treatment. In this paper, we propose a novel approach to AD detection from spontaneous speech, which incorporates pause information into language models. Our method encodes pause information into embeddings and integrates them into a typical transformer-based language model, enabling it to capture both semantic and temporal features of speech data. We conduct experiments on the Alzheimer's Dementia Recognition through Spontaneous Speech (ADReSS) dataset and its extension, the ADReSSo dataset, comparing our method with existing approaches. Our method achieves an accuracy of 83.1% on the ADReSSo test set. The results demonstrate the effectiveness of our approach in discriminating between AD patients and healthy individuals, highlighting the potential of pauses as a valuable indicator for AD detection. By leveraging speech analysis as a non-invasive and cost-effective tool for AD detection, our research contributes to early diagnosis and improved management of this debilitating disease.
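One way to picture the integration step is sketched below. It assumes that pause lengths are discretized into buckets and that each bucket's embedding is added element-wise to the word embedding, in the same way BERT sums its token, position, and segment embeddings. The abstract does not specify the binning or the combination rule, so the bucket edges, table sizes, and function names here are all hypothetical.

```python
import bisect
import random
from typing import List

# Hypothetical bucket edges in seconds; the paper's actual binning is not stated here.
PAUSE_EDGES = [0.05, 0.2, 0.5, 0.8, 3.0]


def pause_bucket(pause_s: float) -> int:
    """Map a pause length to a bucket id in the range 0..len(PAUSE_EDGES)."""
    return bisect.bisect_right(PAUSE_EDGES, pause_s)


def make_table(rows: int, dim: int, rng: random.Random) -> List[List[float]]:
    """Randomly initialized embedding table (stand-in for learned weights)."""
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(rows)]


def embed(token_ids: List[int], pauses: List[float],
          word_table: List[List[float]],
          pause_table: List[List[float]]) -> List[List[float]]:
    """Add the pause-bucket embedding to each word embedding (assumed rule)."""
    out = []
    for tok, pause_s in zip(token_ids, pauses):
        word_vec = word_table[tok]
        pause_vec = pause_table[pause_bucket(pause_s)]
        out.append([a + b for a, b in zip(word_vec, pause_vec)])
    return out


rng = random.Random(0)
DIM = 8  # toy dimension; BERT-base uses 768
word_table = make_table(100, DIM, rng)
pause_table = make_table(len(PAUSE_EDGES) + 1, DIM, rng)

# Toy token ids and the pause (in seconds) following each token.
vectors = embed([5, 17, 42], [0.3, 1.1, 0.0], word_table, pause_table)
```

Addition keeps the input dimension unchanged, so the combined vectors could feed a pretrained transformer encoder without resizing its layers; concatenation followed by a projection would be an alternative design.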

Paper Structure (17 sections, 4 figures, 3 tables)

Introduction
Method
Speech recognition with timestamps
Pauses and word durations extraction
Integration of pause information into language models
Model training and fine-tuning
Experiment
Dataset
GigaSpeech
ADReSS
ADReSSo
Experimental Setting
Results
Pause Prediction Task
Alzheimer's Detection Task
...and 2 more sections

Figures (4)

Figure 1: General flowchart of our AD detection system. Our main innovation lies in the part marked with a pentagram, which integrates pause information with the language model.
Figure 2: The detailed process of embedding calculation.
Figure 3: The distribution of pause durations in the ADReSSo dataset, with the range between 0.8 s and 3 s magnified.
Figure 4: Results of the pause prediction task.
