Reverse-Speech-Finder: A Neural Network Backtracking Architecture for Generating Alzheimer's Disease Speech Samples and Improving Diagnosis Performance

Victor OK Li; Yang Han; Jacqueline CK Lam; Lawrence YL Cheung

TL;DR

The paper tackles non-invasive Alzheimer's disease (AD) diagnosis from speech, addressing two obstacles: the scarcity of real AD speech samples and the limited interpretability of existing models. It introduces Reverse-Speech-Finder (RSF), a neural backtracking architecture built on pre-trained language models. RSF first identifies the most probable neurons (MPNs), the neurons most likely to predict AD, then backtracks from them to the input layer to recover the most probable speech-tokens (MPTs) and the corresponding most probable speech-markers (MPMs), uncovering novel markers that are then used to generate speech data. Its three contributions are the MPM/MPN formulation, a speech-token input representation, and the backtracking mechanism itself. RSF is reported to outperform SHAP and Integrated Gradients across BERT and GPT-2 backbones (e.g., GPT-2: accuracy 85.6%, F1 86.9%). The approach offers new linguistic insight into AD-related deficits and a scalable, non-invasive tool for early detection, with potential generalization to other domains through marker-driven data augmentation and reverse-engineering strategies.

Abstract

This study introduces Reverse-Speech-Finder (RSF), a neural network backtracking architecture designed to enhance Alzheimer's Disease (AD) diagnosis through speech analysis. Leveraging pre-trained large language models, RSF identifies and utilizes the most probable AD-specific speech markers, addressing both the scarcity of real AD speech samples and the limited interpretability of existing models. RSF rests on three core innovations. First, it exploits the observation that the speech markers most likely to predict AD, defined as the most probable speech-markers (MPMs), must have the highest probability of activating the neurons most likely to predict AD, defined as the most probable neurons (MPNs). Second, it uses a speech-token representation at the input layer, allowing backtracking from MPNs to identify the most probable speech-tokens (MPTs) of AD. Third, it develops a backtracking method that tracks backwards from the MPNs to the input layer, identifying the MPTs and the corresponding MPMs and uncovering novel speech markers for AD detection. Experimental results show that RSF outperforms traditional methods such as SHAP and Integrated Gradients, achieving a 3.5% improvement in accuracy and a 3.2% improvement in F1-score. By generating speech data that encapsulates novel markers, RSF mitigates the scarcity of real data and improves the robustness and accuracy of AD diagnostic models. These findings underscore RSF's potential as a tool for speech-based AD detection, offering new insights into AD-related linguistic deficits and paving the way for more effective non-invasive early intervention strategies.
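
The backtracking idea described in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a single hidden layer, uses raw weights as a stand-in for the probabilities the paper refers to, and every name and number in it (`top_k`, `backtrack_tokens`, `vocab`, `w_in`, `w_out`) is hypothetical. It only shows the two-step shape of the procedure: pick the hidden neurons that contribute most to the AD output (playing the role of MPNs), then rank input tokens by how strongly they feed those neurons (playing the role of MPTs).

```python
def top_k(scores, k):
    """Return the indices of the k largest scores, highest first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def backtrack_tokens(w_in, neuron_idx, k):
    """Rank input tokens by their summed weight into the selected neurons.

    w_in[i][j] is the (toy) weight from input token i to hidden neuron j.
    Returns the indices of the k highest-scoring tokens and all token scores.
    """
    token_scores = [sum(row[j] for j in neuron_idx) for row in w_in]
    return top_k(token_scores, k), token_scores


# Hypothetical placeholder tokens; these are not markers reported in the paper.
vocab = ["tok_a", "tok_b", "tok_c", "tok_d", "tok_e"]

# Invented weights: 5 input tokens feeding 3 hidden neurons.
w_in = [
    [0.1, 0.0, 0.2],
    [0.9, 0.1, 0.7],
    [0.6, 0.2, 0.8],
    [0.0, 0.9, 0.1],
    [0.8, 0.0, 0.5],
]

# Invented weights from each hidden neuron to the AD output.
w_out = [1.2, -0.5, 0.9]

# Step 1: neurons contributing most to the AD output (stand-in for MPNs).
mpn_idx = top_k(w_out, k=2)

# Step 2: backtrack from those neurons to the input tokens (stand-in for MPTs).
mpt_idx, token_scores = backtrack_tokens(w_in, mpn_idx, k=3)
mpt_tokens = [vocab[i] for i in mpt_idx]

print("selected neurons:", mpn_idx)
print("top tokens:", mpt_tokens)
```

In a real model the selection would run over a pre-trained network such as BERT or GPT-2 and would need a principled scoring rule in place of the raw weights used here; the toy version only makes the direction of the search, from output back to input, concrete.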
