Linguistic Features Extracted by GPT-4 Improve Alzheimer's Disease Detection based on Spontaneous Speech

Jonathan Heitz, Gerold Schneider, Nicolas Langer

TL;DR

This study demonstrates that GPT-4 can extract five semantic features from spontaneous speech transcripts that capture known AD speech symptoms. When integrated with a baseline set of linguistic features in a Random Forest, these GPT-derived features significantly improve AD detection across manual and ASR transcripts, surpassing fine-tuned GPT baselines. The authors validate Word-Finding Difficulties through proxy correlations and human ratings, and show robustness to prompts and seeds while maintaining explainability through feature importance and transcript excerpts. The work suggests a scalable, interpretable approach for large-scale, non-invasive AD screening using automated speech transcription, with limitations tied to dataset size and the exclusive use of text-based features. Overall, the GPT-based feature extraction provides a novel and effective augmentation to traditional linguistic cues in AD speech analysis, enabling practical, explainable screening tools.

Abstract

Alzheimer's Disease (AD) is a significant and growing public health concern. Investigating alterations in speech and language patterns offers a promising path towards cost-effective and non-invasive early detection of AD on a large scale. Large language models (LLMs), such as GPT, have enabled powerful new possibilities for semantic text analysis. In this study, we leverage GPT-4 to extract five semantic features from transcripts of spontaneous patient speech. The features capture known symptoms of AD, but they are difficult to quantify effectively using traditional methods of computational linguistics. We demonstrate the clinical significance of these features and further validate one of them ("Word-Finding Difficulties") against a proxy measure and human raters. When combined with established linguistic features and a Random Forest classifier, the GPT-derived features significantly improve the detection of AD. Our approach proves effective for both manually transcribed and automatically generated transcripts, representing a novel and impactful use of recent advancements in LLMs for AD speech analysis.
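The combination step described in the abstract (GPT-derived features appended to established linguistic features, then classified with a Random Forest) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the sample size, the number of established features, the effect sizes, the hyperparameters, and the cross-validation scheme are all assumptions chosen only to make the example run.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in data (hypothetical): 100 speakers, half AD (1), half control (0).
rng = np.random.default_rng(0)
n_speakers = 100
labels = np.repeat([0, 1], n_speakers // 2)

# Hypothetical "established" linguistic features (10 columns, weak group difference).
established = rng.normal(size=(n_speakers, 10)) + 0.3 * labels[:, None]
# Five GPT-derived semantic features (simulated here with a stronger group difference).
gpt_features = rng.normal(size=(n_speakers, 5)) + 0.8 * labels[:, None]

# Feature-level fusion: concatenate both feature sets column-wise.
combined = np.hstack([established, gpt_features])


def cv_auc(features, targets):
    """Mean ROC AUC of a Random Forest under stratified 5-fold cross-validation."""
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    return cross_val_score(clf, features, targets, cv=cv, scoring="roc_auc").mean()


auc_established = cv_auc(established, labels)
auc_combined = cv_auc(combined, labels)
print(f"established only: AUC = {auc_established:.3f}")
print(f"established + GPT: AUC = {auc_combined:.3f}")
```

On real data, the two scores would be compared with a significance test over repeated runs rather than read off a single split; the synthetic numbers printed here carry no clinical meaning.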

Paper Structure

This paper contains 27 sections, 3 equations, 3 figures, 8 tables.

Figures (3)

  • Figure 1: GPT Prompt 2, used to extract feature values for our GPT features, and the GPT response for an AD patient in our dataset. The verbatim transcript is replaced by a placeholder {transcript}. The system message is not shown here, but provided in Appendix \ref{appendix:gpt-prompts}.
  • Figure 2: Clinical validation results for GPT features. Left: Violin plots depicting the distribution of GPT feature values. Inner lines indicate median values. Right: Mean and standard deviation of the feature values for AD and control groups. We report Cohen's d as a metric of effect size, as well as p-values of the Mann-Whitney U Test.
  • Figure 3: Feature correlation between GPT features (on the x axis) and Established features (on the y axis).
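
The two statistics named in the Figure 2 caption, Cohen's d and the Mann-Whitney U test, can be computed as in the sketch below. It uses only the Python standard library; the feature values are invented for illustration, and the p-value uses a plain normal approximation without tie or continuity correction, which may differ from the exact procedure used in the paper.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation (sample variances, n - 1)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: count of pairs with a > b, ties counted as 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def mann_whitney_p(group_a, group_b):
    """Two-sided p-value via normal approximation (no tie or continuity correction)."""
    n_a, n_b = len(group_a), len(group_b)
    u = mann_whitney_u(group_a, group_b)
    mu = n_a * n_b / 2
    sigma = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical feature values for one GPT feature (not taken from the paper).
ad_values = [4, 5, 3, 5, 4, 5]
control_values = [1, 2, 1, 3, 2, 1]

d = cohens_d(ad_values, control_values)
u = mann_whitney_u(ad_values, control_values)
p = mann_whitney_p(ad_values, control_values)
print(f"Cohen's d = {d:.2f}, U = {u}, p = {p:.4f}")
```

Cohen's d expresses the group difference in units of pooled standard deviation, while the Mann-Whitney U test makes no normality assumption, which suits bounded, ordinal-like feature scores.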