DementiaBank-Emotion: A Multi-Rater Emotion Annotation Corpus for Alzheimer's Disease Speech (Version 1.0)

Cheonkam Jeong, Jessica Liao, Audrey Lu, Yutong Song, Christopher Rashidian, Donna Krogh, Erik Krogh, Mahkameh Rasouli, Jung-Ah Lee, Nikil Dutt, Lisa M Gibbs, David Sultzer, Julie Rousseau, Jocelyn Ludlow, Margaret Galvez, Alexander Nuth, Chet Khay, Sabine Brunswicker, Adeline Nyamathi

TL;DR

DementiaBank-Emotion (v1.0) introduces the first multi-rater emotion annotation corpus for Alzheimer's disease (AD) speech, annotating 1,492 utterances from 108 speakers for Ekman's six basic emotions plus neutral. The study finds that AD speakers express more non-neutral emotions (16.9%) than controls (5.7%). Exploratory analyses point to acoustic flattening in sadness (reduced F0 modulation in AD) and to partially preserved emotion-prosody mappings within AD, where loudness differentiates emotion categories. A hierarchical adjudication protocol and calibration workshops support label reliability, while qualitative analyses identify laughter as a coping mechanism and emotion-aware interpretations tied to the Cookie Theft task. The dataset, guidelines, and calibration materials enable more nuanced emotion recognition research in clinical populations and set the stage for v2.0 enhancements, including longitudinal data and finer-grained phonetic analyses.

Abstract

We present DementiaBank-Emotion, the first multi-rater emotion annotation corpus for Alzheimer's disease (AD) speech. Annotating 1,492 utterances from 108 speakers for Ekman's six basic emotions and neutral, we find that AD patients express significantly more non-neutral emotions (16.9%) than healthy controls (5.7%; p < .001). Exploratory acoustic analysis suggests a possible dissociation: control speakers showed substantial F0 modulation for sadness (Delta = -3.45 semitones from baseline), whereas AD speakers showed minimal change (Delta = +0.11 semitones; interaction p = .023), though this finding is based on limited samples (sadness: n=5 control, n=15 AD) and requires replication. Within AD speech, loudness differentiates emotion categories, indicating partially preserved emotion-prosody mappings. We release the corpus, annotation guidelines, and calibration workshop materials to support research on emotion recognition in clinical populations.
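
The F0 changes above are reported in semitones relative to a baseline. As a minimal sketch of how such a value relates to frequencies in Hz, the standard conversion is 12 times the base-2 logarithm of the frequency ratio. The function names and the 200 Hz baseline below are illustrative assumptions; the abstract does not say how the baseline was computed.

```python
import math


def semitone_delta(f0_hz: float, baseline_hz: float) -> float:
    """Return the interval from baseline_hz to f0_hz in semitones.

    Standard definition: 12 * log2(f0 / baseline). Negative values mean
    the pitch is below the baseline.
    """
    if f0_hz <= 0 or baseline_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f0_hz / baseline_hz)


def hz_from_semitones(baseline_hz: float, delta_st: float) -> float:
    """Inverse conversion: shift baseline_hz by delta_st semitones."""
    if baseline_hz <= 0:
        raise ValueError("baseline must be positive")
    return baseline_hz * 2.0 ** (delta_st / 12.0)


# Hypothetical baseline of 200 Hz (an assumption, not a value from the paper).
baseline = 200.0
control_sad_hz = hz_from_semitones(baseline, -3.45)  # reported control shift
ad_sad_hz = hz_from_semitones(baseline, 0.11)        # reported AD shift
```

With this hypothetical 200 Hz baseline, a shift of -3.45 semitones corresponds to about 164 Hz, while +0.11 semitones corresponds to about 201 Hz, which shows how small the reported AD change is in absolute terms.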

Paper Structure

This paper contains 51 sections, 1 equation, 3 figures, 7 tables, 1 algorithm.

Figures (3)

  • Figure 1: Emotion distribution comparing AD patients ($n$=615) and healthy controls ($n$=731). Rare emotions (anger, disgust, fear) were pooled as "Other" with counts shown. AD patients show significantly higher rates of non-neutral emotions (16.9% vs. 5.7%; $\chi^2$ = 38.45, $p < .001$).
  • Figure 2: Speaker-normalized loudness by emotion category in AD patients. Rare emotions (anger, disgust, fear) were pooled as "Other." Asterisks indicate significant pairwise differences (Tukey HSD): **$p<.01$, ***$p<.001$. Other acoustic features (F0, jitter, shimmer, HNR) showed no significant differences (see Appendix \ref{app:acoustic_emotion}).
  • Figure 3: Speaker-normalized acoustic features by emotion category (AD patients only). Error bars indicate 95% confidence intervals.
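
Figures 2 and 3 report speaker-normalized features. The captions do not state which normalization was used, so the sketch below assumes one common choice, a per-speaker z-score, purely for illustration. The function name, the row layout, and the sample values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def speaker_zscore(rows):
    """Z-score a feature within each speaker.

    rows: list of (speaker_id, emotion, value) tuples.
    Returns the rows with value replaced by (value - speaker mean) /
    speaker standard deviation. This is an assumed normalization scheme,
    not necessarily the one used in the paper.
    """
    values_by_speaker = defaultdict(list)
    for speaker, _emotion, value in rows:
        values_by_speaker[speaker].append(value)

    # Population standard deviation over each speaker's own utterances.
    stats = {
        speaker: (mean(values), pstdev(values))
        for speaker, values in values_by_speaker.items()
    }

    normalized = []
    for speaker, emotion, value in rows:
        mu, sd = stats[speaker]
        # A speaker with constant values has no spread; map those to 0.
        z = 0.0 if sd == 0 else (value - mu) / sd
        normalized.append((speaker, emotion, z))
    return normalized


# Hypothetical loudness values for two speakers on different scales.
example = [
    ("s1", "neutral", 1.0),
    ("s1", "joy", 3.0),
    ("s2", "neutral", 10.0),
    ("s2", "joy", 30.0),
]
normalized_example = speaker_zscore(example)
```

After normalization both hypothetical speakers show the same relative pattern (neutral below their own mean, joy above it), even though their raw values differ by a factor of ten. Removing between-speaker scale differences in this way is what makes emotion categories comparable across speakers.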