``Is There Anything Else?'': Examining Administrator Influence on Linguistic Features from the Cookie Theft Picture Description Cognitive Test
Changye Li, Zhecheng Sheng, Trevor Cohen, Serguei Pakhomov
TL;DR
The paper investigates whether test administrator involvement biases linguistic features derived from the Cookie Theft picture description task in Alzheimer's disease assessments. By analyzing two corpora (Pitt and the Wisconsin Longitudinal Study, WLS) with preprocessing, topic segmentation, POS-based linguistic features, and propensity score matching, it quantifies how administrator engagement interacts with language markers and classification outcomes. The findings show that administrator activity significantly modulates observed linguistic features and that cross-dataset generalizability is limited, raising concerns about relying on single-context biomarkers for dementia detection or prediction. The work emphasizes observer effects in clinical speech analytics and argues for standardized administration protocols and context-aware interpretation of linguistic biomarkers to improve robustness and fairness in early dementia screening.
Abstract
Alzheimer's Disease (AD) dementia is a progressive neurodegenerative disease that negatively impacts patients' cognitive ability. Previous studies have demonstrated that changes in naturalistic language samples can be useful for early screening of AD dementia. However, the nature of language deficits often requires test administrators to use various speech elicitation techniques during spontaneous language assessments to obtain enough propositional utterances from dementia patients. This could introduce an ``observer's effect'' into the downstream analysis, which has not been fully investigated. Our study seeks to quantify the influence of test administrators on linguistic features in dementia assessment using two English corpora of ``Cookie Theft'' picture descriptions that were collected at different locations and show different levels of test administrator involvement. Our results show that the level of test administrator involvement significantly impacts observed linguistic features in patient speech. These results suggest that many of the significant linguistic features in the downstream classification task may be partially attributable to differences in test administration practices rather than solely to participants' cognitive status. The variations in test administrator behavior can lead to systematic biases in linguistic data, potentially confounding research outcomes and clinical assessments. Our study suggests that there is a need for a more standardized test administration protocol in the development of responsible clinical speech analytics frameworks.
