Automated Extraction of Spatio-Semantic Graphs for Identifying Cognitive Impairment
Si-Ioi Ng, Pranav S. Ambadi, Kimberly D. Mueller, Julie Liss, Visar Berisha
TL;DR
This paper automates the extraction of spatio-semantic graphs from Cookie Theft picture descriptions for cognitive impairment assessment. It proposes a pipeline that uses a 23-item CIU dictionary to identify content information units (CIUs) in transcripts, constructs spatio-semantic graphs from the CIUs' coordinates on the Cookie Theft image, and derives 12 graph-based features. On WRAP and Pitt DB data, ANCOVA analyses (covariates: age, education, gender, unique nodes; significance level $\alpha = 0.05$) show that automatic extraction yields more CIUs, longer total path distances, and larger $F$-values than manual extraction, with comparable or greater between-group differences. The results support the clinical interpretability and scalability of automated spatio-semantic feature extraction for cognitive impairment assessment; future work will explore large language models for CIU extraction without manual dictionaries.
Abstract
Existing methods for analyzing linguistic content from picture descriptions for assessment of cognitive-linguistic impairment often overlook the participant's visual narrative path, which typically requires eye tracking to assess. Spatio-semantic graphs are a useful tool for analyzing this narrative path from transcripts alone; however, they are limited by the need for manual tagging of content information units (CIUs). In this paper, we propose an automated approach for estimating spatio-semantic graphs (via automated extraction of CIUs) from descriptions of the Cookie Theft picture commonly used in cognitive-linguistic analyses. The method enables automatic characterization of the visual semantic path during picture description. Experiments demonstrate that the automatic spatio-semantic graphs effectively differentiate between cognitively impaired and unimpaired speakers. Statistical analyses reveal that the features derived by the automated method produce results comparable to those of the manual method, with even greater differences between the clinical groups of interest. These results highlight the potential of the automated approach for extracting spatio-semantic features in developing clinical speech models for cognitive impairment assessment.
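
To make the idea concrete, the following is a minimal Python sketch of dictionary-based CIU extraction followed by a simple path computation over picture coordinates. It is an illustration under stated assumptions, not the authors' implementation: the six dictionary entries, their keywords, and the normalized (x, y) coordinates are hypothetical placeholders rather than the paper's 23-item dictionary, and the three features shown (CIU count, unique nodes, total path distance) are only a small subset of the graph-based features one might derive.

```python
import math
import re

# Hypothetical mini-dictionary: CIU label -> (keywords, (x, y) position on the picture).
# Keywords and coordinates are illustrative assumptions, not the paper's dictionary.
CIU_DICT = {
    "boy": (("boy", "son", "brother"), (0.30, 0.35)),
    "cookie": (("cookie", "cookies"), (0.25, 0.20)),
    "stool": (("stool", "chair"), (0.30, 0.70)),
    "mother": (("mother", "mom", "woman"), (0.70, 0.45)),
    "sink": (("sink",), (0.80, 0.60)),
    "water": (("water",), (0.80, 0.75)),
}

# Reverse lookup: keyword -> CIU label.
KEYWORD_TO_CIU = {
    keyword: label
    for label, (keywords, _) in CIU_DICT.items()
    for keyword in keywords
}


def extract_cius(transcript):
    """Return the CIU labels mentioned in the transcript, in order of mention."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    return [KEYWORD_TO_CIU[token] for token in tokens if token in KEYWORD_TO_CIU]


def graph_features(cius):
    """Compute a few simple features of the path traced through the CIU coordinates."""
    coords = [CIU_DICT[label][1] for label in cius]
    # Sum of Euclidean distances between consecutively mentioned CIUs.
    total_distance = sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))
    return {
        "n_cius": len(cius),
        "unique_nodes": len(set(cius)),
        "total_path_distance": total_distance,
    }


if __name__ == "__main__":
    text = "The boy is on the stool taking a cookie while the mother is at the sink."
    path = extract_cius(text)
    print(path)
    print(graph_features(path))
```

Exact keyword matching is used here for simplicity; a practical system would likely also need to handle inflections, multi-word expressions, and transcription variants.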
