THERADIA WoZ: An Ecological Corpus for Appraisal-based Affect Research in Healthcare
Hippolyte Fournier, Sina Alisamir, Safaa Azzakhnini, Hanna Chainay, Olivier Koenig, Isabella Zsoldos, Eléonore Trân, Gérard Bailly, Frédéric Elisei, Béatrice Bouchot, Brice Varini, Patrick Constant, Joan Fruitet, Franck Tarpin-Bernard, Solange Rossato, François Portet, Fabien Ringeval
TL;DR
This work addresses the need for ecological, multimodal affect data in healthcare to support AI-assisted cognitive training. It introduces THERADIA WoZ, a French-language audiovisual corpus collected from healthy older adults and patients with Mild Cognitive Impairment (MCI) during Wizard-of-Oz–driven Computerised Cognitive Training (CCT) sessions, with full transcripts and partial appraisal-based annotations covering four dimensions and 23 affect labels. The authors provide the data collection protocols, annotation guidelines, a detailed corpus analysis, and baseline automatic recognition results using both hand-crafted and self-supervised features across audio, text, and video modalities, including fusion strategies. The corpus targets a gap in healthcare affective computing and offers a benchmark for industry and academia to develop affect-aware, home-based AI assistants for CCT, with potential implications for personalized therapy and elder care.
Abstract
We present THERADIA WoZ, an ecological corpus designed for audiovisual research on affect in healthcare. Two groups of senior individuals, consisting of 52 healthy participants and 9 individuals with Mild Cognitive Impairment (MCI), performed Computerised Cognitive Training (CCT) exercises while receiving support from a virtual assistant, tele-operated by a human in the role of a Wizard-of-Oz (WoZ). The audiovisual expressions produced by the participants were fully transcribed and partially annotated along dimensions derived from recent models of appraisal theory: novelty, intrinsic pleasantness, goal conduciveness, and coping. Additionally, the annotations included 23 affective labels drawn from the literature on achievement affects. We present the protocols used for data collection, transcription, and annotation, along with a detailed analysis of the annotated dimensions and labels. Baseline methods and results for their automatic prediction are also presented. The corpus aims to serve as a valuable resource for researchers in affective computing, and is made available to both industry and academia.
