Toward a Unified Metadata Schema for Ecological Momentary Assessment with Voice-First Virtual Assistants
Chen Chen, Khalil Mrini, Kemeberly Charles, Ella T. Lifset, Michael Hogarth, Alison A. Moore, Nadir Weibel, Emilia Farcas
TL;DR
Collecting EMA data in real-world settings is hampered by user burden. Voice-first IVAs can reduce that burden, but managing EMA conversations over voice is non-trivial. The authors introduce a unified metadata schema that models EMA questions, contextual rules, scheduling, and multimodal outputs, so that surveys can be rendered at run-time and prototyped rapidly without modifying source code. They implement an end-to-end platform on Alexa with DynamoDB, showcasing cross-device deployment and conditional branching for adaptive question flows. Key contributions include a data model for topics, questions, outputs, conditions, schedules, and answer validation, plus run-time rule rendering via reflection. The work offers a practical pathway to faster, more scalable voice-based EMA prototypes, potentially improving engagement and data quality in ecological studies.
Abstract
Ecological momentary assessment (EMA) is used to evaluate subjects' behaviors and moods in their natural environments, yet collecting real-time, self-reported data with EMA is challenging due to user burden. Integrating voice into EMA data collection platforms through today's intelligent virtual assistants (IVAs) is promising because of their hands-free and eyes-free nature. However, efficiently managing conversations and EMAs is non-trivial and time-consuming due to the ambiguity of voice input. We approach this problem by rethinking the data modeling of EMA questions and what is needed to deploy them on voice-first user interfaces. We propose a unified metadata schema that models EMA questions and the attributes necessary to effectively and efficiently integrate voice as a new EMA modality. Our schema allows user experience researchers to write simple rules that are rendered at run-time, instead of having to edit the source code. We showcase an example EMA survey implemented with our schema, which can run on multiple voice-only and voice-first devices. We believe that our work will accelerate the iterative prototyping and design process of real-world voice-based EMA data collection platforms.
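To make the idea of rules rendered at run-time concrete, the following is a minimal sketch of how EMA questions, answer validation, and a branching condition might be expressed as data rather than code. It is not the schema proposed in this work: the class names (`Question`, `Condition`), the field names, the operator keys in `OPS`, and the three sample questions are all hypothetical and chosen only for illustration.

```python
import operator
from dataclasses import dataclass, field
from typing import Any, Optional

# Hypothetical operator names; the actual rule syntax of the schema may differ.
OPS = {"eq": operator.eq, "ne": operator.ne, "lt": operator.lt, "gt": operator.gt}


@dataclass
class Condition:
    question_id: str  # earlier question whose answer is inspected
    op: str           # key into OPS
    value: Any        # value the earlier answer is compared against


@dataclass
class Question:
    qid: str
    prompt: str                            # text the assistant would speak
    answer_type: str                       # "number" or "choice" (illustrative)
    choices: list = field(default_factory=list)
    min_value: Optional[int] = None
    max_value: Optional[int] = None
    condition: Optional[Condition] = None  # None means "always ask"


def is_valid(q: Question, answer: Any) -> bool:
    """Check an answer against the validation metadata of its question."""
    if q.answer_type == "number":
        if not isinstance(answer, int) or isinstance(answer, bool):
            return False
        if q.min_value is not None and answer < q.min_value:
            return False
        if q.max_value is not None and answer > q.max_value:
            return False
        return True
    if q.answer_type == "choice":
        return answer in q.choices
    return False


def should_ask(q: Question, answers: dict) -> bool:
    """Evaluate the question's condition against the answers collected so far."""
    if q.condition is None:
        return True
    c = q.condition
    if c.question_id not in answers:
        return False  # the prerequisite has not been answered yet
    return OPS[c.op](answers[c.question_id], c.value)


def questions_to_ask(survey: list, answers: dict) -> list:
    """Return ids of unanswered questions whose conditions currently hold."""
    return [q.qid for q in survey if q.qid not in answers and should_ask(q, answers)]


# Illustrative survey; in practice this metadata would live in a data store.
SURVEY = [
    Question("mood", "On a scale of 1 to 5, how is your mood?", "number",
             min_value=1, max_value=5),
    Question("reason", "What is affecting your mood the most?", "choice",
             choices=["sleep", "pain", "stress"],
             condition=Condition("mood", "lt", 3)),
    Question("activity", "Were you physically active today?", "choice",
             choices=["yes", "no"]),
]
```

Because the follow-up question `reason` carries its condition as data, changing the branching threshold only requires editing the survey metadata, not the functions that interpret it. For example, `questions_to_ask(SURVEY, {"mood": 2})` includes the follow-up, while `questions_to_ask(SURVEY, {"mood": 4})` skips it.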
