AURA: A Reinforcement Learning Framework for AI-Driven Adaptive Conversational Surveys
Jinwen Tang, Yi Shang
TL;DR
AURA is a reinforcement learning framework that lets AI-driven conversational surveys adapt to each respondent within a single session. It scores response quality with a four-dimensional LSDE metric (Length, Self-disclosure, Emotion, and Specificity) and uses a two-level offline-online learning process: the system is initialized with priors from campus-climate data and then updates a simple expected-value (EV) policy over 10–15 exchanges using an epsilon-greedy strategy. In controlled evaluations, AURA achieved a +0.076 mean gain in response quality and a statistically significant improvement over non-adaptive baselines (p=0.044, d=0.66), driven by a 63% reduction in specification prompts and a 10x increase in validation behavior. These results suggest that within-session learning can improve data quality in open-ended surveys, with potential applicability to sensitive data collection domains such as education, healthcare, and organizational assessment.
Abstract
Conventional online surveys provide limited personalization, often resulting in low engagement and superficial responses. Although AI survey chatbots improve convenience, most are still reactive: they rely on fixed dialogue trees or static prompt templates and therefore cannot adapt within a session to individual users, which leads to generic follow-ups and weak response quality. We address these limitations with AURA (Adaptive Understanding through Reinforcement Learning for Assessment), a reinforcement learning framework for AI-driven adaptive conversational surveys. AURA quantifies response quality using a four-dimensional LSDE metric (Length, Self-disclosure, Emotion, and Specificity) and selects follow-up question types via an epsilon-greedy policy that updates the expected quality gain of each type within a session. Initialized with priors extracted from 96 prior campus-climate conversations (467 total chatbot-user exchanges), the system balances exploration and exploitation across 10–15 dialogue exchanges, adapting to individual participants in real time. In controlled evaluations, AURA achieved a +0.076 mean gain in response quality and a statistically significant improvement over non-adaptive baselines (p=0.044, d=0.66), driven by a 63% reduction in specification prompts and a 10x increase in validation behavior. These results demonstrate that reinforcement learning can make survey chatbots more adaptive, transforming static questionnaires into interactive, self-improving assessment systems.
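The within-session policy described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The class and function names, the equal-weight LSDE aggregation, the default epsilon, and the incremental-mean update with a prior pseudo-count are all assumptions made for illustration; only the general idea (an epsilon-greedy choice over follow-up question types whose expected quality gain starts from offline priors and is updated after each exchange) comes from the text.

```python
import random


def lsde_score(length, self_disclosure, emotion, specificity):
    """Combine four component scores (each assumed to lie in [0, 1]).

    ASSUMPTION: equal weights. The abstract names the four dimensions
    but does not state how they are combined.
    """
    return (length + self_disclosure + emotion + specificity) / 4.0


class EpsilonGreedySelector:
    """Hypothetical epsilon-greedy selector over follow-up question types."""

    def __init__(self, priors, epsilon=0.2, prior_weight=1.0, rng=None):
        # priors: question type -> prior expected quality gain.
        # ASSUMPTION: each prior counts as `prior_weight` pseudo-observations.
        self.ev = dict(priors)
        self.counts = {action: prior_weight for action in priors}
        self.epsilon = epsilon
        self.rng = rng or random.Random()

    def select(self):
        """Explore with probability epsilon, otherwise pick the best estimate."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(sorted(self.ev))
        return max(self.ev, key=self.ev.get)

    def update(self, action, gain):
        """Move the estimate for `action` toward the observed quality gain."""
        self.counts[action] += 1
        self.ev[action] += (gain - self.ev[action]) / self.counts[action]
```

In a session of 10–15 exchanges, a caller would invoke `select()` to choose the next follow-up type, score the participant's reply with `lsde_score`, and pass the change in score relative to the previous reply to `update()`. Treating the gain as a simple difference between consecutive scores is likewise an assumption of this sketch.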
