AURA: A Reinforcement Learning Framework for AI-Driven Adaptive Conversational Surveys

Jinwen Tang, Yi Shang

TL;DR

AURA is a reinforcement learning framework that lets AI-driven conversational surveys adapt to each participant within a single session. It defines a four-dimensional LSDE quality metric (Length, Self-disclosure, Emotion, Specificity) and a two-level offline-online learning process: the system initializes with priors from campus-climate data, then updates a simple expected-value (EV) policy over 10–15 exchanges using an epsilon-greedy approach. Empirical results show a +0.076 mean gain in response quality and a statistically significant improvement over non-adaptive baselines (p=0.044, d=0.66), driven by a 63% reduction in specification prompts and a 10x increase in validation behavior. The work shows that within-session learning can improve data quality for open-ended surveys and suggests applicability to sensitive data collection domains such as education, healthcare, and organizational assessment.

Abstract

Conventional online surveys provide limited personalization, often resulting in low engagement and superficial responses. Although AI survey chatbots improve convenience, most are still reactive: they rely on fixed dialogue trees or static prompt templates and therefore cannot adapt within a session to fit individual users, which leads to generic follow-ups and weak response quality. We address these limitations with AURA (Adaptive Understanding through Reinforcement Learning for Assessment), a reinforcement learning framework for AI-driven adaptive conversational surveys. AURA quantifies response quality using a four-dimensional LSDE metric (Length, Self-disclosure, Emotion, and Specificity) and selects follow-up question types via an epsilon-greedy policy that updates the expected quality gain within each session. Initialized with priors extracted from 96 prior campus-climate conversations (467 total chatbot-user exchanges), the system balances exploration and exploitation across 10-15 dialogue exchanges, dynamically adapting to individual participants in real time. In controlled evaluations, AURA achieved a +0.076 mean gain in response quality and a statistically significant improvement over non-adaptive baselines (p=0.044, d=0.66), driven by a 63% reduction in specification prompts and a 10x increase in validation behavior. These results demonstrate that reinforcement learning can give survey chatbots improved adaptivity, transforming static questionnaires into interactive, self-improving assessment systems.

Paper Structure

This paper contains 48 sections, 12 equations, 2 figures, 7 tables.

Figures (2)

  • Figure 1: AURA system architecture showing the reinforcement learning cycle within a single conversation exchange. Each user response is scored along four quality dimensions (LSDE), mapped to an engagement state, and used to select the next question type via an $\epsilon$-greedy policy. Observed quality changes update the system's expected-value (EV) estimates, enabling rapid within-session adaptation.
  • Figure 2: Two-level learning framework. Offline learning (top) extracts patterns from prior conversations to initialize the policy. Online learning (bottom) adapts to individual users within each session through real-time quality feedback, then resets to priors for the next user.
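
The loop described in the two figure captions can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name `SessionPolicy`, the state label `"low"`, the constant step size `alpha`, the value of `epsilon`, and the numeric priors are all assumptions introduced here. Only the question-type names "specification" and "validation" come from the abstract.

```python
import random
from copy import deepcopy


class SessionPolicy:
    """Hypothetical epsilon-greedy policy over per-state EV estimates."""

    def __init__(self, priors, epsilon=0.1, alpha=0.3, rng=None):
        # priors: {engagement_state: {question_type: expected quality gain}}
        # In the paper these come from offline learning; here they are made up.
        self.priors = priors
        self.epsilon = epsilon  # exploration rate (assumed value)
        self.alpha = alpha      # step size for the EV update (assumed rule)
        self.rng = rng or random.Random()
        self.ev = deepcopy(priors)

    def reset(self):
        """Discard within-session learning and return to the offline priors."""
        self.ev = deepcopy(self.priors)

    def select(self, state):
        """Explore with probability epsilon, otherwise pick the highest EV."""
        actions = list(self.ev[state])
        if self.rng.random() < self.epsilon:
            return self.rng.choice(actions)
        return max(actions, key=lambda a: self.ev[state][a])

    def update(self, state, action, quality_delta):
        """Move the EV estimate toward the observed change in quality."""
        old = self.ev[state][action]
        self.ev[state][action] = old + self.alpha * (quality_delta - old)


# Illustrative priors for a single engagement state (values are invented).
example_priors = {"low": {"specification": 0.05, "validation": 0.02}}
policy = SessionPolicy(example_priors, epsilon=0.0, alpha=0.5)

first_choice = policy.select("low")           # greedy pick from the priors
policy.update("low", first_choice, -0.15)     # the prompt lowered quality
second_choice = policy.select("low")          # the policy switches type
policy.reset()                                # next participant starts fresh
```

With exploration turned off, a single negative observation is enough to move this sketch from specification to validation prompts, which mirrors the direction of the shift reported in the abstract. The actual update rule and state definitions would need to be taken from the paper's method section.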