"I followed what felt right, not what I was told": Autonomy, Coaching, and Recognizing Bias Through AI-Mediated Dialogue

Atieh Taheri; Hamza El Alaoui; Patrick Carrington; Jeffrey P. Bigham

Abstract

Ableist microaggressions remain pervasive in everyday interactions, yet interventions to help people recognize them are limited. We present an experiment testing how AI-mediated dialogue influences recognition of ableism. A total of 160 participants completed a pre-test, an intervention, and a post-test across four conditions: dialogue with AI nudges toward bias (Bias-Directed), dialogue with AI nudges toward inclusion (Neutral-Directed), unguided dialogue (Self-Directed), and a text-only, non-dialogue control (Reading). Participants rated scenarios on standardness of social experience and emotional impact; those in the dialogue-based conditions also provided qualitative reflections. Quantitative results showed that dialogue-based conditions produced stronger recognition than Reading, though their trajectories diverged: biased nudges improved differentiation of bias from neutrality but increased overall negativity. Conditions with inclusive nudges or no nudges remained more balanced, while Reading participants showed weaker gains and even declines. Qualitative findings revealed that biased nudges were often rejected, while inclusive nudges were adopted as scaffolding. We contribute a validated vignette corpus, an AI-mediated intervention platform, and design implications highlighting the trade-offs conversational systems face when integrating bias-related nudges.

Paper Structure (58 sections, 5 figures, 4 tables)

Introduction
Related Work
Disability Microaggressions: Concepts and Measures
Interventions, Nudges, and Framing for Bias Recognition
AI-Mediated Dialogue, Coaching, and Risks of Influence
Vignette-Based Methods and Experimental Approaches
Concept and Implementation
Design Rationale
Dialogue Architecture
Coach Guidance Logic
Avatar Creation
Reading Intervention (Control Condition)
System Implementation
Methodology
Participants
...and 43 more sections

Figures (5)

Figure 1: System architecture of the study platform. Participants interacted through a browser-based front end supporting avatar creation, a dialogue intervention interface, and a reading module. The Flask backend handled condition assignment, prompt pipelines, and data integration, exchanging information asynchronously with the front end via JSON. User state and conversation history were persisted to and retrieved from the database. LLM services powered the intervention: GPT-4o generated replies of the virtual character who is a person with a disability (PwD) and coaching suggestions, while DALL·E generated avatars from user-provided features.
Figure 2: Dialogue Interface. The system includes (A) Scenario prompt introducing the social setting, with a toggle button to expand or collapse. (B) Pre-scripted dialogues between the virtual character (e.g., Alex) and the user. (C) User responses as part of the conversation. (D) AI-generated continuation from the character. (E) Private coaching suggestion visible only to the user, offering guidance. (F) User response input box with Send button. (G) Navigation controls. (H) Termination button.
Figure 3: Study procedure across two sessions. On Day 1, participants completed a Demographic Questionnaire and a Pre-Test Vignette Survey (20 scenarios: 10 ableist, 10 neutral). On Day 6, they returned for the assigned intervention (three dialogue-based conditions: Bias-Directed, Neutral-Directed, or Self-Directed, presented in either a Party or Work Office setting; or a passive Reading control), followed by a Post-Interaction Reflection (dialogue conditions only) and a Post-Test Vignette Survey (20 new scenarios matched in structure to the pre-test).
Figure 4: Change in ratings of (A) ableist scenarios, (B) neutral scenarios, and (C) all scenarios combined for Q1 ("standard social experience") and Q2 ("emotional impact"). Bars represent mean change from pre- to post-study across the four conditions (Bias-Directed, Neutral-Directed, Self-Directed, and Reading). Error bars indicate the standard error of the mean (SEM).
Figure 5: Change in contrast scores (Neutral $-$ Ableist) for Q1 (Standard Social Experience) and Q2 (Emotional Impact). Higher values indicate greater differentiation between neutral and ableist scenarios. Error bars show SEM.
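
The contrast scores and error bars described in the captions for Figures 4 and 5 can be illustrated with a minimal sketch. It assumes that each participant's contrast is the mean rating of neutral scenarios minus the mean rating of ableist scenarios, that change is post-test contrast minus pre-test contrast, and that SEM is the sample standard deviation divided by the square root of the number of participants. The function names, dictionary keys, and toy ratings are hypothetical and are not taken from the study; the authors' actual analysis pipeline may differ.

```python
from math import sqrt
from statistics import mean, stdev


def contrast(neutral_ratings, ableist_ratings):
    """Contrast score: mean neutral rating minus mean ableist rating."""
    return mean(neutral_ratings) - mean(ableist_ratings)


def sem(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(values) / sqrt(len(values))


def contrast_change(participants):
    """Return (mean, SEM) of per-participant contrast change (post minus pre)."""
    changes = [
        contrast(p["post_neutral"], p["post_ableist"])
        - contrast(p["pre_neutral"], p["pre_ableist"])
        for p in participants
    ]
    return mean(changes), sem(changes)


# Hypothetical ratings for three participants (illustrative only, not study data).
toy = [
    {"pre_neutral": [5, 5], "pre_ableist": [4, 4],
     "post_neutral": [6, 6], "post_ableist": [3, 3]},
    {"pre_neutral": [4, 4], "pre_ableist": [4, 4],
     "post_neutral": [5, 5], "post_ableist": [4, 4]},
    {"pre_neutral": [6, 6], "pre_ableist": [5, 5],
     "post_neutral": [6, 6], "post_ableist": [2, 2]},
]

mean_change, sem_change = contrast_change(toy)
print(mean_change, round(sem_change, 3))
```

Under these assumptions, a positive mean change indicates that participants separated neutral from ableist scenarios more after the intervention than before it, which matches the caption's reading that higher values mean greater differentiation.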
