WeAudit: Scaffolding User Auditors and AI Practitioners in Auditing Generative AI

Wesley Hanwen Deng, Wang Claire, Howard Ziyu Han, Jason I. Hong, Kenneth Holstein, Motahhare Eslami

TL;DR

WeAudit addresses how to scaffold end-user participation in auditing Generative AI by combining a two-loop workflow (Investigate and Deliberate) with a web-based platform that supports pairwise output comparison, a prompt history, a worked-example repository, social augmentation, an audit-report portal, a discussion forum, and a verification mechanism. Guided by formative studies with users and practitioners, the authors derive six design goals and validate them through a three-week study with 45 user auditors and expert interviews, showing that end users can surface previously unseen harms, articulate actionable findings, and influence practitioner workflows. Key contributions include the WeAudit workflow, its system features, and empirical insights on engagement, sensemaking, and the social-political dimensions of audit labor, along with design implications for sustaining participation and addressing power asymmetries. The work has practical impact for researchers, practitioners, and policymakers aiming to broaden and improve responsible, user-centered auditing and red-teaming of GenAI systems.

Abstract

There has been growing interest from both practitioners and researchers in engaging end users in AI auditing, to draw upon users' unique knowledge and lived experiences. However, we know little about how to effectively scaffold end users in auditing in ways that can generate actionable insights for AI practitioners. Through formative studies with both users and AI practitioners, we first identified a set of design goals to support user-engaged AI auditing. We then developed WeAudit, a workflow and system that supports end users in auditing AI both individually and collectively. We evaluated WeAudit through a three-week user study with user auditors and interviews with industry Generative AI practitioners. Our findings offer insights into how WeAudit supports users in noticing and reflecting upon potential AI harms and in articulating their findings in ways that industry practitioners can act upon. Based on our observations and feedback from both users and practitioners, we identify several opportunities to better support user engagement in AI auditing processes. We discuss implications for future research to support effective and responsible user engagement in AI auditing and red-teaming.

Paper Structure (53 sections, 10 figures, 4 tables)

Figures (10)

  • Figure 1: An overview of the methods and contribution of our work.
  • Figure 2: Screenshot of three low-fidelity prototypes we used in our formative study.
  • Figure 3: The WeAudit workflow that contains two intersecting, iterative loops.
  • Figure 4: WeAudit interface for features: (a) Pairwise Comparison, (b) Prompt History Sidebar, (c) Worked Examples Repository, and (d) Social Augmentation.
  • Figure 5: WeAudit interface for features: (e) Audit Report Portal, showing an example audit report authored by a user auditor in our user study, and (f) Audit Discussion Forum. Please refer to the Audit Report Portal section for the concrete questions in the portal.
  • ...and 5 more figures