Towards AI Accountability Infrastructure: Gaps and Opportunities in AI Audit Tooling

Victor Ojewale, Ryan Steed, Briana Vecchione, Abeba Birhane, Inioluwa Deborah Raji

TL;DR

This paper investigates the tooling landscape for AI auditing and highlights a gap between the need for accountability and the predominance of evaluation-focused tools. Through 27 semi-structured interviews with 35 practitioners and a landscape analysis of 435 tools, the authors build a taxonomy across 7 audit stages and find comparatively little tool support for harms discovery, data transparency, and advocacy. Key contributions include a comprehensive tool taxonomy, identification of critical gaps, and design directions toward shared, open infrastructure that enables accountability beyond evaluation. The work is relevant to policy, with implications for regulatory discussions and for funding durable AI accountability infrastructure. Overall, it argues that advancing AI accountability requires coordinated tooling, governance, and community infrastructure, not just isolated evaluation frameworks.

Abstract

Audits are critical mechanisms for identifying the risks and limitations of deployed artificial intelligence (AI) systems. However, the effective execution of AI audits remains incredibly difficult, and practitioners often need to make use of various tools to support their efforts. Drawing on interviews with 35 AI audit practitioners and a landscape analysis of 435 tools, we compare the current ecosystem of AI audit tooling to practitioner needs. While many tools are designed to help set standards and evaluate AI systems, they often fall short in supporting accountability. We outline challenges practitioners faced in their efforts to use AI audit tools and highlight areas for future tool development beyond evaluation -- from harms discovery to advocacy. We conclude that the available resources do not currently support the full scope of AI audit practitioners' needs and recommend that the field move beyond tools for just evaluation and towards more comprehensive infrastructure for AI accountability.

Paper Structure (23 sections, 13 figures, 4 tables)

Figures (13)

  • Figure 1: Stages of the tool-supported audit process surfaced in our survey of AI audit tooling. We taxonomize tools by the stage of the AI audit process in which they are used. Tools may be used in multiple stages.
  • Figure 2: Number of tools in each category within each stage of our taxonomy, grouped by type of organization. Tools may be used in multiple stages. Note that the scales differ: the Standards and Performance Analysis stages contain many more tools than the others. Nonprofit and university/academic developers account for relatively more Harms Discovery and Data Collection tools. For-profit developers contribute relatively more Performance Analysis and Transparency Infrastructure tools.
  • Figure 3: Tool licensing by taxonomy stage (top) and by organization type (bottom).
  • Figure E.1: Number of tools by taxonomy category, sorted by type of organization (our classification). Tools may be used in multiple stages.
  • Figure E.2: Number of tools with code in each taxonomy stage. Tools may be used in multiple stages.
  • ...and 8 more figures