Towards AI Accountability Infrastructure: Gaps and Opportunities in AI Audit Tooling
Victor Ojewale, Ryan Steed, Briana Vecchione, Abeba Birhane, Inioluwa Deborah Raji
TL;DR
This paper investigates the tooling landscape for AI auditing, highlighting a gap between the need for accountability and the predominance of evaluation-focused tools. Through 27 semi-structured interviews with 35 practitioners and a landscape analysis of 435 tools, the authors build a taxonomy across 7 audit stages and find that tool support for harms discovery, data transparency, and advocacy is underdeveloped. Key contributions include a comprehensive tool taxonomy, identification of critical gaps, and design directions toward shared, open infrastructure that enables accountability beyond evaluation. The work has practical policy relevance, influencing regulatory discussions and funding avenues for durable AI accountability infrastructure. Overall, it argues that advancing AI accountability requires coordinated tooling, governance, and community infrastructure, not just isolated evaluation frameworks.
Abstract
Audits are critical mechanisms for identifying the risks and limitations of deployed artificial intelligence (AI) systems. However, the effective execution of AI audits remains incredibly difficult, and practitioners often need to make use of various tools to support their efforts. Drawing on interviews with 35 AI audit practitioners and a landscape analysis of 435 tools, we compare the current ecosystem of AI audit tooling to practitioner needs. While many tools are designed to help set standards and evaluate AI systems, they often fall short in supporting accountability. We outline challenges practitioners faced in their efforts to use AI audit tools and highlight areas for future tool development beyond evaluation -- from harms discovery to advocacy. We conclude that the available resources do not currently support the full scope of AI audit practitioners' needs and recommend that the field move beyond tools for just evaluation and towards more comprehensive infrastructure for AI accountability.
