Effective Automation to Support the Human Infrastructure in AI Red Teaming

Alice Qian Zhang, Jina Suh, Mary L. Gray, Hong Shen

TL;DR

This paper addresses how to automate AI red teaming safely without sacrificing human expertise. It introduces three cross-cutting pillars (proficiency, human agency, and adaptability) to guide tool design. Through comparative analysis with content moderation and industry practice, it argues that automation should augment, not replace, human red teamers and should preserve their well-being. The authors advocate a hybrid, human-in-the-loop red-teaming ecosystem that scales responsibly while maintaining the context-sensitivity and integrity of risk assessment.

Abstract

As artificial intelligence (AI) systems become increasingly embedded in critical societal functions, the need for robust red teaming methodologies continues to grow. In this forum piece, we examine emerging approaches to automating AI red teaming, with a particular focus on how the application of automated methods affects human-driven efforts. We discuss the role of labor in automated red teaming processes, the benefits and limitations of automation, and its broader implications for AI safety and labor practices. Drawing on existing frameworks and case studies, we argue for a balanced approach that combines human expertise with automated tools to strengthen AI risk assessment. Finally, we highlight key challenges in scaling automated red teaming, including considerations around worker proficiency, agency, and context-awareness.

Paper Structure

This paper contains 3 sections.