Measuring AI R&D Automation

Alan Chan; Ranay Padarath; Joe Kwon; Hilary Greaves; Markus Anderljung

Abstract

The automation of AI R&D (AIRDA) could have significant implications, but its extent and ultimate effects remain uncertain. We need empirical data to resolve these uncertainties, but existing data (primarily capability benchmarks) may not reflect real-world automation or capture its broader consequences, such as whether AIRDA accelerates capabilities more than safety progress or whether our ability to oversee AI R&D can keep pace with its acceleration. To address these gaps, this work proposes metrics to track the extent of AIRDA and its effects on AI progress and oversight. The metrics span dimensions such as capital share of AI R&D spending, researcher time allocation, and AI subversion incidents, and could help decision makers understand the potential consequences of AIRDA, implement appropriate safety measures, and maintain awareness of the pace of AI development. We recommend that companies and third parties (e.g. non-profit research organisations) start to track these metrics, and that governments support these efforts.

Paper Structure (29 sections, 1 equation, 3 figures, 5 tables)

Introduction
What is AI R&D?
Potential Implications of AI R&D Automation
AI Progress
Oversight
Decreasing the Oversight Gap
Increasing the Oversight Gap
Metrics for AI R&D Automation
Experimental Metrics
Metric #1: AI performance on AI R&D evaluations
Metric #2: AI performance on AI R&D evaluations compared to humans and human-AI teams ("AI R&D Performance RCTs")
Metric #3: Oversight red-teaming experiments
Metric #4: Misalignment evaluations
Metric #5: Compute efficiency improvements
Survey-Based Metrics
...and 14 more sections

Figures (3)

Figure 1: The extent of AI R&D automation could affect both AI progress and the oversight gap: the difference between how much oversight is needed ("oversight demand") and how much oversight is actually achieved. Oversight capacity is the ability to achieve oversight, encompassing both the ability to understand the R&D process (e.g., having sufficient expertise) and the resources available for exercising control (e.g., human labour, monitoring tools). AI progress could also affect the oversight gap, such as by increasing the stakes of R&D decisions. This work proposes metrics to track all of these quantities.
Figure 2: Positive and negative implications of AI R&D automation for AI progress.
Figure 3: AI R&D automation could decrease or increase the oversight gap by changing oversight capacity, the level of oversight achieved (not depicted for simplicity), or oversight demand. Note that oversight capacity may not directly translate into oversight achieved (e.g., if companies decide to forgo oversight due to financial or time costs).
