Increased Compute Efficiency and the Diffusion of AI Capabilities

Konstantin Pilz; Lennart Heim; Nicholas Brown

TL;DR

This paper formalizes how increasing compute efficiency (the combined effect of hardware price performance and algorithmic efficiency) drives both wider access to AI capabilities and higher performance for those who invest in training compute. It defines compute investment efficiency as the composition $f_c(i)=f_a\left(f_h(i)\right)$, where $f_h$ converts a compute investment $i$ into a compute budget and $f_a$ converts that budget into model performance, so that performance is $p=f_c(i)$. From this it derives two intertwined effects: an access effect that lowers the cost of reaching a given capability, and a performance effect that raises the performance achievable with the same spend. It argues that large compute investors are typically the first to discover novel capabilities, including dangerous ones, and that diffusion will gradually broaden access, though ceilings and diminishing returns can curb or delay this diffusion. The authors discuss governance implications, including oversight of compute infrastructure, sharing of information about risks, defense-oriented use of advanced models, and coordination on precautionary measures to mitigate harms from rapid diffusion. The work emphasizes the need for proactive policy and industry collaboration to manage proliferation, defend against misuse, and balance innovation with safety as compute efficiency continues to advance.

Abstract

Training advanced AI models requires large investments in computational resources, or compute. Yet, as hardware innovation reduces the price of compute and algorithmic advances make its use more efficient, the cost of training an AI model to a given performance falls over time - a concept we describe as increasing compute efficiency. We find that while an access effect increases the number of actors who can train models to a given performance over time, a performance effect simultaneously increases the performance available to each actor. This potentially enables large compute investors to pioneer new capabilities, maintaining a performance advantage even as capabilities diffuse. Since large compute investors tend to develop new capabilities first, it will be particularly important that they share information about their AI models, evaluate them for emerging risks, and, more generally, make responsible development and release decisions. Further, as compute efficiency increases, governments will need to prepare for a world where dangerous AI capabilities are widely available - for instance, by developing defenses against harmful AI models or by actively intervening in the diffusion of particularly dangerous capabilities.
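The two effects described above can be illustrated with a small numerical sketch. This is not the paper's model: the logarithmic form of the algorithmic efficiency function, the parameter names `ops_per_usd` and `algo_multiplier`, and all numbers are hypothetical choices made only to show how a composed function $f_c = f_a \circ f_h$ produces an access effect and a performance effect when efficiency improves between two points in time.

```python
import math


def hardware_price_performance(investment_usd, ops_per_usd):
    """f_h: converts a compute investment (USD) into a compute budget (operations)."""
    return investment_usd * ops_per_usd


def algorithmic_efficiency(compute_ops, algo_multiplier):
    """f_a: converts a compute budget into a performance score.

    Assumption: performance grows with the log of "effective compute".
    The paper's figures are conceptual and do not fix a functional form.
    """
    return math.log10(compute_ops * algo_multiplier)


def compute_investment_efficiency(investment_usd, ops_per_usd, algo_multiplier):
    """f_c(i) = f_a(f_h(i)): maps an investment directly to performance."""
    budget = hardware_price_performance(investment_usd, ops_per_usd)
    return algorithmic_efficiency(budget, algo_multiplier)


def investment_needed(target_performance, ops_per_usd, algo_multiplier):
    """Inverse of f_c under the toy log form: cost of reaching a given performance."""
    return 10 ** target_performance / (ops_per_usd * algo_multiplier)


# Hypothetical efficiency parameters at two points in time.
t0 = {"ops_per_usd": 1e17, "algo_multiplier": 1.0}
t1 = {"ops_per_usd": 2e17, "algo_multiplier": 2.0}

investment = 1e6  # hypothetical fixed budget in USD
p0 = compute_investment_efficiency(investment, **t0)
p1 = compute_investment_efficiency(investment, **t1)

# Performance effect: the same spend buys higher performance at t = 1.
performance_gain = p1 - p0

# Access effect: the performance reached at t = 0 becomes cheaper at t = 1.
cost_at_t0 = investment_needed(p0, **t0)
cost_at_t1 = investment_needed(p0, **t1)

print(f"performance gain at fixed spend: {performance_gain:.3f}")
print(f"cost of reaching p0: {cost_at_t0:,.0f} USD at t=0, {cost_at_t1:,.0f} USD at t=1")
```

Under these toy assumptions, both effects come from the same shift in $f_c$: reading the shifted curve at a fixed investment gives the performance effect, while reading it at a fixed performance gives the access effect.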

Paper Structure (47 sections, 12 equations, 9 figures)

This paper contains 47 sections, 12 equations, 9 figures.

Introduction
Falling Training Cost
Hardware Price Performance
Algorithmic Efficiency
Compute Investment Efficiency
Formal Model
Algorithmic Efficiency
Hardware Price Performance
Compute Investment Efficiency
Effects of Compute Efficiency Increases
The Access Effect
The Performance Effect
Consequences
Consequences of the Effects for Different Actors
Implications for Competitive Advantage
...and 32 more sections

Figures (9)

Figure 1: Compute efficiency relates the compute investment to the performance of an AI model.
Figure 2: Compute efficiency improves between time $t = 0$ and $t = 1$, causing an access effect (red) and a performance effect (blue). Figures are merely conceptual and do not assert specific claims regarding the slopes of the curves.
Figure 3: Compute investment scaling increases the performance lead of large compute investors over time. The dashed arrows represent performance attainable without investment scaling.
Figure 4: Hardware price performance is the conversion function between the training compute investment in dollars and the training compute budget in operations. Algorithmic efficiency is the subsequent conversion function between the training compute budget and the performance of the resulting AI model. Compute (investment) efficiency combines hardware price performance and algorithmic efficiency, relating training compute investment to the performance of the resulting model.
Figure 5: Compute efficiency improves between time $t = 0$ and $t = 1$, causing an access effect (red) and a performance effect (blue). Figures are conceptual and do not make empirical claims about the slope of the curve.
...and 4 more figures
