Risk thresholds for frontier AI

Leonie Koessler; Jonas Schuett; Markus Anderljung

TL;DR

We recommend that companies define risk thresholds to provide a principled foundation for their decision-making, use these risk thresholds to help set capability thresholds, and then rely primarily on capability thresholds to make their decisions.

Abstract

Frontier artificial intelligence (AI) systems could pose increasing risks to public safety and security. But what level of risk is acceptable? One increasingly popular approach is to define capability thresholds, which describe AI capabilities beyond which an AI system is deemed to pose too much risk. A more direct approach is to define risk thresholds that simply state how much risk would be too much. For instance, they might state that the likelihood of cybercriminals using an AI system to cause X amount of economic damage must not increase by more than Y percentage points. The main upside of risk thresholds is that they are more principled than capability thresholds, but the main downside is that they are more difficult to evaluate reliably. For this reason, we currently recommend that companies (1) define risk thresholds to provide a principled foundation for their decision-making, (2) use these risk thresholds to help set capability thresholds, and then (3) primarily rely on capability thresholds to make their decisions. Regulators should also explore the area because, ultimately, they are the most legitimate actors to define risk thresholds. If AI risk estimates become more reliable, risk thresholds should arguably play an increasingly direct role in decision-making.
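The example threshold in the abstract (the likelihood of a given harm "must not increase by more than Y percentage points") can be read as a simple comparison between two probability estimates. The following is a minimal sketch under that reading, not the authors' method: the names `RiskThreshold` and `exceeds_threshold` are hypothetical, and the inputs are assumed to be probability estimates produced by some separate risk assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskThreshold:
    """Hypothetical risk threshold: the maximum allowed increase in the
    probability of a harm scenario, expressed in percentage points."""
    harm_description: str
    max_increase_pp: float


def exceeds_threshold(baseline_prob: float,
                      prob_with_system: float,
                      threshold: RiskThreshold) -> bool:
    """Return True if the estimated increase in risk is above the threshold.

    Both probabilities are assumed to be estimates in [0, 1]: one without
    the AI system (baseline) and one with it deployed.
    """
    for p in (baseline_prob, prob_with_system):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    # Convert the difference in probability to percentage points.
    increase_pp = (prob_with_system - baseline_prob) * 100.0
    return increase_pp > threshold.max_increase_pp


# Illustrative placeholder values only; they are not taken from the paper.
threshold = RiskThreshold(
    harm_description="cybercrime causing X amount of economic damage",
    max_increase_pp=0.5,
)
print(exceeds_threshold(0.010, 0.020, threshold))
print(exceeds_threshold(0.010, 0.012, threshold))
```

The comparison itself is trivial; as the abstract notes, the hard part is estimating the two probabilities reliably, which is why the authors recommend relying primarily on capability thresholds for now.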

Paper Structure (16 sections, 5 figures, 1 table)

Introduction
Risk thresholds and related concepts
  Risk thresholds
  Capability thresholds
  Compute thresholds
How to use AI risk thresholds
  Using risk thresholds to directly feed into decisions
  Using risk thresholds to indirectly feed into decisions
The case for AI risk thresholds
  Arguments for using risk thresholds
  Arguments against using risk thresholds
  Overall suggestions for using AI risk thresholds
How to define AI risk thresholds
  Type of risk
  Level of risk
...and 1 more section

Figures (5)

Figure 2: F/N-diagram (quantitative) and risk matrix (semi-quantitative / qualitative) (ISO, 2019)
Figure 3: Different metrics and the relationships between them
Figure 4: The ALARP framework (Melchers, 2001)
Figure 5: Risk thresholds, for example via risk models, can help set capability thresholds
Figure 6: Representation of a linear risk model consisting of many risk scenarios
