Verification methods for international AI agreements

Akash R. Wasil, Tom Reed, Jack William Miller, Peter Barnett

TL;DR

We examine 10 verification methods that could detect two types of potential violations of international AI governance agreements: unauthorized AI training and unauthorized data centers.

Abstract

What techniques can be used to verify compliance with international agreements about advanced AI development? In this paper, we examine 10 verification methods that could detect two types of potential violations: unauthorized AI training (e.g., training runs above a certain FLOP threshold) and unauthorized data centers. We divide the verification methods into three categories: (a) national technical means (methods requiring minimal or no access from suspected non-compliant nations), (b) access-dependent methods (methods that require approval from the nation suspected of unauthorized activities), and (c) hardware-dependent methods (methods that require rules around advanced hardware). For each verification method, we provide a description, historical precedents, and possible evasion techniques. We conclude by offering recommendations for future work related to the verification and enforcement of international AI governance agreements.

Paper Structure (29 sections, 6 figures)

Figures (6)

  • Figure 1: Verification methods can help detect unauthorized training runs and unauthorized data centers. * For data center inspections to detect unauthorized training runs, hardware requirements around chips with activity logs will likely be needed in some form.
  • Figure 2: Summary of evasion techniques to avoid verification methods under national technical means.
  • Figure 3: Summary of evasion techniques to avoid access-dependent verification methods.
  • Figure 4: Summary of evasion techniques to avoid hardware-dependent verification methods.
  • Figure 6: Estimated research and development needed for the verification methods investigated. Green indicates little additional research needed, orange indicates some additional research, and red indicates significant additional research.
  • ...and 1 more figure