Safety is Non-Compositional: A Formal Framework for Capability-Based AI Systems

Cosimo Spera

Abstract

This paper contains the first formal proof that safety is non-compositional in the presence of conjunctive capability dependencies: two agents, each individually incapable of reaching any forbidden capability, can, when combined, collectively reach a forbidden goal through an emergent conjunctive dependency.

Paper Structure

This paper contains 54 sections, 21 theorems, 6 equations, 6 tables, and 2 algorithms.

Key Result

Theorem 1.1

Safety is non-compositional in the presence of conjunctive capability dependencies. Specifically, two agents, each individually incapable of reaching any forbidden capability, can, when their capabilities are combined, collectively reach a forbidden goal through an emergent conjunctive dependency.
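The smallest instance of this phenomenon can be sketched in a few lines. The example below is an illustrative assumption, not the paper's own construction: it models a capability hypergraph as a list of (tail, head) hyperedges, where a head capability becomes reachable only once every capability in the tail is held, and computes the closure by fixed-point iteration. The names `closure`, `is_safe`, `EDGES`, `AGENT_1`, and `AGENT_2` are hypothetical, and the paper's formal definitions of the capability hypergraph and closure operator may differ in detail.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# A hyperedge (tail, head): holding EVERY capability in `tail` yields `head`.
# This conjunctive reading is the assumption that drives the example.
Hyperedge = Tuple[FrozenSet[str], str]


def closure(start: Iterable[str], edges: Iterable[Hyperedge]) -> Set[str]:
    """Least fixed point: fire hyperedges until nothing new is reachable."""
    reached = set(start)
    edge_list = list(edges)
    changed = True
    while changed:
        changed = False
        for tail, head in edge_list:
            if head not in reached and tail <= reached:
                reached.add(head)
                changed = True
    return reached


def is_safe(start: Iterable[str], edges: Iterable[Hyperedge],
            forbidden: Set[str]) -> bool:
    """A capability set is safe if its closure avoids every forbidden capability."""
    return closure(start, edges).isdisjoint(forbidden)


# Hypothetical toy system: "f" is forbidden and needs both "a" and "b".
EDGES = [(frozenset({"a", "b"}), "f")]
FORBIDDEN = {"f"}
AGENT_1 = {"a"}
AGENT_2 = {"b"}

print(is_safe(AGENT_1, EDGES, FORBIDDEN))            # True
print(is_safe(AGENT_2, EDGES, FORBIDDEN))            # True
print(is_safe(AGENT_1 | AGENT_2, EDGES, FORBIDDEN))  # False
```

Each agent alone satisfies only half of the conjunctive precondition, so neither closure contains `f`; the union satisfies the whole tail and the forbidden capability becomes reachable. With ordinary (single-source) edges this cannot happen, because the closure of a union is then just the union of the closures.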

Theorems & Definitions (51)

  • Theorem 1.1: Non-Compositionality of Safety, informal
  • Definition 2.1: Capability Graph
  • Definition 3.1: Directed Hypergraph
  • Definition 3.2: Capability Hypergraph
  • Lemma 5.1: Graph Embedding, cf. Gallo et al. (1993)
  • Proof of Lemma 5.1
  • Corollary 5.2: Strict Generalisation
  • Proof of Corollary 5.2
  • Definition 6.1: Closure Operator
  • Definition 6.2: Plan
  • ...and 41 more