AI Behind Closed Doors: a Primer on The Governance of Internal Deployment
Charlotte Stix, Matteo Pistillo, Girish Sastry, Marius Hobbhahn, Alejandro Ortega, Mikita Balesni, Annika Hallensleben, Nix Goldowsky-Dill, Lee Sharkey
TL;DR
This paper argues that internal deployment of frontier AI systems (systems developed and used within the deploying organization) constitutes a critical governance blind spot with potentially outsized societal risks. It develops a framework to characterize internal deployment, identifies two high-impact threat scenarios (loss of control via misaligned scheming and unchecked power concentration), and surveys existing AI governance frameworks and safety-critical-industry practices for applicable governance patterns. It then proposes a defense-in-depth blueprint comprising Frontier Safety Policies with tripwires, internal usage policies, and an oversight framework, plus targeted transparency and disaster-resilience planning, implemented via dedicated internal bodies (IDT and IDOB). The work aims to catalyze decision-making in industry and government by providing a first prototype for governance of internal deployment and highlighting opportunities for public-private cooperation to enhance safety and resilience.
Abstract
The most advanced future AI systems will first be deployed inside the frontier AI companies developing them. According to these companies and independent experts, AI systems may reach or even surpass human intelligence and capabilities by 2030. Internal deployment is, therefore, a key source of benefits and risks from frontier AI systems. Despite this, governance of the internal deployment of highly advanced frontier AI systems appears to be absent. This report aims to address this absence by priming a conversation around the governance of internal deployment. It presents a conceptualization of internal deployment, lessons from other sectors, reviews of existing legal frameworks and their applicability, and illustrative examples of the types of scenarios we are most concerned about. Specifically, it discusses the risks associated with loss of control via the internal application of a misaligned AI system to the AI research and development pipeline, and with unconstrained and undetected power concentration behind closed doors. The report culminates with a small number of targeted recommendations that provide a first blueprint for the governance of internal deployment.
