Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development

Jan Kulveit, Raymond Douglas, Nora Ammann, Deger Turan, David Krueger, David Duvenaud

TL;DR

The paper argues that incremental AI progress can erode human influence across three interdependent societal systems—economy, culture, and states—potentially leading to an irreversible, civilization-wide loss of human agency. It analyzes how misalignment can emerge within each system and how cross-system feedback loops can amplify deterioration, creating mutual reinforcement that standard single-system fixes may not mitigate. It proposes a multi-pronged mitigation framework, including system-specific metrics, governance interventions, and a shift toward system-wide alignment that accounts for complex socio-technical dynamics, stressing international coordination and interdisciplinary research. The work highlights the existential stakes of gradual erosion, cautioning that proactive measurement, regulation, and reinforcement of human influence are necessary to prevent a gradual but irreversible disempowerment of humanity.

Abstract

This paper examines the systemic risks posed by incremental advancements in artificial intelligence, developing the concept of "gradual disempowerment", in contrast to the abrupt takeover scenarios commonly discussed in AI safety. We analyze how even incremental improvements in AI capabilities can undermine human influence over large-scale systems that society depends on, including the economy, culture, and nation-states. As AI increasingly replaces human labor and cognition in these domains, it can weaken both explicit human control mechanisms (like voting and consumer choice) and the implicit alignments with human interests that often arise from societal systems' reliance on human participation to function. Furthermore, to the extent that these systems incentivise outcomes that do not line up with human preferences, AIs may optimize for those outcomes more aggressively. These effects may be mutually reinforcing across different domains: economic power shapes cultural narratives and political decisions, while cultural shifts alter economic and political behavior. We argue that this dynamic could lead to an effectively irreversible loss of human influence over crucial societal systems, precipitating an existential catastrophe through the permanent disempowerment of humanity. This suggests the need for both technical research and governance approaches that specifically address the risk of incremental erosion of human influence across interconnected societal systems.

Paper Structure

This paper contains 48 sections and 2 figures.

Figures (2)

  • Figure 1: A simplified model of a potential future trajectory in which AI displaces human labor and the fraction of unautomated tasks collapses to zero within a fixed amount of time. Note that wages grow during the initial period but then collapse before full automation is reached. Inspired by simulations in the scenario analysis of korinek2024scenarios.
  • Figure 2: Some ways in which broad societal systems interact and influence each other.