Hype, Sustainability, and the Price of the Bigger-is-Better Paradigm in AI

Gaël Varoquaux, Alexandra Sasha Luccioni, Meredith Whittaker

TL;DR

The paper challenges the prevailing bigger-is-better paradigm in AI, arguing that scale is not a universal solution and often leads to unsustainable compute demands, environmental impact, and a concentration of power. Through analysis of scaling trends, benchmarking practices, and case studies across domains, it shows that many tasks are well served by smaller or domain-specific models and that benchmark-driven progress can misrepresent real-world utility. The authors highlight rebound effects, data governance concerns, and societal risks associated with scale, then propose a set of norms to value smaller systems, require transparent reporting of size and cost, and encourage diverse, application-driven research. The practical impact is a call for a more democratic, cost-aware, and environmentally sustainable approach to AI research and deployment that broadens participation and prioritizes meaningful, real-world benefits over sheer scale growth.

Abstract

With the growing attention and investment in recent AI approaches such as large language models, the narrative that larger AI systems are more valuable, powerful, and interesting is increasingly seen as common sense. But what is this assumption based on, and how are we measuring value, power, and performance? And what are the collateral consequences of this race to ever-increasing scale? Here, we scrutinize the current scaling trends and trade-offs across multiple axes and refute two common assumptions underlying the 'bigger-is-better' AI paradigm: 1) that performance improvements are driven by increased scale, and 2) that all interesting problems addressed by AI require large-scale models. Rather, we argue that this approach is not only scientifically fragile but also comes with undesirable consequences. First, it is not sustainable: despite efficiency improvements, its compute demands increase faster than model performance, leading to unreasonable economic requirements and a disproportionate environmental footprint. Second, it implies focusing on certain problems at the expense of others, leaving aside important applications, e.g., health, education, or the climate. Finally, it exacerbates a concentration of power, which centralizes decision-making in the hands of a few actors while threatening to disempower others in the context of shaping both AI research and its applications throughout society.

Paper Structure (48 sections, 8 figures)

Figures (8)

  • Figure 1: An explosion in model size. Left: The increase in model size means it is more and more expensive to run them in terms of RAM. Right: The resources we need are increasing faster than available compute. Data from \cite{epochMachineLearningData2022}; specific details in \ref{app:historical_plots}.
  • Figure 2: Performance as a function of scale saturates across various tasks. Plots of performance as a function of scale (time or memory footprint) on benchmark data from a) a medical image segmentation challenge \cite{flare2023challenge}, b) computer-vision object detection \cite{lin2014microsoft}, c) scene parsing \cite{zhou2017scene}, d) tabular learning \cite{grinsztajn2022tree}, e) text embedding \cite{muennighoff2022mteb}, and f) text understanding \cite{open-llm-leaderboard}. Details in \ref{app:benchmark_plots}.
  • Figure 3: The cost of a single inference is growing faster than compute is improving.
  • Figure 4: A single inference uses more energy for models with broad purposes. Data from \cite{luccioni2024power}.
  • Figure 5: A sharp increase in the amount of data used for training. Details in \ref{app:historical_plots}.
  • ...and 3 more figures