Scaling Laws for Economic Productivity: Experimental Evidence in LLM-Assisted Translation
Ali Merali
TL;DR
The study examines how scaling laws for large language models translate into real-world economic productivity. It runs a preregistered randomized controlled trial with 300 translators, 1,800 tasks, and 13 LLMs of varying training compute, measuring task time, quality, and earnings under bonus incentives. A tenfold increase in training compute reduces task time by 12.3%, raises quality by 0.18 standard deviations, and increases earnings per minute by 16.1%, with larger gains for lower-skilled workers. Embedding these scaling results in the Acemoglu (2024) framework and applying Hulten's theorem, the author projects about 6.95% aggregate U.S. productivity growth over the next decade, while noting that the evidence comes from a single domain and that broader generalization and longer-horizon analysis are still needed.
Abstract
This paper derives "scaling laws"--empirical relationships between the training compute of Large Language Models (LLMs) and their performance--for economic outcomes. In a preregistered online experiment, 300 professional translators completed 1,800 tasks using one of 13 LLMs (or a control). A tenfold increase in model compute improved task completion speed by 12.3%, grades by 0.18 standard deviations, and earnings per minute by 16.1%. Gains were four times larger for lower-skilled workers. These findings suggest continued model scaling could boost U.S. productivity by at least 6.9% over the next decade.
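To make the per-tenfold figures concrete, the following minimal sketch extrapolates them to other compute ratios. It assumes the effects are linear in log10 of training compute (proportional effects compound per tenfold step, the quality effect adds), and it reads the 12.3% figure as a reduction in task time, as the TL;DR does. The functional form and all function names are illustrative assumptions, not the paper's estimation code.

```python
import math

# Per-tenfold effects as reported in the abstract.
TIME_REDUCTION_PER_10X = 0.123   # task time falls by 12.3%
QUALITY_SD_PER_10X = 0.18        # grades rise by 0.18 standard deviations
EARNINGS_GAIN_PER_10X = 0.161    # earnings per minute rise by 16.1%


def tenfold_steps(compute_ratio: float) -> float:
    """Number of tenfold steps between two training-compute levels."""
    if compute_ratio <= 0:
        raise ValueError("compute_ratio must be positive")
    return math.log10(compute_ratio)


def time_multiplier(compute_ratio: float) -> float:
    """Task time relative to baseline (assumption: effect compounds per step)."""
    return (1 - TIME_REDUCTION_PER_10X) ** tenfold_steps(compute_ratio)


def quality_gain_sd(compute_ratio: float) -> float:
    """Quality gain in standard deviations (assumption: additive per step)."""
    return QUALITY_SD_PER_10X * tenfold_steps(compute_ratio)


def earnings_multiplier(compute_ratio: float) -> float:
    """Earnings per minute relative to baseline (assumption: compounds per step)."""
    return (1 + EARNINGS_GAIN_PER_10X) ** tenfold_steps(compute_ratio)


if __name__ == "__main__":
    for ratio in (10, 100):
        print(
            f"{ratio}x compute: time x{time_multiplier(ratio):.3f}, "
            f"quality +{quality_gain_sd(ratio):.2f} SD, "
            f"earnings x{earnings_multiplier(ratio):.3f}"
        )
```

Under these assumptions a single tenfold step reproduces the reported figures exactly. Anything beyond the range of the 13 models tested is an extrapolation and should be read with corresponding caution.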
