The Serial Scaling Hypothesis
Yuxi Liu, Konpat Preechakul, Kananart Kuwaranancharoen, Yutong Bai
TL;DR
The paper formalizes the Serial Scaling Hypothesis (SSH), arguing that many real-world tasks require long serial computation that cannot be efficiently parallelized by current architectures. By casting problems in a $\textsf{TC}$-theoretic framework, it delineates parallel ($\textsf{TC}$) versus inherently serial problems and demonstrates that diffusion models with $\textsf{TC}^0$ backbones have limited serial capacity, failing to solve general inherently serial tasks. It catalogs illustrative serial problems (cellular automata, many-body mechanics, sequential decision making, and math QA), showing they demand step-by-step computation that cannot be shortcut. The paper discusses implications for model design, hardware development, and benchmarks, arguing for architectures and training strategies that accommodate serial depth and for recognizing inherently serial tasks as a distinct benchmark category. Overall, it highlights a fundamental limit of parallel scaling and motivates a broader view of computation that includes significant serial computation in ML systems.
Abstract
While machine learning has advanced through massive parallelization, we identify a critical blind spot: some problems are fundamentally sequential. These "inherently serial" problems, from mathematical reasoning to physical simulations to sequential decision-making, require sequentially dependent computational steps that cannot be efficiently parallelized. We formalize this distinction in complexity theory, and demonstrate that current parallel-centric architectures face fundamental limitations on such tasks. Then, we show for the first time that diffusion models, despite their sequential nature, are incapable of solving inherently serial problems. We argue that recognizing the serial nature of computation holds profound implications for machine learning, model design, and hardware development.
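As a rough illustration of what "sequentially dependent computational steps" can look like, the sketch below simulates a one-dimensional cellular automaton, one of the problem families mentioned in the TL;DR. It is not taken from the paper: the choice of Rule 110, the periodic boundary, and the function names `rule110_step` and `evolve` are all assumptions made for this example. Within a single step every cell can be updated independently (parallel width), but step t + 1 needs the full output of step t, so this simulation performs `steps` updates strictly one after another (serial depth).

```python
RULE = 110  # assumption: Rule 110, chosen only as a familiar example


def rule110_step(cells):
    """Apply one update to a tuple of 0/1 cells with periodic boundaries.

    Each new cell depends only on its three-cell neighborhood in the
    previous row, so all cells within one step could be computed in parallel.
    """
    n = len(cells)
    return tuple(
        # Encode (left, center, right) as a 3-bit index into the rule number.
        (RULE >> ((cells[(i - 1) % n] << 2) | (cells[i] << 1) | cells[(i + 1) % n])) & 1
        for i in range(n)
    )


def evolve(cells, steps):
    """Run `steps` updates in sequence; each update consumes the previous row."""
    for _ in range(steps):
        cells = rule110_step(cells)
    return cells


if __name__ == "__main__":
    row = (0, 0, 0, 1, 0)
    for t in range(3):
        print(t, row)
        row = rule110_step(row)
```

Adding more processors here speeds up each call to `rule110_step`, but it does not reduce the number of calls `evolve` makes in this direct simulation, which is the distinction between width and depth that the abstract draws.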
