Multi-step Predictive Coding Leads To Simplicity Bias
Aviv Ratzon, Omri Barak
TL;DR
This work investigates when predictive coding yields interpretable latent representations by analyzing a minimal linear predictive coding problem with horizon $A$ and depth $L$, showing that the leading singular direction of the OLS estimator $oldsymbol{X}^ op oldsymbol{X}$ dominates as $A$ grows with the environment size $S$. The authors combine gradient-descent dynamics with an OLS-based bias analysis to explain why deeper networks trained with multi-step horizons converge to structured, low-rank encodings of the latent state, contrasting with single-step training. They extend these insights to nonlinear and more naturalistic settings, including piecewise-linear environments and MNIST-based tasks, where multi-step prediction yields a low-dimensional manifold ordered by the latent variable, while regularization fails to recover such structure. The findings offer a principled account of when predictive coding yields interpretable world models and inform how horizon, depth, and training dynamics shape learned representations in both artificial and biological systems.
Abstract
Predictive coding is a framework for understanding the formation of low-dimensional internal representations mirroring the environment's latent structure. The conditions under which such representations emerge remain unclear. In this work, we investigate how the prediction horizon and network depth shape the solutions of predictive coding tasks. Using a minimal abstract setting inspired by prior work, we show empirically and theoretically that sufficiently deep networks trained with multi-step prediction horizons consistently recover the underlying latent structure, a phenomenon explained through the Ordinary Least Squares estimator structure and biases in learning dynamics. We then extend these insights to nonlinear networks and complex datasets, including piecewise linear functions, MNIST, multiple latent states and higher dimensional state geometries. Our results provide a principled understanding of when and why predictive coding induces structured representations, bridging the gap between empirical observations and theoretical foundations.
