Text-Trained LLMs Can Zero-Shot Extrapolate PDE Dynamics, Revealing a Three-Stage In-Context Learning Mechanism

Jiajun Bao; Nicolas Boullé; Toni J. B. Liu; Raphaël Sarfati; Christopher J. Earls

TL;DR

The paper shows that LLMs trained only on text can extrapolate discretized PDE dynamics zero-shot from serialized spatiotemporal data, without fine-tuning or natural language prompts. It reports in-context scaling laws: prediction accuracy improves with longer temporal context but degrades at finer spatial discretizations, and errors grow algebraically with the time horizon in multi-step rollouts. An entropy-based analysis of token-level output distributions shows a three-stage progression (syntactic imitation, an exploratory high-entropy phase, and consolidation into confident predictions) as the models internalize PDE structure purely from in-context exposure. The results suggest that pretrained LLMs encode numerical priors and invariants that support coherent spatiotemporal predictions, and they offer both a lens on emergent reasoning biases and a potential tool for probing how large language models handle numerical dynamics.
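
To make "errors grow algebraically" concrete, the following is a minimal sketch of how one might estimate such a growth rate, assuming the rollout error follows a power law `error ≈ C * step**p`. The function name `fit_algebraic_growth` and the synthetic error curve are illustrative assumptions, not the paper's code or data.

```python
import numpy as np


def fit_algebraic_growth(steps, errors):
    """Fit errors ~ C * steps**p by linear least squares in log-log space.

    Hypothetical helper for illustration; returns (C, p).
    """
    steps = np.asarray(steps, dtype=float)
    errors = np.asarray(errors, dtype=float)
    # log(error) = p * log(step) + log(C), so a degree-1 fit recovers p and log(C).
    p, log_c = np.polyfit(np.log(steps), np.log(errors), deg=1)
    return float(np.exp(log_c)), float(p)


# Synthetic error curve, made up for this example (not results from the paper).
steps = np.arange(1, 21)
errors = 0.01 * steps ** 1.5

C, p = fit_algebraic_growth(steps, errors)
print(f"C = {C:.4f}, p = {p:.3f}")
```

A straight line on a log-log plot of error against rollout step is the signature of algebraic growth; exponential growth would instead appear straight on a semi-log plot.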

Abstract

Large language models (LLMs) have demonstrated emergent in-context learning (ICL) capabilities across a range of tasks, including zero-shot time-series forecasting. We show that text-trained foundation models can accurately extrapolate spatiotemporal dynamics from discretized partial differential equation (PDE) solutions without fine-tuning or natural language prompting. Predictive accuracy improves with longer temporal contexts but degrades at finer spatial discretizations. In multi-step rollouts, where the model recursively predicts future spatial states over multiple time steps, errors grow algebraically with the time horizon, reminiscent of global error accumulation in classical finite-difference solvers. We interpret these trends as in-context neural scaling laws, where prediction quality varies predictably with both context length and output length. To better understand how LLMs are able to internally process PDE solutions so as to accurately roll them out, we analyze token-level output distributions and uncover a consistent three-stage ICL progression: beginning with syntactic pattern imitation, transitioning through an exploratory high-entropy phase, and culminating in confident, numerically grounded predictions.
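
The pipeline described in the abstract can be sketched in three pieces: serializing a discretized solution into text, rolling predictions out recursively, and measuring the entropy of a token-level output distribution. This is a minimal sketch under stated assumptions, not the authors' implementation: the serialization scheme (fixed-width integers, commas between spatial points, semicolons between time steps) and the names `serialize`, `rollout`, `token_entropy`, and `predict_next` are hypothetical, and `predict_next` stands in for whatever LLM call produces the next spatial state.

```python
import math


def serialize(states, digits=3):
    """Turn a list of spatial states (one list per time step) into a string.

    Assumed scheme: values are rescaled to integers in [0, 10**digits - 1],
    commas separate spatial points, semicolons terminate time steps.
    """
    lo = min(min(row) for row in states)
    hi = max(max(row) for row in states)
    scale = (10**digits - 1) / (hi - lo) if hi > lo else 0.0
    rows = [
        ",".join(f"{round((v - lo) * scale):0{digits}d}" for v in row)
        for row in states
    ]
    return ";".join(rows) + ";"


def rollout(predict_next, context, n_steps):
    """Recursively predict n_steps future states.

    Each prediction is appended to the history and fed back in, which is
    why errors can accumulate over the horizon.
    """
    history = [list(row) for row in context]
    predictions = []
    for _ in range(n_steps):
        nxt = list(predict_next(history))  # hypothetical model call
        predictions.append(nxt)
        history.append(nxt)
    return predictions


def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Placeholder predictor that repeats the last state; a real experiment would
# query an LLM with serialize(history) and parse its output instead.
persistence = lambda history: history[-1]

text = serialize([[0.0, 1.0], [1.0, 0.0]], digits=2)
future = rollout(persistence, [[0.0, 1.0], [1.0, 0.0]], n_steps=3)
uniform_entropy = token_entropy([0.25, 0.25, 0.25, 0.25])
```

In this framing, low entropy corresponds to a confident next-token distribution and high entropy to an exploratory one, which is the quantity the three-stage description tracks as context grows.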
