Towards a Physics Foundation Model

Florian Wiesner, Matthias Wessling, Stephen Baek

TL;DR

The paper addresses the absence of a true Physics Foundation Model (PFM) by introducing the General Physics Transformer (GPhyT), a transformer-based surrogate trained on 1.8 TB of diverse simulation data that infers governing dynamics from context. GPhyT pairs a neural differentiator, which predicts the time derivative $\partial X/\partial t$, with a numerical integrator (e.g., forward Euler) that advances the system, enabling in-context adaptation across multiple PDE-driven domains. It demonstrates state-of-the-art cross-physics performance, plausible zero-shot generalization to unseen boundary conditions and novel physics, and improved long-horizon stability relative to specialized baselines, marking a significant step toward a universal Physics Foundation Model. Limitations remain in three areas: the 2D scope, long-term accuracy, and the breadth of physics covered. Even so, the approach offers a scalable path to democratizing access to high-fidelity simulations and accelerating computational science. The authors emphasize reproducibility and name extension to 3D and broader domains as future work.
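The differentiator-plus-integrator split described above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: `rollout`, `forward_euler_step`, and `toy_differentiator` are hypothetical names, the time step `dt` is an assumed parameter, and the trained transformer is replaced by a simple finite difference so that the example runs on its own.

```python
import numpy as np

def forward_euler_step(x_t, dxdt, dt):
    """Advance one step: X_{t+1} = X_t + dt * dX/dt."""
    return x_t + dt * dxdt

def rollout(differentiator, context, n_steps, dt=1.0):
    """Autoregressively predict n_steps future states.

    context:        array of shape (T, H, W, C) holding the T most recent snapshots.
    differentiator: callable mapping the context to dX/dt of shape (H, W, C).
                    In GPhyT this role is played by the transformer; here it is
                    any function with that signature (an assumption of this sketch).
    """
    context = np.array(context, dtype=float)
    predictions = []
    for _ in range(n_steps):
        dxdt = differentiator(context)
        x_next = forward_euler_step(context[-1], dxdt, dt)
        predictions.append(x_next)
        # Slide the context window: drop the oldest snapshot, append the prediction.
        context = np.concatenate([context[1:], x_next[None]], axis=0)
    return np.stack(predictions)

def toy_differentiator(context):
    """Hypothetical stand-in for the learned model: backward difference in time."""
    return context[-1] - context[-2]

# Four snapshots of a 2x2 single-field state whose value grows by 1 per step.
context = np.stack([np.full((2, 2, 1), float(t)) for t in range(4)])
preds = rollout(toy_differentiator, context, n_steps=3)
```

Because the integrator is a separate, fixed component, forward Euler could in principle be swapped for another scheme without changing what the network is asked to predict.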

Abstract

Foundation models have revolutionized natural language processing through a "train once, deploy anywhere" paradigm, where a single pre-trained model adapts to countless downstream tasks without retraining. Access to a Physics Foundation Model (PFM) would be transformative, democratizing access to high-fidelity simulations, accelerating scientific discovery, and eliminating the need for specialized solver development. Yet current physics-aware machine learning approaches remain fundamentally limited to single, narrow domains and require retraining for each new system. We present the General Physics Transformer (GPhyT), trained on 1.8 TB of diverse simulation data, that demonstrates foundation model capabilities are achievable for physics. Our key insight is that transformers can learn to infer governing dynamics from context, enabling a single model to simulate fluid-solid interactions, shock waves, thermal convection, and multi-phase dynamics without being told the underlying equations. GPhyT achieves three critical breakthroughs: (1) superior performance across multiple physics domains, outperforming specialized architectures by more than 7x, (2) plausible zero-shot generalization to entirely unseen physical systems through in-context learning, and (3) more stable long-term predictions through long-horizon rollouts. By establishing that a single model can learn generalizable physical principles from data alone, this work opens the path toward a universal PFM that could transform computational science and engineering.

Paper Structure

This paper contains 33 sections, 20 equations, 18 figures, 5 tables.

Figures (18)

  • Figure 1: (a) General architecture of GPhyT. A 4D stack of physical quantities (time, height, width, fields) serves as input $X$. The numerically computed derivatives of each field are concatenated to the input. The differentiator (linear tokenizer, spatiotemporal transformer, linear detokenizer) provides the partial derivative of $X$ with respect to time. Finally, a numerical integrator computes the next time step of each field given $\frac{\partial X}{\partial t}$ and $X$. (b) Architecture of a single transformer layer, consisting of layer norms (LN), spatiotemporal attention, and a multilayer perceptron (MLP).
  • Figure 2: Median normalized mean square error (NMSE) of all models across the test datasets for next step prediction. Losses are grouped for each dataset and the overall loss. The error bars indicate the 25th and 75th percentile errors. GPhyT shows the overall lowest error and lowest in all but one dataset.
  • Figure 3: (a) Overall autoregressive long-horizon prediction (median NMSE) for all models on the known physical systems up to 24 prediction steps. (b) Visualization of prediction step t=1 and t=24 of Euler shockwaves with ground truth (GT), the worst model (MPP) and the two best (Poseidon, GPhyT). The images can be best viewed on a high-definition digital monitor.
  • Figure 4: (a) Overall autoregressive long-horizon prediction (median NMSE) for all models on the novel physical systems up to 24 prediction steps. (b) Separate graphs for autoregressive long-horizon prediction of systems with new boundary conditions (top) and completely new physics (bottom).
  • Figure 5: Ablation studies on the known datasets with median NMSE: (a) Comparing GPhyT against a version predicting the next state directly. (b) Comparing GPhyT against a version without explicit spatial and temporal derivatives as input. (c) Scaling behavior of GPhyT. (d) Effect of the number of input time steps on rollout performance.
  • ...and 13 more figures
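
The Figure 1 caption states that numerically computed derivatives of each field are concatenated to the input stack. The sketch below shows one way this could look, under stated assumptions: the `(time, height, width, fields)` axis order is taken from the caption, but the finite-difference scheme (NumPy's `np.gradient`), the unit grid spacings, the channel ordering, and the function name `augment_with_derivatives` are illustrative choices not confirmed by the source.

```python
import numpy as np

def augment_with_derivatives(x, dt=1.0, dy=1.0, dx=1.0):
    """Concatenate temporal and spatial derivatives to a field stack.

    x: array of shape (T, H, W, C) = (time, height, width, fields).
    Returns an array of shape (T, H, W, 4 * C) ordered as
    [fields, d/dt, d/dy, d/dx] along the last axis (assumed ordering).
    """
    d_t = np.gradient(x, dt, axis=0)  # derivative along time
    d_y = np.gradient(x, dy, axis=1)  # derivative along height
    d_x = np.gradient(x, dx, axis=2)  # derivative along width
    return np.concatenate([x, d_t, d_y, d_x], axis=-1)

# Toy field x(t, w) = 3 t + 2 w, constant along height, single channel.
t = np.arange(4).reshape(4, 1, 1, 1)
w = np.arange(6).reshape(1, 1, 6, 1)
x = 3.0 * t + 2.0 * w + np.zeros((4, 5, 6, 1))
augmented = augment_with_derivatives(x)
```

For this linear toy field the finite differences recover the slopes exactly: 3 along time, 0 along height, and 2 along width.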