
Overtone: Cyclic Patch Modulation for Clean, Efficient, and Flexible Physics Emulators

Payel Mukhopadhyay, Michael McCabe, Ruben Ohana, Miles Cranmer

TL;DR

Overtone's key insight is that cyclically modulating patch sizes during autoregressive rollouts distributes errors across the frequency spectrum, mitigating the systematic accumulation of harmonic artifacts that plagues fixed-patch models.

Abstract

Transformer-based PDE surrogates achieve remarkable performance but face two key challenges: fixed patch sizes cause systematic error accumulation at harmonic frequencies, and computational costs remain inflexible regardless of problem complexity or available resources. We introduce Overtone, a unified solution based on dynamic patch-size control at inference. Overtone's key insight is that cyclically modulating patch sizes during autoregressive rollouts distributes errors across the frequency spectrum, mitigating the systematic accumulation of harmonic artifacts that plagues fixed-patch models. We implement this through two architecture-agnostic modules--CSM (using dynamic stride modulation) and CKM (using dynamic kernel resizing)--that together provide both harmonic mitigation and compute-adaptive deployment. This flexible tokenization lets users trade accuracy for speed dynamically based on computational constraints, and the cyclic rollout strategy yields up to 40% lower long-rollout error in variance-normalised RMSE (VRMSE) compared to conventional static-patch surrogates. Across challenging 2D and 3D PDE benchmarks, a single Overtone model matches or exceeds fixed-patch baselines across inference compute budgets when all models are trained under the same fixed total training budget.
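The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the helper names (`cyclic_patch_schedule`, `token_count`, `vrmse`), the patch-size set {4, 8, 16}, the 256x256 grid, and the `eps` stabiliser are all assumptions made for illustration. The VRMSE shown is one common definition (mean squared error divided by the variance of the target, then square-rooted); the paper may normalise per field or per sample.

```python
import itertools

import numpy as np


def cyclic_patch_schedule(patch_sizes, num_steps):
    """Patch size used at each rollout step, cycling through `patch_sizes`."""
    cycle = itertools.cycle(patch_sizes)
    return [next(cycle) for _ in range(num_steps)]


def token_count(grid_shape, patch_size):
    """Number of non-overlapping patches (tokens) covering the grid."""
    if any(n % patch_size for n in grid_shape):
        raise ValueError("grid dimensions must be divisible by the patch size")
    return int(np.prod([n // patch_size for n in grid_shape]))


def vrmse(pred, target, eps=1e-7):
    """Variance-normalised RMSE (assumed form): sqrt(MSE / (Var(target) + eps))."""
    mse = np.mean((pred - target) ** 2)
    return float(np.sqrt(mse / (np.var(target) + eps)))


# Hypothetical settings: three patch sizes cycled over a 7-step rollout.
schedule = cyclic_patch_schedule([4, 8, 16], num_steps=7)

# Token count is a proxy for inference compute: smaller patches, more tokens.
tokens = {p: token_count((256, 256), p) for p in (4, 8, 16)}

# Predicting the mean of the target everywhere gives a VRMSE of about 1.
target = np.arange(10.0)
baseline = vrmse(np.full_like(target, target.mean()), target)
```

In this sketch, a VRMSE near 1 corresponds to a prediction no better than the target mean, so values well below 1 indicate useful skill. The token counts show why patch size acts as a compute knob: halving the patch size in 2D quadruples the number of tokens.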

Paper Structure

This paper contains 69 sections, 27 equations, 24 figures, 15 tables, 2 algorithms.

Figures (24)

  • Figure 1: Illustration of Overtone's CSM (stride modulation) and CKM (kernel modulation). Cyclic modulation distributes errors across frequencies, preventing error accumulation at harmonics and enabling accuracy-compute trade-offs at inference time.
  • Figure 2: Left: Averaged 1D residual power spectrum at rollout step 20, revealing harmonic error accumulation in fixed-patch models. The fixed patch 16 model shows pronounced spikes at harmonic frequencies $k/16$, while CSM and CKM with cyclic modulation distribute spectral errors across the frequency range, supporting our theoretical motivation (\ref{sec:intuition}). Note that this is the residual spectrum with respect to the ground truth; it is trivially zero for the true simulation, which is therefore not shown. Right: Spatial manifestation of harmonic artifacts at rollout step 20 on Turbulent Radiative Layer 2D. Cyclic patch modulation eliminates the grid-like distortions seen in the fixed-patch model, showing the practical impact of Overtone's rollout strategy.
  • Figure 3: Next-step prediction test VRMSE vs. number of tokens at inference for a 100M-parameter model trained on four 2D datasets. Lower VRMSE is better. Token count is a proxy for required compute. Note that there are three separate fixed-patch models (green), as each needs to be trained from scratch. This plot shows the compute-accuracy trade-off, which is explored further in \ref{subsec:pred_accuracy_vrmse}.
  • Figure 4: Step 20 rollout for the concentration field of the Active Matter dataset. Left: ground truth, fixed p=16, CSM, and CKM. Right: corresponding 1D averaged residual frequency spectrum.
  • Figure 5: Step 40 rollout for the tracer field of the Shear Flow dataset. Left: ground truth, fixed p=16, CSM, and CKM. Right: corresponding residual frequency spectrum.
  • ...and 19 more figures
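
The residual spectra described in the Figure 2, 4, and 5 captions can be approximated with a short sketch. It rests on assumptions: the function name `averaged_1d_residual_spectrum` is hypothetical, and the averaging convention here (FFT along one spatial axis, power averaged over the other) may differ from the one used in the paper. The synthetic example injects an error that repeats every 16 grid cells and checks that the residual power concentrates at the harmonics $k/16$.

```python
import numpy as np


def averaged_1d_residual_spectrum(pred, target):
    """Power spectrum of (pred - target) along the last axis, averaged over axis 0."""
    residual = pred - target
    power = np.abs(np.fft.rfft(residual, axis=-1)) ** 2
    freqs = np.fft.rfftfreq(residual.shape[-1])  # cycles per grid cell
    return freqs, power.mean(axis=0)


# Synthetic 2D field: the "prediction" carries a seam at every 16th column,
# mimicking a patch-boundary artifact of a fixed patch size of 16.
n, patch = 128, 16
target = np.zeros((n, n))
seam = (np.arange(n) % patch == 0).astype(float)
pred = target + np.tile(seam, (n, 1))

freqs, spectrum = averaged_1d_residual_spectrum(pred, target)

# Frequencies that carry non-negligible residual power.
peaks = freqs[spectrum > 1e-8]
```

Here `peaks` contains only multiples of 1/16 (up to the Nyquist frequency 0.5), which is the spike pattern the Figure 2 caption attributes to the fixed patch 16 model.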