BCAT: A Block Causal Transformer for PDE Foundation Models for Fluid Dynamics

Yuxuan Liu, Jingmin Sun, Hayden Schaeffer

TL;DR

BCAT is a PDE foundation model built on a block causal transformer for autoregressive prediction of 2D fluid dynamics. It frames forecasting as next-frame prediction rather than next-token prediction, which better captures spatiotemporal dependencies and yields substantial speedups and accuracy gains (a 3.5x accuracy improvement in the authors' ablation). Trained on six PDE families from PDEBench, PDEArena, and CFDBench, it attains an average relative L2 error of about 1.18% and shows strong zero-shot and transfer performance; after fine-tuning on a turbulence dataset, it is more than 40% more accurate than prior methods. The model also uses patch-based tokenization and is trained with the Muon optimizer, choices aimed at scalable, efficient learning on complex fluid dynamics tasks.

Abstract

We introduce BCAT, a PDE foundation model designed for autoregressive prediction of solutions to two-dimensional fluid dynamics problems. Our approach uses a block causal transformer architecture to model next-frame predictions, leveraging previous frames as contextual priors rather than relying solely on sub-frames or pixel-based inputs commonly used in image generation methods. This block causal framework more effectively captures the spatial dependencies inherent in nonlinear spatiotemporal dynamics and physical phenomena. In an ablation study, next-frame prediction demonstrated a 3.5x accuracy improvement over next-token prediction. BCAT is trained on a diverse range of fluid dynamics datasets, including incompressible and compressible Navier-Stokes equations across various geometries and parameter regimes, as well as the shallow-water equations. The model's performance was evaluated on 6 distinct downstream prediction tasks and tested on about 8K trajectories to measure robustness on a variety of fluid dynamics simulations. BCAT achieved an average relative error of 1.18% across all evaluation tasks, outperforming prior approaches on standard benchmarks. With fine-tuning on a turbulence dataset, we show that the method adapts to new settings with more than 40% better accuracy than prior methods.
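
To make the contrast between next-frame and next-token prediction concrete, the following is a minimal sketch of the two attention masks, not the authors' implementation. It assumes each frame has already been split into a fixed number of patch tokens and that tokens are ordered frame by frame; the function names and the boolean list-of-lists layout are illustrative choices, written in plain Python to stay dependency-free.

```python
def block_causal_mask(num_frames: int, tokens_per_frame: int) -> list[list[bool]]:
    """Return mask[i][j] == True when query token i may attend to key token j.

    Assumption: tokens are laid out frame by frame, so token i belongs to
    frame i // tokens_per_frame. A token may attend to every token in its own
    frame (full spatial interaction) and to every token in earlier frames
    (temporal causality), but never to later frames.
    """
    total = num_frames * tokens_per_frame
    mask = []
    for i in range(total):
        query_frame = i // tokens_per_frame
        row = [(j // tokens_per_frame) <= query_frame for j in range(total)]
        mask.append(row)
    return mask


def token_causal_mask(total_tokens: int) -> list[list[bool]]:
    """Standard next-token causal mask: token i attends only to tokens j <= i."""
    return [[j <= i for j in range(total_tokens)] for i in range(total_tokens)]


def render(mask: list[list[bool]]) -> str:
    """Draw the mask as text: 'x' means attention is allowed, '.' means masked."""
    return "\n".join("".join("x" if ok else "." for ok in row) for row in mask)
```

For 3 frames of 2 tokens each, `render(block_causal_mask(3, 2))` produces a staircase of 2x2 blocks (rows `xx....`, `xx....`, `xxxx..`, `xxxx..`, `xxxxxx`, `xxxxxx`), whereas `token_causal_mask(6)` is strictly lower triangular, so the first token of a frame cannot see the rest of that frame.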

Paper Structure

This paper contains 61 sections, 13 equations, 8 figures, 11 tables.

Figures (8)

  • Figure 1: BCAT model overview. The inputs to the model are the initial frames sampled from the datasets, which are patchified and converted into a sequence of features. Transformer layers then take the input sequence to perform next frame prediction, where a block causal mask allows spatial interactions within a frame and temporal causality across varying-length context windows. The processed features are then transformed back to form the final predictions.
  • Figure 2: Evaluating BCAT, DPOT-L, and MPP-L over more output time steps on the PDEArena NS-cond dataset. Rollout is used to obtain outputs beyond the training time steps.
  • Figure 3: Comparing BCAT trained with Muon vs. AdamW optimizer. The target (first row) is the first 6 output steps from the PDEArena Navier-Stokes (conditioned) dataset (particle density channel). For each optimizer (each row), we display the difference between the target and model output. Relative $L^2$ errors for the full trajectories are listed after the optimizer names.
  • Figure 4: Transformer layers used in the BCAT model.
  • Figure 5: Example outputs from the BCAT model: four output steps for the PDEArena Navier-Stokes dataset. The channel plotted is the particle density in equation \ref{eq:arena_ns}. Each column represents a different timestamp. For this trajectory, the relative $L^2$ error is 1.93%.
  • ...and 3 more figures
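
The captions above refer to two quantities that are easy to state in code: the relative $L^2$ error and rollout beyond the training horizon. The sketch below is illustrative only. It assumes the error is the norm of the difference divided by the norm of the target for a single flattened trajectory (the paper's exact averaging is not given here), and it assumes rollout feeds each prediction back into a sliding context window. The names `step`, `max_context`, and `Frame` are hypothetical stand-ins, not the paper's interface.

```python
import math
from typing import Callable, Sequence

Frame = list[float]  # hypothetical: one frame flattened to a list of values


def relative_l2_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Compute ||pred - target||_2 / ||target||_2 for one flattened sample."""
    diff_norm = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)))
    target_norm = math.sqrt(sum(t ** 2 for t in target))
    if target_norm == 0.0:
        raise ValueError("relative error is undefined for an all-zero target")
    return diff_norm / target_norm


def rollout(
    step: Callable[[list[Frame]], Frame],
    context: list[Frame],
    num_steps: int,
    max_context: int,
) -> list[Frame]:
    """Autoregressively predict num_steps frames.

    `step` stands in for a trained next-frame model: it maps a list of context
    frames to the next frame. Each prediction is appended to the history, and
    only the most recent max_context frames are passed to the next call
    (an assumed sliding-window policy).
    """
    history = list(context)
    outputs: list[Frame] = []
    for _ in range(num_steps):
        next_frame = step(history[-max_context:])
        outputs.append(next_frame)
        history.append(next_frame)
    return outputs
```

As a toy usage example, a stand-in model that doubles the last frame, `rollout(lambda w: [2 * x for x in w[-1]], [[1.0, 1.0]], num_steps=3, max_context=2)`, returns `[[2.0, 2.0], [4.0, 4.0], [8.0, 8.0]]`. Because each step consumes earlier predictions, errors can compound over a long rollout, which is why rollout error is reported separately from error within the training horizon.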