Text2PDE: Latent Diffusion Models for Accessible Physics Simulation

Anthony Zhou; Zijie Li; Michael Schneier; John R Buchanan; Amir Barati Farimani

TL;DR

Text2PDE introduces a latent diffusion framework for physics simulation that compresses irregularly discretized PDE data with a mesh-aware autoencoder and generates full spatio-temporal rollouts conditioned on initial frames or language prompts. By performing diffusion in a latent space and generating the whole rollout at once, the approach mitigates autoregressive error accumulation and supports discretization-invariant decoding onto arbitrary meshes. Conditioning modalities, including text prompts encoded by pretrained transformers, offer interpretable and compact interfaces for engineering design while maintaining physical fidelity. Across cylinder flow, buoyancy-driven flow, and 3D turbulence, the method achieves competitive accuracy and scales to roughly 3 billion parameters, highlighting the practicality of diffusion surrogates as neural PDE solvers and their potential to make such solvers easier to use.

Abstract

Recent advances in deep learning have inspired numerous works on data-driven solutions to partial differential equation (PDE) problems. These neural PDE solvers can often be much faster than their numerical counterparts; however, each presents its unique limitations and generally balances training cost, numerical accuracy, and ease of applicability to different problem setups. To address these limitations, we introduce several methods to apply latent diffusion models to physics simulation. Firstly, we introduce a mesh autoencoder to compress arbitrarily discretized PDE data, allowing for efficient diffusion training across various physics. Furthermore, we investigate full spatio-temporal solution generation to mitigate autoregressive error accumulation. Lastly, we investigate conditioning on initial physical quantities, as well as conditioning solely on a text prompt to introduce text2PDE generation. We show that language can be a compact, interpretable, and accurate modality for generating physics simulations, paving the way for more usable and accessible PDE solvers. Through experiments on both uniform and structured grids, we show that the proposed approach is competitive with current neural PDE solvers in both accuracy and efficiency, with promising scaling behavior up to $\sim$3 billion parameters. By introducing a scalable, accurate, and usable physics simulator, we hope to bring neural PDE solvers closer to practical use.
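As a rough illustration of what diffusion training over a compressed spatio-temporal latent involves, the sketch below noises a latent tensor and computes a noise-prediction loss in the standard DDPM style. It is a minimal sketch under stated assumptions, not the authors' implementation: the linear beta schedule, the number of steps, the latent shape, and the function names (`linear_beta_schedule`, `noise_latent`, `denoising_loss`) are all hypothetical and do not come from the paper.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Assumed linear noise schedule (standard DDPM default, not from the paper)."""
    return np.linspace(beta_start, beta_end, num_steps)

def noise_latent(z0, t, alpha_bar, rng):
    """Sample z_t ~ q(z_t | z_0) = N(sqrt(alpha_bar_t) z_0, (1 - alpha_bar_t) I)."""
    eps = rng.standard_normal(z0.shape)
    zt = np.sqrt(alpha_bar[t]) * z0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return zt, eps

def denoising_loss(eps_pred, eps):
    """Mean squared error between predicted and true noise."""
    return float(np.mean((eps_pred - eps) ** 2))

rng = np.random.default_rng(0)
betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative signal retention per step

# Hypothetical latent for one full rollout: (channels, time, height, width).
z0 = rng.standard_normal((4, 8, 16, 16))
t = 500
zt, eps = noise_latent(z0, t, alpha_bar, rng)

# A trained denoiser would predict eps from (zt, t, conditioning);
# a zero prediction stands in for it here.
loss = denoising_loss(np.zeros_like(eps), eps)
```

Because the latent covers the whole rollout rather than a single frame, one denoising pass produces every timestep jointly, which is the property the abstract credits with avoiding autoregressive error accumulation.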

Paper Structure

This paper contains 57 sections, 5 equations, 8 figures, and 9 tables.

Introduction
Text2PDE Models
Contributions
Background
Problem Setup
Spatio-temporal Diffusion for PDEs
Latent Diffusion
Methods
Autoencoders for PDE Data
Mesh Encoder
Convolutional Backbone
Mesh Decoder
Comparison to GNNs and Neural Fields
Latent Diffusion
Conditioning Mechanisms
...and 42 more sections

Figures (8)

Figure 1: We introduce latent diffusion models for physics simulation, with the remarkable ability of generating an entire PDE rollout from a text prompt. Three generated solutions are displayed with their model inputs.
Figure 2: The proposed architecture. Samples are mapped to a grid through a learned aggregation before being encoded to a latent vector and noised. A denoising process is learned, with conditioning from text or physics-based modalities. Denoised latents are decoded to a grid and mapped to a mesh through a learned interpolation.
Figure 3: Losses at each timestep are evaluated for 10 samples. Average losses at each timestep are bolded, individual sample losses are opaque.
Figure 4: Additional examples of text-conditioned generation of 3D turbulence samples with the velocity magnitude rendered. The true solution is shown on top, followed by the sampled solution on the bottom. While the generated solutions smooth out high-frequency features, they remain stable and are broadly accurate. Rendered with vAPE4D (Koehler et al., 2024).
Figure 5: Additional examples of text-conditioned generation of smoke buoyancy examples. The true solution is shown on top, followed by the sampled solution on the bottom. The model is only given a text description of the initial frame, which is also displayed for each example with the additional observations omitted.
...and 3 more figures
