Transolver: A Fast Transformer Solver for PDEs on General Geometries

Haixu Wu; Huakun Luo; Haowen Wang; Jianmin Wang; Mingsheng Long

TL;DR

Transolver introduces Physics-Attention, which learns intrinsic physical states as slices of discretized domains and applies attention over physics-aware tokens to solve PDEs on general geometries with linear-time complexity. By focusing on physical states rather than individual mesh points, it effectively captures complex multiphysics correlations and scales to large unstructured meshes and industrial designs. Empirically, it achieves state-of-the-art performance across eight benchmarks, demonstrates strong efficiency, and exhibits robust out-of-distribution generalization, indicating potential as a foundation model for PDE solving. The work highlights the value of physics-guided tokenization for geometry-general neural operators and suggests paths toward large-scale pretraining and broader industrial adoption.

Abstract

Transformers have empowered many milestones across various fields and have recently been applied to solve partial differential equations (PDEs). However, since PDEs are typically discretized into large-scale meshes with complex geometries, it is challenging for Transformers to capture intricate physical correlations directly from massive individual points. Going beyond superficial and unwieldy meshes, we present Transolver based on a more foundational idea, which is learning intrinsic physical states hidden behind discretized geometries. Specifically, we propose a new Physics-Attention to adaptively split the discretized domain into a series of learnable slices of flexible shapes, where mesh points under similar physical states will be ascribed to the same slice. By calculating attention to physics-aware tokens encoded from slices, Transolver can effectively capture intricate physical correlations under complex geometries, which also empowers the solver with endogenetic geometry-general modeling capacity and can be efficiently computed in linear complexity. Transolver achieves consistent state-of-the-art with 22% relative gain across six standard benchmarks and also excels in large-scale industrial simulations, including car and airfoil designs. Code is available at https://github.com/thuml/Transolver.
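The slice, attend, and deslice steps described in the abstract can be illustrated with a minimal single-head NumPy sketch. This is not the released implementation: the function name `physics_attention`, the random projection matrices, and the sizes `N`, `C`, `M` are assumptions chosen for illustration, and details such as multiple heads, normalization layers, and learned temperatures are omitted.

```python
import numpy as np

def softmax(a, axis=-1):
    # Numerically stable softmax along the given axis.
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def physics_attention(x, w_slice, w_q, w_k, w_v):
    """Hypothetical single-head sketch: x is (N, C) mesh-point features."""
    # 1) Slice: soft-assign each of the N mesh points to M slices.
    weights = softmax(x @ w_slice, axis=-1)                   # (N, M)
    # 2) Encode: one token per slice, a weighted mean of point features.
    tokens = (weights.T @ x) / weights.sum(axis=0)[:, None]   # (M, C)
    # 3) Attend: standard attention among the M tokens only.
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)   # (M, M)
    new_tokens = attn @ v                                     # (M, C)
    # 4) Deslice: broadcast updated tokens back to the mesh points.
    return weights @ new_tokens, weights                      # (N, C), (N, M)

rng = np.random.default_rng(0)
N, C, M = 1000, 16, 8   # illustrative sizes: mesh points, channels, slices
x = rng.standard_normal((N, C))
w_slice = rng.standard_normal((C, M)) / np.sqrt(C)
w_q, w_k, w_v = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))

out, weights = physics_attention(x, w_slice, w_q, w_k, w_v)
print(out.shape, weights.shape)
```

In this sketch the only quadratic term is the M-by-M attention among tokens; every operation involving the N mesh points is a matrix product with cost proportional to N, which is consistent with the linear complexity claimed above when M is fixed.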

Paper Structure (63 sections, 3 theorems, 16 equations, 21 figures, 16 tables)

Introduction
Related Work
Neural PDE Solvers
Physics-informed neural networks
Neural operators
Geometric Deep Learning
Method
Problem setup
Learning Physics-Aware Tokens
Transolver
Physics-Attention
Overall design
Experiments
Benchmarks
...and 48 more sections

Key Result

Theorem 3.4

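Given an input function $\boldsymbol{u}:\Omega\to\mathbb{R}^{C}$ and a mesh point $\mathbf{g}^\ast\in\Omega$, Physics-Attention approximates the integral operator $\mathcal{G}$, defined as $$\mathcal{G}(\boldsymbol{u})(\mathbf{g}^\ast)=\int_{\Omega}\kappa(\mathbf{g}^\ast,\boldsymbol{\xi})\,\boldsymbol{u}(\boldsymbol{\xi})\,\mathrm{d}\boldsymbol{\xi},$$ where $\kappa(\cdot,\cdot)$ denotes the kernel function defined on $\Omega\times \Omega$.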
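As background for the integral-operator view, the following sketch checks numerically that ordinary softmax attention at one query point equals a quadrature of a normalized exponential kernel over the mesh points. It illustrates the standard argument for vanilla attention only, not the paper's proof for Physics-Attention. The uniform quadrature weight `dxi = 1 / N` (assuming a domain of unit measure), the random projections, and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, C = 512, 8                        # illustrative: mesh points, channels
u = rng.standard_normal((N, C))      # samples u(xi_i) at N mesh points
w_q, w_k, w_v = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))

q_star = u[0] @ w_q                  # query at the mesh point g* = xi_0
k = u @ w_k                          # keys at every mesh point
v = u @ w_v                          # values at every mesh point
scores = k @ q_star / np.sqrt(C)     # (N,) similarity of g* to each xi_i

# (a) Softmax attention output at g*.
p = np.exp(scores - scores.max())
p /= p.sum()
attn_out = p @ v

# (b) Quadrature of a normalized kernel:
#     kappa(g*, xi) = exp(score) / integral of exp(score) over the domain,
#     with the integral approximated by a sum with uniform weight dxi.
dxi = 1.0 / N
unnormalized = np.exp(scores)
kappa = unnormalized / (unnormalized * dxi).sum()
quad_out = (kappa * dxi) @ v

print(np.allclose(attn_out, quad_out))
```

The quadrature weight cancels between the kernel normalization and the sum, so the two results agree to floating-point precision.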

Figures (21)

Figure 1: Visualization of learned slices in Transolver. For each case, the leftmost subfigure is the model input and the subfigures to the right show learned slices. A brighter color indicates that the mesh point is more strongly ascribed to the corresponding slice. See the appendix for more visualizations.
Figure 2: Learning physics-aware tokens from Transolver slices.
Figure 3: Overall design of the Transolver layer, which replaces standard attention with Physics-Attention. Each head encodes the input domain into a series of physics-aware tokens and then captures physical correlations under intricate geometries by attention among tokens.
Figure 4: Car and airfoil design tasks. The key problem is to estimate the drag and lift force of a driving car or a flying airplane.
Figure 5: Physics-Attention visualization on Elasticity: (a) slice weights in the last layer of Transolver for both original and resampled meshes, (b) attention maps of the last layer in Transolver and Galerkin Transformer (Cao, 2021). See the appendix for more visualizations.
...and 16 more figures

Theorems & Definitions (11)

Remark 3.1: Why slices can learn physically internal-consistent information
Remark 3.2: Learning slice is different from splitting computation area
Remark 3.3: Attention as learnable integral operator
Theorem 3.4: Physics-Attention is equivalent to learnable integral on $\Omega$
proof
Lemma 1.1
proof
Remark 1.2: Solving PDEs by learning integral neural operators
Lemma 1.3
proof
...and 1 more
