SUPRA: Subspace Parameterized Attention for Neural Operator on General Domains
Zherui Yang, Zhengyang Xue, Ligang Liu
TL;DR
SUPRA introduces Subspace Parameterized Attention, which extends attention mechanisms to function spaces for neural operators on general domains. By recasting attention as a bilinear form $a(\,\cdot\, ,\,\cdot\,)$ and a linear operator $b(\cdot)$ in $L^2(\Omega)$ and projecting both onto a finite-dimensional subspace spanned by basis functions, SUPRA balances expressive power against computational cost. On irregular geometries, a basis of Laplacian eigenfunctions guarantees continuity and near-optimal approximation of smooth functions, enabling accurate PDE surrogates at a reduced complexity of $O(C^2 N + C M)$. Across five standard PDE benchmarks, SUPRA reduces the relative $L^2$ error by up to 33% while maintaining state-of-the-art efficiency, highlighting its practical potential for PDE solving on complex domains.
Abstract
Neural operators are efficient surrogate models for solving partial differential equations (PDEs), but their key components face challenges: (1) attention mechanisms, used to improve accuracy, are computationally inefficient on large-scale meshes, and (2) spectral convolutions rely on the Fast Fourier Transform (FFT) on regular grids and assume a flat geometry, which degrades accuracy on irregular domains. To tackle these problems, we regard the matrix-vector operations of the standard attention mechanism on vectors in Euclidean space as bilinear forms and linear operators in vector spaces, and generalize the attention mechanism to function spaces. This new attention mechanism is fully equivalent to standard attention but cannot be computed directly because function spaces are infinite-dimensional. To address this, inspired by model reduction techniques, we propose the Subspace Parameterized Attention (SUPRA) neural operator, which approximates the attention mechanism within a finite-dimensional subspace. To construct a subspace on irregular domains for SUPRA, we propose using Laplacian eigenfunctions, which naturally adapt to the geometry of the domain and guarantee the optimal approximation for smooth functions. Experiments show that the SUPRA neural operator reduces error rates by up to 33% on various PDE datasets while maintaining state-of-the-art computational efficiency.
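To make the idea of attention restricted to a finite-dimensional subspace concrete, the following is a minimal NumPy sketch, not the authors' implementation. It rests on several assumptions that do not come from the paper: the domain is a 1D grid, the Laplacian eigenfunctions are stood in for by eigenvectors of a path-graph Laplacian, the matrices `A` and `B` (the subspace representations of the bilinear form $a$ and the linear operator $b$) are random rather than learned, the scores are scaled by $\sqrt{M}$, and the softmax runs over the $C$ channel functions. The function names `laplacian_basis` and `subspace_attention` are hypothetical.

```python
import numpy as np


def laplacian_basis(n_points, n_modes):
    """Return the n_modes lowest eigenvectors of a 1D path-graph Laplacian.

    Assumption: this discrete Laplacian is only a stand-in for the
    Laplacian eigenfunctions of a general domain.
    """
    lap = 2.0 * np.eye(n_points) - np.eye(n_points, k=1) - np.eye(n_points, k=-1)
    lap[0, 0] = lap[-1, -1] = 1.0          # free (Neumann-like) end points
    _, vecs = np.linalg.eigh(lap)          # eigenvalues in ascending order
    return vecs[:, :n_modes]               # (N, M), orthonormal columns


def subspace_attention(u, phi, A, B):
    """Attention between C sampled functions, evaluated in span(phi).

    u:   (C, N) function values on N points
    phi: (N, M) orthonormal basis
    A:   (M, M) matrix of the bilinear form a(., .) in the basis (assumed)
    B:   (M, M) matrix of the linear operator b(.) in the basis (assumed)
    """
    coef = u @ phi                                        # project: (C, M)
    scores = coef @ A @ coef.T / np.sqrt(phi.shape[1])    # a(u_i, u_j): (C, C)
    scores = scores - scores.max(axis=1, keepdims=True)   # stable softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    out_coef = weights @ (coef @ B.T)                     # mix b(u_j): (C, M)
    return out_coef @ phi.T, weights                      # back to the grid: (C, N)


# Usage with random data (illustrative sizes only).
rng = np.random.default_rng(0)
N, M, C = 64, 8, 4
phi = laplacian_basis(N, M)
u = rng.standard_normal((C, N))
A = rng.standard_normal((M, M))
B = rng.standard_normal((M, M))
out, weights = subspace_attention(u, phi, A, B)
```

In this sketch the attention scores and the value mixing are computed entirely on the $M$ basis coefficients, so the grid size $N$ only enters through the projection `u @ phi` and the reconstruction `out_coef @ phi.T`. The cost of this toy version is not claimed to match the complexity stated in the TL;DR.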
