
Simple yet Effective: Low-Rank Spatial Attention for Neural Operators

Zherui Yang, Haiyang Xin, Tao Du, Ligang Liu

Abstract

Neural operators have emerged as data-driven surrogates for solving partial differential equations (PDEs), and their success hinges on efficiently modeling the long-range, global coupling among spatial points induced by the underlying physics. In many PDE regimes, the induced global interaction kernels are empirically compressible, exhibiting rapid spectral decay that admits low-rank approximations. We leverage this observation to unify representative global mixing modules in neural operators under a shared low-rank template: compressing high-dimensional pointwise features into a compact latent space, processing global interactions within it, and reconstructing the global context back to spatial points. Guided by this view, we introduce Low-Rank Spatial Attention (LRSA) as a clean and direct instantiation of this template. Crucially, unlike prior approaches that often rely on non-standard aggregation or normalization modules, LRSA is built purely from standard Transformer primitives, i.e., attention, normalization, and feed-forward networks, yielding a concise block that is straightforward to implement and directly compatible with hardware-optimized kernels. In our experiments, such a simple construction is sufficient to achieve high accuracy, yielding an average error reduction of over 17% relative to second-best methods, while remaining stable and efficient in mixed-precision training.
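
The abstract's compress, process, and reconstruct template maps directly onto standard attention modules. As a purely illustrative PyTorch sketch (not the authors' implementation; the class name, latent size, and hyperparameters are our assumptions), one way to assemble such a block from off-the-shelf primitives is: a set of learnable latent tokens cross-attends to the spatial features (compress), a self-attention step mixes the latents (process), and the spatial points cross-attend back to the latents (reconstruct), followed by a feed-forward network.

    # Minimal, hypothetical sketch of a compress/process/reconstruct block built
    # only from standard Transformer primitives; not the authors' LRSA code.
    import torch
    import torch.nn as nn

    class LowRankSpatialAttentionSketch(nn.Module):
        def __init__(self, dim=128, num_latents=64, num_heads=4):
            super().__init__()
            # num_latents plays the role of the latent size M ablated in Figure 5.
            self.latents = nn.Parameter(0.02 * torch.randn(num_latents, dim))
            self.compress = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.process = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.reconstruct = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.norm_x = nn.LayerNorm(dim)
            self.norm_z = nn.LayerNorm(dim)
            self.norm_ffn = nn.LayerNorm(dim)
            self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

        def forward(self, x):                     # x: (batch, N points, dim)
            xn = self.norm_x(x)
            z = self.latents.expand(x.size(0), -1, -1)
            # Compress: latent queries attend over all N points, cost O(N * M).
            z, _ = self.compress(z, xn, xn)
            # Process: global mixing inside the small latent space, cost O(M^2).
            zn = self.norm_z(z)
            z = z + self.process(zn, zn, zn)[0]
            # Reconstruct: each point queries the latent summary, cost O(N * M).
            ctx, _ = self.reconstruct(xn, z, z)
            x = x + ctx
            return x + self.ffn(self.norm_ffn(x))

Because global interactions are routed through the small latent bottleneck, the quadratic cost of full spatial attention over all N points is avoided, which is consistent with the low-rank view sketched in Figure 1.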


Paper Structure

This paper contains 63 sections, 37 equations, 5 figures, 12 tables.

Figures (5)

  • Figure 1: Compressibility of PDE interactions and the unified low-rank paradigm. Left: (a) the original dense kernel, i.e., the Green's function of the 1D Poisson problem; (b--d) its underlying low-rank structure: the reconstructed kernel, fast spectral decay, and approximation error, derived from the numerical factorization illustrated in the middle. Middle: numerical low-rank approximation of global interactions via $K_r \approx U_r \Sigma_r V_r^\top$ (a minimal numerical sketch of this factorization follows the figure list). Right: diverse neural-operator global-mixing modules unified as a learnable compress--process--reconstruct template.
  • Figure 2: Overview of the neural operator backbone and the Low-Rank Spatial Attention (LRSA) block. LRSA routes global information through a compact latent bottleneck using only standard Transformer primitives.
  • Figure 3: Qualitative performance comparison across diverse discretizations. From top-left to bottom-right: Navier-Stokes (regular grid), Elasticity (point cloud), Airfoil and Plasticity (structured grid). Error maps are visualized on the same scale for each task. LRSA yields lower relative errors and preserves sharper physical patterns in high-frequency regions compared to Transolver.
  • Figure 4: Training stability and efficiency. Left: relative $L_2$ error under FP32/BF16/FP16; $\boldsymbol{\times}$ denotes divergence. Right: per-step training latency (forward+backward) and peak training memory on three representative tasks, both reported as ratios relative to Transolver-FP32. Memory Saving is the ratio of the evaluated model's peak training memory to that of the Transolver-FP32 baseline; a lower factor indicates better memory efficiency.
  • Figure 5: Rank and component ablations. Top: sensitivity to latent size $M$. Bottom: component variants of LRSA (Full, w/o latent self-attention, and enforcing symmetric compression and reconstruction).
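
The factorization $K_r \approx U_r \Sigma_r V_r^\top$ referenced in Figure 1 can be reproduced numerically in a few lines of NumPy. The sketch below is our own illustration, not the paper's code: it forms the dense Green's function kernel of the 1D Poisson problem $-u'' = f$ with zero Dirichlet boundary conditions, truncates its SVD at several ranks $r$, and reports the relative reconstruction error; the grid size and chosen ranks are arbitrary.

    # Hypothetical illustration of the low-rank compressibility shown in Figure 1;
    # not taken from the paper's code.
    import numpy as np

    n = 256
    x = np.linspace(0.0, 1.0, n)
    X, Y = np.meshgrid(x, x, indexing="ij")
    # Green's function of -u'' = f on [0, 1] with u(0) = u(1) = 0:
    # G(x, y) = x (1 - y) for x <= y, and y (1 - x) otherwise.
    K = np.where(X <= Y, X * (1.0 - Y), Y * (1.0 - X))

    U, S, Vt = np.linalg.svd(K)
    for r in (4, 8, 16):
        K_r = U[:, :r] @ np.diag(S[:r]) @ Vt[:r, :]   # rank-r reconstruction
        rel_err = np.linalg.norm(K - K_r) / np.linalg.norm(K)
        print(f"rank {r:2d}: relative Frobenius error {rel_err:.2e}")

    # Rapid singular-value decay is what makes this kernel compressible.
    print("sigma_k / sigma_1 for k = 1..8:", np.round(S[:8] / S[0], 4))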