SPUS: A Lightweight and Parameter-Efficient Foundation Model for PDEs
Abu Bucker Siddik, Diane Oyen, Alexander Most, Michal Kucer, Ayan Biswas
TL;DR
SPUS addresses the high cost of PDE foundation models (FMs) with a compact residual U-Net solver that is pretrained autoregressively to emulate numerical time stepping, and it generalizes across diverse PDEs with far fewer parameters than transformer-based FMs. Pretrained on a diverse set of compressible Euler (CE) data and fine-tuned on six unseen PDEs, SPUS transfers to complex dynamics such as the incompressible Navier–Stokes (NS) and wave equations while maintaining competitive trajectory accuracy. Key contributions are a 36M-parameter model that surpasses larger baselines on multiple downstream tasks, effective cross-domain transfer from CE to NS dynamics, and lightweight adapters for flexible downstream adaptation. The results suggest a practical path toward parameter-efficient, generalizable PDE modeling suited to rapid prototyping and multi-physics simulation.
Abstract
We introduce the Small PDE U-Net Solver (SPUS), a compact and efficient foundation model (FM) designed as a unified neural operator for solving a wide range of partial differential equations (PDEs). Existing state-of-the-art PDE FMs are primarily based on large, complex transformer architectures with high computational and parameter overhead. In contrast, SPUS uses a lightweight residual U-Net-based architecture, a design that has been largely underexplored as a foundation model architecture in this domain. To enable effective learning in this minimalist framework, we use a simple yet powerful autoregressive pretraining strategy that closely replicates the behavior of numerical solvers to learn the underlying physics. SPUS is pretrained on a diverse set of fluid dynamics PDEs and evaluated on six challenging unseen downstream PDEs spanning various physical systems. Experimental results demonstrate that SPUS achieves state-of-the-art generalization on these downstream tasks while requiring significantly fewer parameters and minimal fine-tuning data, highlighting its potential as a highly parameter-efficient FM for solving diverse PDE systems.
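The autoregressive time stepping described in the abstract can be illustrated with a minimal sketch. This is not the SPUS implementation: `toy_step` is a hypothetical stand-in for a learned one-step emulator (here, one explicit finite-difference step of a 1D periodic heat equation), and `rollout` and `rollout_mse` are illustrative names. The sketch only shows the loop in which each predicted state is fed back as the next input, as a numerical solver would do.

```python
import numpy as np

def toy_step(u, alpha=0.1):
    """Hypothetical stand-in for a learned one-step emulator:
    one explicit finite-difference step of the 1D periodic heat equation."""
    return u + alpha * (np.roll(u, 1) - 2.0 * u + np.roll(u, -1))

def rollout(step_fn, u0, n_steps):
    """Autoregressive rollout: each predicted state becomes the next input."""
    states = [u0]
    for _ in range(n_steps):
        states.append(step_fn(states[-1]))
    return np.stack(states)  # shape: (n_steps + 1, *u0.shape)

def rollout_mse(step_fn, reference):
    """Mean squared error between a rollout started at reference[0]
    and the full reference trajectory."""
    pred = rollout(step_fn, reference[0], len(reference) - 1)
    return float(np.mean((pred - reference) ** 2))

# Build a reference trajectory with the toy stepper itself (assumed setup).
x = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
u0 = np.sin(x)
reference = rollout(toy_step, u0, 10)
```

In an actual training setup, `toy_step` would be replaced by the network's forward pass, and a trajectory-level error such as `rollout_mse` would compare the model's rollout against solver-generated data.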
