TANTE: Time-Adaptive Operator Learning via Neural Taylor Expansion
Zhikai Wu, Sifan Wang, Shiyang Zhang, Sizhuang He, Min Zhu, Anran Jiao, Lu Lu, David van Dijk
TL;DR
This work addresses the challenge of time-dependent PDE operator learning with fixed time steps, which can cause error accumulation and inefficiency during rollout. It introduces TANTE, a Time-Adaptive Transformer that uses Neural Taylor Expansion to predict multiple temporal derivatives and a local convergence radius, enabling continuous-time predictions with adaptive step sizes via a Taylor-series rollout $\tilde{\mathbf{u}}(t)=\mathbf{u}(0)+\sum_{k=1}^{n}\tilde{\mathbf{u}}^{(k)}(0)\, t^{k}/k!$ within $[0,\tilde{r}_t]$. The architecture comprises a spatiotemporal encoder, a Transformer Processor that estimates derivatives up to order $n$, and a spatiotemporal decoder that outputs the derivatives and $\tilde{r}_t$, with a regularization term to prevent degenerate radii. Empirically, TANTE achieves state-of-the-art predictive accuracy and efficiency across four challenging PDE benchmarks, demonstrates robust scalability with model size and expansion order, and reveals meaningful adaptivity of the convergence radius across system parameters and trajectories, reducing error accumulation and enabling more efficient simulations. This framework offers a practical, scalable path toward adaptive surrogate models for complex, multi-scale dynamical systems.
Abstract
Operator learning for time-dependent partial differential equations (PDEs) has seen rapid progress in recent years, enabling efficient approximation of complex spatiotemporal dynamics. However, most existing methods rely on fixed time step sizes during rollout, which limits their ability to adapt to varying temporal complexity and often leads to error accumulation. Here, we propose the Time-Adaptive Transformer with Neural Taylor Expansion (TANTE), a novel operator-learning framework that produces continuous-time predictions with adaptive step sizes. TANTE predicts future states by performing a Taylor expansion at the current state, where neural networks learn both the higher-order temporal derivatives and the local radius of convergence. This allows the model to dynamically adjust its rollout based on the local behavior of the solution, thereby reducing cumulative error and improving computational efficiency. We demonstrate the effectiveness of TANTE across a wide range of PDE benchmarks, where it achieves superior accuracy and adaptability compared to fixed-step baselines, with accuracy gains of 60-80% and inference-time speed-ups of 30-40%. The code is publicly available at https://github.com/zwu88/TANTE for transparency and reproducibility.
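
To make the rollout idea concrete, the following is a minimal sketch of a Taylor-expansion rollout with a state-dependent step size. It is not the TANTE implementation: the function names (`taylor_step`, `toy_model`, `adaptive_rollout`) are hypothetical, and the neural network is replaced by a toy stand-in that returns the exact derivatives of $du/dt=-\lambda u$ together with a hand-written radius rule. Stepping all the way to the predicted radius at each iteration is also an assumption made for illustration.

```python
import math
import numpy as np


def taylor_step(u0, derivs, t):
    """Evaluate u0 + sum_{k=1}^{n} derivs[k-1] * t**k / k!."""
    out = np.array(u0, dtype=float)
    for k, d in enumerate(derivs, start=1):
        out = out + d * t**k / math.factorial(k)
    return out


def toy_model(u, lam=2.0, order=4):
    """Hypothetical stand-in for the network.

    For du/dt = -lam * u the k-th time derivative is (-lam)**k * u.
    The radius rule below is invented for illustration only: it
    allows larger steps as the state decays.
    """
    derivs = [(-lam) ** k * u for k in range(1, order + 1)]
    radius = 0.5 / (1.0 + np.abs(u).max())
    return derivs, radius


def adaptive_rollout(u0, t_end, model):
    """Advance from t=0 to t_end, stepping by the predicted radius."""
    t, u = 0.0, np.array(u0, dtype=float)
    times, states = [t], [u]
    while t < t_end - 1e-12:
        derivs, radius = model(u)
        dt = min(radius, t_end - t)  # never step past the end time
        u = taylor_step(u, derivs, dt)
        t += dt
        times.append(t)
        states.append(u)
    return np.array(times), np.stack(states)


if __name__ == "__main__":
    u0 = np.array([1.0, 0.5])
    times, states = adaptive_rollout(u0, 1.0, toy_model)
    print("step sizes:", np.diff(times))
    print("final state:", states[-1])
    print("exact:      ", u0 * np.exp(-2.0))
```

In this toy setting the step sizes grow as the solution flattens, which mirrors the intended behavior of a learned radius. Within each step the expansion can be evaluated at any $t\in[0,\tilde{r}_t]$, so intermediate times can be queried without extra model calls.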
