AttNS: Attention-Inspired Numerical Solving For Limited Data Scenarios
Zhongzhan Huang, Mingfu Liang, Shanshan Zhong, Liang Lin
TL;DR
AttNS introduces an attention-inspired numerical solving framework to address generalization and robustness gaps in AI-Hybrid solvers when data are scarce. By embedding a Lipschitz-attention module into the forward integration of ODEs, informed by ResNet's dynamical-systems view, it yields a data-efficient solver with theoretical convergence guarantees and empirical robustness across high-dimensional and chaotic dynamics. The work provides both additive (AttNS) and multiplicative (AttNS-m) variants, demonstrates favorable generalization with reduced data, and conducts extensive ablations to validate architectural choices and input design. The approach advances data-efficient, stable numerical solving and offers a pathway to extending attention-based improvements to broader PDE contexts and complex dynamical systems.
Abstract
We propose the attention-inspired numerical solver (AttNS), a concise method that mitigates the generalization and robustness issues that AI-Hybrid numerical solvers face when solving differential equations with limited data. AttNS is inspired by the effectiveness of attention modules in Residual Neural Networks (ResNet) in enhancing model generalization and robustness for conventional deep learning tasks. Drawing on the dynamical-system perspective of ResNet, we incorporate attention mechanisms into the design of numerical methods, tailored to the characteristics of solving differential equations. Our results on benchmarks ranging from high-dimensional problems to chaotic systems show that AttNS consistently enhances various numerical solvers without any intricate model crafting. Finally, we analyze AttNS experimentally and theoretically, demonstrating its ability to achieve strong generalization and robustness while ensuring the convergence of the solver. In particular, AttNS requires less data than other advanced methods to reach comparable generalization errors, and it better prevents numerical explosion when solving differential equations.
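
To make the dynamical-system view concrete: a residual block x_{n+1} = x_n + F(x_n) has the same form as one forward Euler step x_{n+1} = x_n + h f(x_n), which is why an attention-style module can be attached to a numerical integrator. The following is a minimal Python sketch of that idea, not the AttNS formulation itself. The abstract does not specify the update rules, so the gate function, its fixed weight w, the h^2 scaling of the correction, and the names additive_step and multiplicative_step are all illustrative assumptions; in a learned solver the gate would be trained from data rather than fixed.

```python
import math


def euler_step(f, x, h):
    """Plain forward Euler step for dx/dt = f(x)."""
    return x + h * f(x)


def gate(z, w=0.5):
    """Hypothetical attention-like gate (assumption, not from the paper).

    tanh is bounded in (-1, 1) and 1-Lipschitz, so the gate cannot
    amplify its input without limit.
    """
    return math.tanh(w * z)


def additive_step(f, x, h, w=0.5):
    """Assumed additive variant: Euler update plus a bounded O(h^2) term."""
    fx = f(x)
    return x + h * fx + h * h * gate(fx, w)


def multiplicative_step(f, x, h, w=0.5):
    """Assumed multiplicative variant: Euler increment rescaled by (1 + h * gate)."""
    fx = f(x)
    return x + h * fx * (1.0 + h * gate(fx, w))


def integrate(step, f, x0, h, n_steps):
    """Apply `step` n_steps times starting from x0 and return the final state."""
    x = x0
    for _ in range(n_steps):
        x = step(f, x, h)
    return x


def decay(x):
    """Test problem dx/dt = -x, whose exact solution is x0 * exp(-t)."""
    return -x


if __name__ == "__main__":
    exact = math.exp(-1.0)
    for name, step in [("euler", euler_step),
                       ("additive", additive_step),
                       ("multiplicative", multiplicative_step)]:
        coarse = integrate(step, decay, 1.0, 0.01, 100)
        fine = integrate(step, decay, 1.0, 0.001, 1000)
        print(name, abs(coarse - exact), abs(fine - exact))
```

Because the gate is bounded and the correction enters at order h^2, both variants in this sketch remain consistent with the ODE, so the error at t = 1 still shrinks as the step size h decreases.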
