Learning dynamical systems with hit-and-run random feature maps
Pinak Mandal, Georg A. Gottwald
TL;DR
This work addresses forecasting of chaotic dynamical systems with tanh-based random feature maps (RFMs) whose internal weights are fixed. It introduces a data-informed hit-and-run initialization that keeps the features in the nonlinear, non-saturated range of the activation function, and combines it with skip connections, deep stacking of several units, and localization to mitigate the curse of dimensionality. The resulting models achieve state-of-the-art forecast skill with much smaller networks than reservoir computing approaches. The authors demonstrate strong single-trajectory forecasts and accurate long-time statistics on Lorenz-63, Lorenz-96, and Kuramoto-Sivashinsky, with depth and localization providing notable gains and scalable training times. The approach is computationally efficient, requires tuning only a single hyperparameter in many cases, and is complemented by open-source code and detailed appendices on algorithms and localization schemes. Together, these findings position deep and localized RFMs as competitive, scalable surrogates for data-driven forecasting of high-dimensional chaotic dynamics, with potential for integration with data assimilation and partial-noise scenarios.
Abstract
We show how random feature maps can be used to forecast dynamical systems with excellent forecasting skill. We consider the tanh activation function and judiciously choose the internal weights in a data-driven manner such that the resulting features explore the nonlinear, non-saturated regions of the activation function. We introduce skip connections and construct a deep variant of random feature maps by combining several units. To mitigate the curse of dimensionality, we introduce localization, where we learn local maps, employing conditional independence. Our modified random feature maps provide excellent forecasting skill for both single-trajectory forecasts and long-time estimates of statistical properties, for a range of chaotic dynamical systems with dimensions up to 512. In contrast to other methods such as reservoir computers, which require extensive hyperparameter tuning, we effectively need to tune only a single hyperparameter, and are able to achieve state-of-the-art forecast skill with much smaller networks.
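The basic ingredients named in the abstract (tanh random features with fixed internal weights, a skip connection, and a linear readout) can be illustrated with a minimal NumPy sketch. Everything below is an assumption made for illustration rather than the authors' implementation. The skip-connection form u_{n+1} = u_n + W_out tanh(W_in u_n + b_in), the ridge-regression readout, the interval bounds L0 and L1, the regularization beta, and all function names are hypothetical choices. In particular, the sketch does not implement hit-and-run sampling: as a simple stand-in it draws Gaussian rows and rescales each one affinely so that its pre-activations over the training data have magnitude between L0 and L1, which mimics the goal of staying in the nonlinear, non-saturated range of tanh.

```python
import numpy as np

def lorenz63(u, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz-63 system."""
    x, y, z = u
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4_trajectory(u0, dt, n_steps):
    """Integrate Lorenz-63 with classical RK4; returns (n_steps + 1, 3)."""
    traj = np.empty((n_steps + 1, 3))
    traj[0] = u0
    for n in range(n_steps):
        u = traj[n]
        k1 = lorenz63(u)
        k2 = lorenz63(u + 0.5 * dt * k1)
        k3 = lorenz63(u + 0.5 * dt * k2)
        k4 = lorenz63(u + dt * k3)
        traj[n + 1] = u + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return traj

def sample_internal_weights(U, n_features, L0, L1, rng):
    """Stand-in for a data-informed initialization (NOT hit-and-run).

    Each Gaussian row is rescaled and shifted so that its pre-activations
    over the data U fill [L0, L1]; a random sign flip then places some
    rows on the negative branch of tanh.
    """
    W = rng.standard_normal((n_features, U.shape[1]))
    Z = U @ W.T                               # (N, n_features)
    zmin, zmax = Z.min(axis=0), Z.max(axis=0)
    scale = (L1 - L0) / (zmax - zmin)
    sign = rng.choice([-1.0, 1.0], size=n_features)
    W_in = W * (sign * scale)[:, None]
    b_in = sign * (L0 - scale * zmin)
    return W_in, b_in

def train_readout(U, W_in, b_in, beta):
    """Ridge regression of the one-step increments on the tanh features."""
    X, Y = U[:-1], U[1:] - U[:-1]             # skip connection: fit increments
    Phi = np.tanh(X @ W_in.T + b_in)          # (N - 1, n_features)
    A = Phi.T @ Phi + beta * np.eye(Phi.shape[1])
    W_out = np.linalg.solve(A, Phi.T @ Y).T   # (D, n_features)
    return W_out, Phi, Y

def step(u, W_in, b_in, W_out):
    """One surrogate step with a skip connection."""
    return u + W_out @ np.tanh(W_in @ u + b_in)

# Illustrative settings (assumed values, not taken from the paper).
rng = np.random.default_rng(0)
L0, L1, n_features, beta = 0.4, 3.5, 100, 1e-6
traj = rk4_trajectory(np.array([1.0, 1.0, 1.0]), dt=0.01, n_steps=2000)[500:]

W_in, b_in = sample_internal_weights(traj, n_features, L0, L1, rng)
W_out, Phi, Y = train_readout(traj, W_in, b_in, beta)
resid = Y - Phi @ W_out.T                     # one-step training residual

# Autonomous forecast: feed the surrogate its own output.
forecast = np.empty((100, 3))
u = traj[-1]
for n in range(100):
    u = step(u, W_in, b_in, W_out)
    forecast[n] = u
```

Only `W_out` is learned here; `W_in` and `b_in` stay fixed after sampling, which is what makes training a single linear solve. In this sketch the features-per-unit count and the regularization are free choices, and a deep or localized variant would repeat the same fit on several units or on local patches of the state.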
