ChaosNexus: A Foundation Model for ODE-based Chaotic System Forecasting with Hierarchical Multi-scale Awareness

Chang Liu, Bohao Zhao, Jingtao Ding, Yong Li

TL;DR

ChaosNexus is a foundation model for chaotic ODE forecasting that explicitly addresses multi-scale temporal structure and cross-system spectral heterogeneity. It combines a ScaleFormer backbone with Mixture-of-Experts (MoE) blocks and a wavelet-based frequency fingerprint, learns dynamical priors from a large synthetic chaotic corpus, and reports state-of-the-art zero-shot performance, including a mean error below 1°C for 5-day station-based weather forecasting. The approach maintains short-term point accuracy while preserving long-term attractor statistics through an MMD-based distributional loss and MoE load-balancing, and ablations confirm the contribution of each component. The work demonstrates cross-domain generalization and is relevant to scientific forecasting where data are sparse or expensive to obtain.
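
To make the MMD-based distributional loss concrete, here is a minimal sketch of a squared Maximum Mean Discrepancy between two sets of state vectors, for example points sampled from a predicted and a reference trajectory. The Gaussian kernel, the single `bandwidth` parameter, the biased estimator, and the function names are all assumptions made for illustration; this summary does not specify the kernel or estimator used in the paper.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length state vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between point sets xs and ys.

    Assumption: a single fixed-bandwidth Gaussian kernel; the paper's actual
    kernel choice is not stated in this summary.
    """
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical toy data: a small point cloud and a shifted copy of it.
reference = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
shifted = [(x + 3.0, y + 3.0) for x, y in reference]

same = mmd_squared(reference, reference)      # identical sets: essentially zero
different = mmd_squared(reference, shifted)   # displaced set: clearly positive
```

Because the statistic compares distributions of visited states rather than time-aligned points, it can stay small for a forecast that has drifted out of phase but still traces the correct attractor, which is the property a long-horizon chaotic forecast needs.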

Abstract

Foundation models have shown great promise in achieving zero-shot or few-shot forecasting for ODE-based chaotic systems via large-scale pretraining. However, existing architectures often fail to capture the multi-scale temporal structures and distinct spectral characteristics of chaotic dynamics. To address this, we introduce ChaosNexus, a foundation model for chaotic system forecasting underpinned by the proposed ScaleFormer architecture. By processing temporal contexts across hierarchically varying patch sizes, ChaosNexus effectively captures long-range dependencies and preserves high-frequency fluctuations. To address heterogeneity across distinct systems, we integrate Mixture-of-Experts (MoE) layers into each ScaleFormer block and explicitly condition the final forecasts on a learned frequency fingerprint, providing the model with a global spectral view of the system. Extensive evaluations on over 9,000 synthetic systems demonstrate that ChaosNexus achieves superior fidelity in long-term attractor statistics while maintaining competitive point-wise accuracy. Furthermore, in real-world applications, it achieves a remarkable zero-shot mean error below 1°C for 5-day station-based weather forecasting. Code is available at https://github.com/TomXaxaxa/ChaosNexus.
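
The idea of processing a context "across hierarchically varying patch sizes" can be illustrated with a small sketch. It splits a series into fine patches and then repeatedly merges adjacent pairs, so each level covers a longer time span per token. The merge factor of 2, the non-overlapping patches, and the helper names `patchify`, `merge_patches`, and `build_hierarchy` are hypothetical choices for illustration, not the ScaleFormer implementation, which operates on learned embeddings rather than raw values.

```python
def patchify(series, patch_size):
    """Split a 1-D series into non-overlapping patches, dropping any remainder."""
    num_patches = len(series) // patch_size
    return [series[i * patch_size:(i + 1) * patch_size] for i in range(num_patches)]


def merge_patches(patches):
    """Concatenate each pair of adjacent patches (assumed merge factor of 2)."""
    return [patches[i] + patches[i + 1] for i in range(0, len(patches) - 1, 2)]


def build_hierarchy(series, base_patch_size, num_levels):
    """Return one patch list per level, from finest to coarsest."""
    levels = [patchify(series, base_patch_size)]
    for _ in range(num_levels - 1):
        levels.append(merge_patches(levels[-1]))
    return levels


# Hypothetical context window of 16 samples, base patch size 2, three levels.
context = list(range(16))
levels = build_hierarchy(context, base_patch_size=2, num_levels=3)
shapes = [(len(level), len(level[0])) for level in levels]
```

Here `shapes` lists the number of patches and the patch length at each level: the finest level keeps short patches that retain high-frequency detail, while the coarsest level has few long patches that expose long-range structure.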

Paper Structure

This paper contains 60 sections, 23 equations, 35 figures, 10 tables.

Figures (35)

  • Figure 1: Motivating observations. (a) Spectral entropy distributions of synthetic chaotic systems (Lai et al., 2025) versus general time series (Liu et al., 2023), including Electricity, ETT, and Exchange Rate. (b-c) Power spectra of representative Lorenz-63 and Lorenz-96 systems.
  • Figure 2: Overview of our ChaosNexus framework, with details of patch merging and expansion operations, and the Transformer block architecture with mixture-of-experts layers.
  • Figure 3: Zero-shot forecasting performance of models on synthetic chaotic systems. Each box shows the median (center line), the middle 50% of results (box), and the overall range (whiskers). The inset plot shows the mean performance with the 95% CI of ChaosNexus and Panda. Asterisks indicate statistically significant differences determined by the Wilcoxon signed-rank test (*: $p<0.05$, **: $p<0.01$).
  • Figure 4: Results on Lorenz-96 systems with different spectral entropy. We vary parameter $F$ to modulate spectral entropy. The panels display the mean performance with the 95% CI.
  • Figure 5: Forecasting performance for global temperature on the WEATHER-5K dataset. The Mean Absolute Error (MAE) of ChaosNexus and baseline models is compared across multiple prediction horizons after fine-tuning on 85K (0.1%) and 473K (0.5%) samples. The zero-shot performance of ChaosNexus is shown as a dashed line for reference.
  • ...and 30 more figures
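
Figures 1 and 4 describe systems by their spectral entropy. As a rough illustration, the sketch below uses one common definition: the Shannon entropy of the normalized power spectrum, scaled to lie between 0 and 1. The direct DFT, the mean removal, the normalization, and the function names are assumptions chosen for clarity; the paper's exact definition is not given in this summary.

```python
import cmath
import math


def power_spectrum(signal):
    """Power at positive DFT frequencies (k = 1 .. n // 2) after mean removal."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_entropy(signal):
    """Shannon entropy of the normalized power spectrum, scaled to [0, 1]."""
    power = power_spectrum(signal)
    total = sum(power)
    if len(power) < 2 or total == 0.0:
        return 0.0  # too short or constant signal: no spectral spread to measure
    probs = [p / total for p in power]
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(len(probs))


n = 64
# A pure tone concentrates power in one frequency bin (low entropy).
tone = [math.sin(2.0 * math.pi * 4 * t / n) for t in range(n)]
# A unit impulse spreads power evenly over all bins (maximal entropy).
impulse = [1.0] + [0.0] * (n - 1)

tone_entropy = spectral_entropy(tone)
impulse_entropy = spectral_entropy(impulse)
```

Under this definition a narrow-band signal scores near 0 and a broadband one near 1, which gives an intuition for why broadband chaotic systems and typical general-purpose time series can occupy different ranges in Figure 1(a).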