Earthfarseer: Versatile Spatio-Temporal Dynamical Systems Modeling in One Model

Hao Wu, Yuxuan Liang, Wei Xiong, Zhengyang Zhou, Wei Huang, Shilong Wang, Kun Wang

TL;DR

EarthFarseer tackles spatio-temporal dynamics modeling by addressing local fidelity, long-horizon prediction, scalability, and efficiency through a unified framework. It introduces FoTF for spatial reasoning (Local CNN plus Global Fourier-based Transformer) and TeDev for temporal evolution (multi-scale fully convolutional processing with FFT/IFFT), enabling flexible future-length predictions via a two-stage decoder. The approach achieves state-of-the-art performance across eight diverse datasets, with fast convergence and strong local-detail preservation, and is validated by extensive ablations. This framework offers practical, scalable ST modeling suitable for both natural and social dynamical systems, with broad applicability and efficient deployment.

Abstract

Efficiently modeling spatio-temporal (ST) physical processes and observations is a challenging problem for the deep learning community. Many recent studies have concentrated on meticulously reconciling various advantages, resulting in models that are neither simple nor practical. To address this issue, this paper presents a systematic study of the shortcomings of off-the-shelf models, including lack of local fidelity, poor prediction performance over long time-steps, low scalability, and inefficiency. To address these problems, we propose EarthFarseer, a concise framework that combines parallel local convolutions and global Fourier-based transformer architectures, enabling it to dynamically capture local-global spatial interactions and dependencies. EarthFarseer also incorporates multi-scale fully convolutional and Fourier architectures to capture temporal evolution efficiently and effectively. Our proposal demonstrates strong adaptability across various tasks and datasets, with fast convergence and better local fidelity in long time-step predictions. Extensive experiments and visualizations over eight datasets covering human-society and natural physical systems demonstrate the state-of-the-art performance of EarthFarseer. We release our code at https://github.com/easylearningscores/EarthFarseer.
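
The idea of pairing a local convolutional branch with a global Fourier-domain branch can be illustrated with a small toy example. The sketch below is not the authors' implementation: it works on a 1D periodic signal, uses only the Python standard library, and replaces the learned Fourier-based transformer with a fixed low-pass spectral filter. All function names (`local_branch`, `global_branch`, `local_global_mix`) and parameters (`kernel`, `keep_modes`, `alpha`) are hypothetical and chosen for illustration only.

```python
import cmath


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(X):
    """Naive O(n^2) inverse discrete Fourier transform."""
    n = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def local_branch(x, kernel=(0.25, 0.5, 0.25)):
    """Circular convolution: each output only sees its immediate neighbours."""
    n = len(x)
    r = len(kernel) // 2
    return [
        sum(kernel[j] * x[(i + j - r) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


def global_branch(x, keep_modes=2):
    """Spectral filter: every output depends on every input via the DFT.

    Assumption: a fixed low-pass mask stands in for learned spectral weights.
    Frequencies k and n - k are kept together so the output stays real.
    """
    n = len(x)
    X = dft(x)
    filtered = [X[k] if min(k, n - k) <= keep_modes else 0 for k in range(n)]
    return [v.real for v in idft(filtered)]


def local_global_mix(x, alpha=0.5):
    """Run both branches in parallel on the same input and blend them."""
    loc = local_branch(x)
    glo = global_branch(x)
    return [alpha * l + (1 - alpha) * g for l, g in zip(loc, glo)]


if __name__ == "__main__":
    signal = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 1.5]
    print(local_global_mix(signal))
```

The local branch has a receptive field of three samples, while the global branch mixes information from the whole signal in a single step. That contrast is the motivation the abstract gives for running the two in parallel. A real model would learn both the convolution kernel and the spectral weights, and would operate on 2D fields rather than a 1D list.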

Paper Structure (29 sections, 9 equations, 7 figures, 3 tables)

This paper contains 29 sections, 9 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: A natural phenomenon in which global and local evolution are inconsistent. The hurricane primarily rotates clockwise, while in certain localized areas convection produces counterclockwise rotation.
  • Figure 2: Left. Performance comparison between our model and SOTA models across diverse domains. Middle. Convergence of our model compared to other models across different datasets. Right. Our model demonstrates exceptional capability on long time-step prediction problems.
  • Figure 3: The upper half of the image presents an overview of the model; panels (a), (b), and (c) show the details of the spatial module, temporal module, and decoding module, respectively.
  • Figure 4: Model performance on 2DSWE dataset with different baselines. We measure the time it takes for the model to reach optimal performance by conducting fair executions across all frameworks on a Tesla V100-40GB.
  • Figure 5: Model performance on SEVIR dataset with different number of temporal layers.
  • ...and 2 more figures