SFTformer: A Spatial-Frequency-Temporal Correlation-Decoupling Transformer for Radar Echo Extrapolation

Liangyu Xu; Wanxuan Lu; Hongfeng Yu; Fanglong Yao; Xian Sun; Kun Fu

TL;DR

SFTformer tackles radar echo extrapolation by decoupling spatial morphology from temporal evolution within a Transformer framework. The SFT-Block combines a Spatiotemporal Correlation Layer, Spatial Refinement Layer, and Temporal Modeling Layer to learn coupled dynamics while separately refining appearance and motion, aided by a Frequency Enhanced Block for periodic patterns. A Prediction-Reconstruction Joint Training paradigm reinforces memory of historical echoes, improving long-term forecasts. Evaluated on HKO-7 and ChinaNorth-2021, SFTformer delivers state-of-the-art performance for 1-, 2-, and 3-hour nowcasts, demonstrating robust handling of both short- and long-term precipitation dynamics. The work offers a practical, memory-efficient pathway for radar-based weather nowcasting with potential extensions to other spatiotemporal forecasting domains.

Abstract

Extrapolating future weather radar echoes from past observations is a complex task vital for precipitation nowcasting. The spatial morphology and temporal evolution of radar echoes are correlated to a degree, yet each also has independent characteristics. Existing methods learn unified spatial and temporal representations in a highly coupled feature space, emphasizing the correlation between spatial and temporal features but neglecting explicit modeling of their independent characteristics, which may cause mutual interference between them. To effectively model the spatiotemporal dynamics of radar echoes, we propose a Spatial-Frequency-Temporal correlation-decoupling Transformer (SFTformer). The model stacks multiple SFT-Blocks that not only mine the correlation of the spatiotemporal dynamics of echo cells but also avoid mutual interference between temporal modeling and spatial morphology refinement by decoupling them. Furthermore, inspired by the way weather forecast experts review historical echo evolution to make accurate predictions, SFTformer incorporates a joint training paradigm for historical echo sequence reconstruction and future echo sequence prediction. Experimental results on the HKO-7 and ChinaNorth-2021 datasets demonstrate the superior performance of SFTformer in short-term (1 h), mid-term (2 h), and long-term (3 h) precipitation nowcasting.
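The joint training paradigm described above pairs a prediction objective with a reconstruction objective. The sketch below illustrates one way such a combined loss could be formed. It is not the authors' implementation: the use of mean squared error, the flattened-list inputs, the function names, and the `recon_weight` parameter are all assumptions made for illustration.

```python
def mean_squared_error(predicted, target):
    """Mean squared error between two equal-length sequences of pixel values."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def joint_loss(pred_future, true_future, recon_past, true_past, recon_weight=0.5):
    """Hypothetical joint objective: prediction error plus weighted reconstruction error.

    recon_weight is an assumed hyperparameter, not a value taken from the paper.
    """
    prediction_term = mean_squared_error(pred_future, true_future)
    reconstruction_term = mean_squared_error(recon_past, true_past)
    return prediction_term + recon_weight * reconstruction_term


# Toy usage with flattened frames: a perfect forecast and an imperfect reconstruction.
example = joint_loss(
    pred_future=[1.0, 2.0],
    true_future=[1.0, 2.0],
    recon_past=[0.0, 0.0],
    true_past=[1.0, 1.0],
    recon_weight=0.5,
)
```

In this toy call only the reconstruction term contributes, which shows how the second objective can still provide a training signal about historical echoes when the forecast itself is already accurate.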

Paper Structure (23 sections, 28 equations, 12 figures, 4 tables)

Introduction
Related Works
Deep Learning-based Radar Echo Extrapolation
Spatiotemporal Predictive Learning
Transformers for Time Series Forecasting and Vision Tasks
Methodology
Preliminary
Overall Framework
SFT-Block
Spatiotemporal Correlation Layer
Spatial Refinement Layer
Temporal Modeling Layer
Prediction-Reconstruction Joint Training Paradigm
Experiments
Dataset Description
...and 8 more sections

Figures (12)

Figure 1: Radar echo sequences observed during four periods from the HKO-7 dataset. The four sequences exhibit distinct spatial morphologies but share a common evolution lifecycle of initiation, maturation, and decay.
Figure 2: The overall framework of SFTformer. The historical echo sequence ${\mathcal{X}}_{t, T}$ is compactly embedded into ${\mathcal{H}}$ through Feature Embedding and then fed into $N$ stacked SFT-Blocks to learn spatiotemporal dynamics. The SFT-Block has a hierarchical correlation-decoupling architecture. The spatiotemporal correlation layer mines global, coarse spatiotemporal correlation. The spatial refinement layer models refined morphological features of the radar echo using a higher channel dimension. The temporal modeling layer learns temporal evolution patterns through temporal interaction and frequency analysis. The features output by the SFT-Blocks, ${\mathcal{H}}^{'}$, pass through the Forecasting module (spatial upsampling) to produce the final prediction $\boldsymbol{\hat{Y}}$. In addition, the reconstruction branch mines the motion pattern of the echo: the even-frame features in ${\mathcal{H}}^{'}$ jointly recover the odd-frame features in ${\mathcal{H}}$ through the Reconstruction module.
Figure 3: Both the spatiotemporal correlation layer and the spatial refinement layer use the Swin Transformer block as the basic unit, but the window partitioning strategies for calculating attention are different: (a) for the ST-Correlation layer and (b) for the spatial refinement layer. (c) shows details of the Swin Block.
Figure 4: Detailed procedure of the temporal modeling layer. The spatiotemporal features are first encoded through the temporal feature embedding shown in (a), and then fed into the temporal modeling layer in (b) to learn the temporal evolution law of the echo. (c) details the structure of the frequency enhanced block.
Figure 5: (a) A precipitation case in the Hong Kong region, with the selected observation point marked by a red circle. (b) Temporal variation of radar echo intensity at the annotated observation point. (c) Frequency domain representation of radar echo intensity at the annotated observation point.
...and 7 more figures
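
The frequency-domain view mentioned in Figure 5(c) can be illustrated with a plain discrete Fourier transform of a single-point intensity series. The sketch below is only an illustration of that idea, not the paper's frequency enhanced block: the synthetic series, its 8-step period, and the function names are assumptions.

```python
import cmath
import math


def dft_magnitudes(series):
    """Return normalized DFT magnitudes for bins 0..n//2 of a mean-removed series."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    magnitudes = []
    for k in range(n // 2 + 1):
        coeff = sum(
            centered[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        magnitudes.append(abs(coeff) / n)
    return magnitudes


def dominant_period(series):
    """Period (in time steps) of the strongest non-constant frequency component."""
    magnitudes = dft_magnitudes(series)
    strongest_bin = max(range(1, len(magnitudes)), key=lambda k: magnitudes[k])
    return len(series) / strongest_bin


# Synthetic echo intensity at one observation point (assumed values, not HKO-7 data):
# a constant background plus an oscillation that repeats every 8 time steps.
synthetic = [30.0 + 10.0 * math.sin(2 * math.pi * t / 8) for t in range(32)]
period = dominant_period(synthetic)
```

A periodic component that is hard to read off the raw time series shows up as a single dominant bin in the spectrum, which is the kind of structure a frequency-domain module can exploit.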
