Self-supervised Spatial-Temporal Learner for Precipitation Nowcasting

Haotian Li, Arno Siebes, Siamak Mehrkanoon

TL;DR

This work addresses precipitation nowcasting, the prediction of local precipitation within a 6-hour time frame, by introducing SpaT-SparK, a self-supervised spatial-temporal learning framework. It combines a CNN-based encoder-decoder pretrained with masked image modeling (MIM) with a translation network that models the temporal relationships between past and future precipitation maps. Evaluated on the NL-50 dataset from the Netherlands, the method outperforms supervised baselines such as SmaAt-UNet, with improved precision and fewer false alarms, which underscores the benefit of self-supervised representation learning for weather forecasting. The findings suggest that data-efficient self-supervised (SSL) pretraining combined with a temporal translation mechanism can improve nowcasting performance, with practical implications for flood control, water management, and urban drainage planning.

Abstract

Nowcasting, the short-term prediction of weather, is essential for making timely and weather-dependent decisions. Specifically, precipitation nowcasting aims to predict precipitation at a local level within a 6-hour time frame. This task can be framed as a spatial-temporal sequence forecasting problem, where deep learning methods have been particularly effective. However, despite advancements in self-supervised learning, most successful methods for nowcasting remain fully supervised. Self-supervised learning is advantageous for pretraining models to learn representations without requiring extensive labeled data. In this work, we leverage the benefits of self-supervised learning and integrate it with spatial-temporal learning to develop a novel model, SpaT-SparK. SpaT-SparK comprises a CNN-based encoder-decoder structure pretrained with a masked image modeling (MIM) task and a translation network that captures temporal relationships among past and future precipitation maps in downstream tasks. We conducted experiments on the NL-50 dataset to evaluate the performance of SpaT-SparK. The results demonstrate that SpaT-SparK outperforms existing baseline supervised models, such as SmaAt-UNet, providing more accurate nowcasting predictions.
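
To make the masked image modeling (MIM) pretraining idea concrete, the following is a minimal NumPy sketch of patch-wise masking and a reconstruction loss restricted to the hidden region. It is an illustration under stated assumptions, not the authors' implementation: the function names, the patch size, the mask ratio, and the choice of mean squared error are all hypothetical, since this summary does not specify them.

```python
import numpy as np


def random_patch_mask(height, width, patch=16, mask_ratio=0.6, seed=0):
    """Return a boolean (height, width) mask; True marks pixels hidden from the encoder.

    patch and mask_ratio are illustrative assumptions, not values from the paper.
    """
    if height % patch or width % patch:
        raise ValueError("height and width must be multiples of patch")
    rows, cols = height // patch, width // patch
    n_patches = rows * cols
    n_masked = int(round(mask_ratio * n_patches))

    # Choose which patches to hide, without replacement.
    rng = np.random.default_rng(seed)
    flat = np.zeros(n_patches, dtype=bool)
    flat[rng.choice(n_patches, size=n_masked, replace=False)] = True
    grid = flat.reshape(rows, cols)

    # Expand the patch grid to pixel resolution.
    return np.repeat(np.repeat(grid, patch, axis=0), patch, axis=1)


def masked_reconstruction_loss(target, prediction, mask):
    """Mean squared error computed only over masked pixels.

    target, prediction: arrays of shape (T, H, W), one channel per time step.
    mask: boolean array of shape (H, W), shared across the T maps.
    """
    squared_error = (prediction - target) ** 2
    return float(squared_error[:, mask].mean())


# Hypothetical usage on a stack of 12 precipitation maps of size 64 x 64.
maps = np.random.default_rng(1).random((12, 64, 64))
mask = random_patch_mask(64, 64, patch=16, mask_ratio=0.5)
visible_input = np.where(mask, 0.0, maps)  # masked pixels are zeroed out
loss = masked_reconstruction_loss(maps, visible_input, mask)
```

In an actual pretraining loop, `visible_input` would be passed through the encoder-decoder and the loss would compare the decoder output with `maps`; here the masked input itself stands in for a prediction so that the snippet runs without a model.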

Paper Structure

This paper contains 10 sections, 11 equations, 2 figures, 3 tables.

Figures (2)

  • Figure 1: Schematic of the SpaT-SparK model in pretraining and fine-tuning modes. (a) In pretraining, only the encoder, the decoder, and the densifying network are trained. (b) Architecture of the encoder (ResNet) blocks and decoder (UNet) blocks. (c) Illustration of the densifying network and successive projection. (d) In fine-tuning, the encoder, the decoder, and the densifying network are initialized with pretrained weights, and the translation network is trained from scratch. The 4th hierarchy is adapted for clearer visualization; the output visualizations in (a) and (d) are for illustration only and do not reflect actual predictions.
  • Figure 2: Performance of the models at each time step. SparK and SpaT-SparK both use ResNet-18 as the encoder. "$\downarrow$" indicates that lower values are better.
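
A per-time-step evaluation of the kind Figure 2 reports could be computed roughly as follows. This is a hedged sketch: the function name `per_step_scores`, the rain/no-rain threshold, and the specific metrics (MSE, precision, and false alarm ratio) are assumptions chosen for illustration, because this summary does not list the paper's exact metrics or threshold.

```python
import numpy as np


def per_step_scores(pred, obs, threshold=0.5):
    """Score forecasts separately for each lead time.

    pred, obs: arrays of shape (N, T, H, W) with N samples and T time steps.
    threshold: assumed rain/no-rain cutoff used to binarize both arrays.
    Returns a dict of arrays of length T.
    """
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    reduce_axes = (0, 2, 3)  # average over samples and pixels, keep time

    mse = ((pred - obs) ** 2).mean(axis=reduce_axes)

    pred_rain = pred >= threshold
    obs_rain = obs >= threshold
    true_pos = (pred_rain & obs_rain).sum(axis=reduce_axes)
    false_pos = (pred_rain & ~obs_rain).sum(axis=reduce_axes)
    predicted = true_pos + false_pos

    # Precision = TP / (TP + FP); false alarm ratio = FP / (TP + FP).
    # Both are undefined (NaN) at steps where no rain was predicted.
    safe = np.maximum(predicted, 1)
    precision = np.where(predicted > 0, true_pos / safe, np.nan)
    far = np.where(predicted > 0, false_pos / safe, np.nan)

    return {"mse": mse, "precision": precision, "far": far}


# Hypothetical usage: one sample, two lead times, 2 x 2 maps.
obs = np.array([[[[1, 0], [0, 0]], [[1, 1], [0, 0]]]])
pred = np.array([[[[1, 1], [0, 0]], [[1, 1], [0, 0]]]])
scores = per_step_scores(pred, obs)
```

MSE and the false alarm ratio are both lower-is-better quantities, matching the "$\downarrow$" convention in the caption, while precision is higher-is-better.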

Theorems & Definitions (1)

  • Remark