TeleViT1.0: Teleconnection-aware Vision Transformers for Subseasonal to Seasonal Wildfire Pattern Forecasts

Ioannis Prapas, Nikolaos Papadopoulos, Nikolaos-Ioannis Bountos, Dimitrios Michail, Gustau Camps-Valls, Ioannis Papoutsis

TL;DR

This work tackles subseasonal-to-seasonal wildfire forecasting by introducing TeleViT, a teleconnection-aware Vision Transformer that fuses fine-grained local drivers with coarsened global fields and teleconnection indices. The asymmetric tokenization enables joint processing of heterogeneous inputs, with a per-local-patch decoder preserving spatial structure. TeleViT demonstrates superior AUPRC across lead times up to four months compared to U-Net++, ViT, and climatology, and analyses show local information dominates while global cues provide contextual support. Regional analyses reveal strongest performance in seasonally consistent fire regimes, and attention/attribution studies enhance interpretability of input contributions. Overall, explicit modeling of large-scale Earth-system context extends wildfire predictability on S2S timescales and opens avenues for further temporal tokenization and causal-attention approaches.

Abstract

Forecasting wildfires weeks to months in advance is difficult, yet crucial for planning fuel treatments and allocating resources. While short-term predictions typically rely on local weather conditions, long-term forecasting requires accounting for the Earth's interconnectedness, including global patterns and teleconnections. We introduce TeleViT, a Teleconnection-aware Vision Transformer that integrates (i) fine-scale local fire drivers, (ii) coarsened global fields, and (iii) teleconnection indices. This multi-scale fusion is achieved through an asymmetric tokenization strategy that produces heterogeneous tokens processed jointly by a transformer encoder, followed by a decoder that preserves spatial structure by mapping local tokens to their corresponding prediction patches. Using the global SeasFire dataset (2001-2021, 8-day resolution), TeleViT achieves higher AUPRC than U-Net++, ViT, and climatology across all lead times, including horizons up to four months. At zero lead, TeleViT with indices and global inputs reaches an AUPRC of 0.630 (ViT 0.617, U-Net 0.620). At a lead of 16 × 8 days (around four months), TeleViT variants using global input maintain 0.601-0.603 (ViT 0.582, U-Net 0.578). TeleViT surpasses the climatology baseline (0.572) at all lead times. Regional results show the highest skill in seasonally consistent fire regimes, such as African savannas, and lower skill in boreal and arid regions. Attention and attribution analyses indicate that predictions rely mainly on local tokens, with global fields and indices contributing coarse contextual information. These findings suggest that architectures explicitly encoding large-scale Earth-system context can extend wildfire predictability on subseasonal-to-seasonal timescales.
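
The asymmetric tokenization and per-local-patch decoding described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the input shapes, patch sizes, embedding dimension, and the function names `patchify`, `tokenize`, and `decode_local` are all assumptions chosen for illustration, random matrices stand in for learned projections, and the transformer encoder between tokenization and decoding is omitted.

```python
import numpy as np

def patchify(x, p):
    """Split a (C, H, W) array into (N, C*p*p) flattened non-overlapping patches."""
    c, h, w = x.shape
    assert h % p == 0 and w % p == 0, "patch size must divide the grid"
    x = x.reshape(c, h // p, p, w // p, p)
    x = x.transpose(1, 3, 0, 2, 4)  # (rows, cols, C, p, p)
    return x.reshape((h // p) * (w // p), c * p * p)

def tokenize(local, global_, indices, dim, p_local, p_global, rng):
    """Embed three heterogeneous inputs into one token sequence of width `dim`."""
    def project(flat):
        # Random weights stand in for a learned linear embedding (assumption).
        w = rng.standard_normal((flat.shape[1], dim)) / np.sqrt(flat.shape[1])
        return flat @ w
    local_tok = project(patchify(local, p_local))     # small patches, fine detail
    global_tok = project(patchify(global_, p_global))  # large patches, coarse context
    index_tok = project(indices)                       # one token per index time series
    tokens = np.concatenate([local_tok, global_tok, index_tok], axis=0)
    return tokens, len(local_tok)

def decode_local(tokens, n_local, p, grid_hw, n_classes, rng):
    """Map each local token back to its own p x p prediction patch."""
    gh, gw = grid_hw
    w = rng.standard_normal((tokens.shape[1], n_classes * p * p))
    out = tokens[:n_local] @ w  # global and index tokens are not decoded
    out = out.reshape(gh, gw, n_classes, p, p).transpose(2, 0, 3, 1, 4)
    return out.reshape(n_classes, gh * p, gw * p)

# Hypothetical sizes: 10 variables, an 80x80 local window, a 64x128 global grid,
# and 10 teleconnection indices with 10 time steps each.
rng = np.random.default_rng(0)
local = rng.standard_normal((10, 80, 80))
global_ = rng.standard_normal((10, 64, 128))
indices = rng.standard_normal((10, 10))

tokens, n_local = tokenize(local, global_, indices, dim=64,
                           p_local=16, p_global=32, rng=rng)
logits = decode_local(tokens, n_local, p=16, grid_hw=(5, 5), n_classes=2, rng=rng)
print(tokens.shape, logits.shape)
```

With these assumed sizes the sequence holds 25 local, 8 global, and 10 index tokens. Only the first 25 are decoded, which is how the output keeps the spatial layout of the local input.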

Paper Structure

This paper contains 20 sections, 4 equations, 9 figures, 2 tables.

Figures (9)

  • Figure 1: Architecture of the TeleViT model. The model processes three types of inputs (local, global, and indices) through an asymmetric tokenization strategy, followed by a transformer encoder and linear decoder to produce predictions.
  • Figure 2: Performance comparison across different input configurations and lead times. The plot shows how different combinations of input types affect model performance, with TeleViT$_{i,g}$ consistently outperforming other variants across all forecasting horizons.
  • Figure 3: Sample global predictions (week 0 and week 25) of the test year versus the target for the best model, predicting at lead forecasting horizons of 0 and 16 $\times$ 8-day steps. Values below 0.01 are masked. Confidence is the softmax score of the positive class. The bottom panels show the distributions of the positive-class prediction scores for the two sample global prediction maps.
  • Figure 4: AUPRC scores of the TeleViT$_{i,g}$ model for all GFED regions and lead times. The performance of the Climatology baseline is shown as a red dashed horizontal line.
  • Figure 5: Bar charts of mean attention weights with standard deviations (error bars) for each token type (local, global, indices) across prediction patches around the world. The left chart shows results for a forecasting horizon of $h = 0 \times 8$ days, while the right chart shows results for $h = 16 \times 8$ days. The bottom map indicates the geographical locations of the selected patches.
  • ...and 4 more figures
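
The Figure 3 caption defines confidence as the softmax score of the positive class and masks values below 0.01. A minimal sketch of that post-processing, assuming two-class logits stored as a `(2, H, W)` array with the positive class at index 1 (the layout and the function names are assumptions, not taken from the paper):

```python
import numpy as np

def positive_confidence(logits):
    """Softmax over the class axis of (2, H, W) logits; returns the class-1 score."""
    shifted = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    e = np.exp(shifted)
    return e[1] / e.sum(axis=0)

def mask_low(conf, threshold=0.01):
    """Mask scores below the plotting threshold quoted in the Figure 3 caption."""
    return np.ma.masked_less(conf, threshold)

# Two hypothetical pixels: one confidently negative, one confidently positive.
logits = np.array([[[0.0, 0.0]],
                   [[-10.0, 2.0]]])
conf = positive_confidence(logits)
masked = mask_low(conf)
print(masked)
```

The first pixel falls below the threshold and is masked, while the second is kept.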