SIFM: A Foundation Model for Multi-granularity Arctic Sea Ice Forecasting

Jingyi Xu, Yeqi Luo, Weidong Yang, Keyi Liu, Shengnan Wang, Ben Fei, Lei Bai

TL;DR

This study proposes to exploit the temporal multi-granularity that is naturally derived from Arctic sea ice reanalysis data, and provides a unified perspective for modeling sea ice concentration (SIC) via the Sea Ice Foundation Model (SIFM). It shows that SIFM outperforms off-the-shelf deep learning models that are each designed for a specific temporal granularity.

Abstract

Arctic sea ice plays a vital role in the global climate and has a major impact on both polar ecosystems and coastal communities. In the last few years, multiple deep-learning-based pan-Arctic sea ice concentration (SIC) forecasting methods have emerged and shown superior performance over physics-based dynamical models. However, previous methods forecast SIC at a fixed temporal granularity, e.g., sub-seasonal or seasonal, thus leveraging only intra-granularity information and overlooking the plentiful inter-granularity correlations. SIC at various temporal granularities exhibits cumulative effects and is naturally consistent: short-term fluctuations potentially impact long-term trends, and long-term trends provide effective hints for short-term forecasts of Arctic sea ice. Therefore, in this study, we propose to exploit the temporal multi-granularity that is naturally derived from Arctic sea ice reanalysis data and provide a unified perspective for modeling SIC via our Sea Ice Foundation Model (SIFM). SIFM is carefully designed to leverage both intra-granularity and inter-granularity information to capture granularity-consistent representations that improve forecasting skill. Our extensive experiments show that SIFM outperforms off-the-shelf deep learning models that are each designed for a specific temporal granularity.

Paper Structure

This paper contains 18 sections, 6 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Visualization of Arctic sea ice trends. (a) The annual average SIC and sea ice extent (SIE) trend over the last 35 years (1987-2023); the monthly cyclic trend of SIC (b) and SIE (c). Note that the averaged SIC values are calculated over the entire pan-Arctic region and can therefore only be used to observe the trend.
  • Figure 2: The main differences between (a) existing mainstream SIC forecasting approaches and (b) our SIFM are as follows: (1) Previous models use channel-wise fusion to jointly extract spatial features, e.g., utilizing 2D convolution to expand and downsample SIC channels. In our case, we focus on capturing effective spatial token representations of SIC with the shared spatial encoder. (2) The correlation among the input SIC sequence is implicitly modeled via the U-Net-based architecture in (a), while SIFM explicitly captures intra-granularity and inter-granularity correlations via sequential modeling. (3) We propose leveraging multi-granularity information that is naturally derived from the SIC and embedding it into granularity variates to improve overall forecasting skill.
  • Figure 3: Overview of the proposed SIFM, which comprises three main components: (1) The shared spatial encoder first independently extracts spatial features of the input SIC from each granularity (i.e., 7 days, 8 weekly averages, and 6 monthly averages) to obtain spatial tokens, and then concatenates these spatial tokens accordingly. (2) The embedded spatial tokens are subsequently flattened with respect to their granularity and linearly projected to the same length. We propose to utilize an encoder-only transformer backbone to perform multi-granularity fusion, which explicitly captures both inter-granularity and intra-granularity sequential features. (3) Lastly, the predicted multi-granularity features are restored to the shape of the input via a linear transformation and the shared spatial decoder.
  • Figure 4: Comparison between different backbones for temporal sequence modeling: (a) Our proposed SIFM sequentially concatenates independent SIC tokens that are derived from each temporal scale as a granularity variate and applies an attention mechanism over the embedded variate tokens. The FFN transforms the variate representation for the input of the next layer; (b) The vanilla Transformer architecture (Vaswani et al., 2017) applies an attention mechanism over temporal tokens, and the FFN is applied to multivariate representations; (c) The MLP-Mixer approach (Tolstikhin et al., 2021) first performs token-wise mixing, then transposes the extracted features to apply channel-wise mixing. The vanilla Transformer and MLP-Mixer both fall short of modeling the sequential information of sea ice.
  • Figure 5: Qualitative analysis of SIE prediction. The derived SIE ground truth and the predictions generated by SIFM and three single-granularity models (one for each temporal granularity) over: (a) The first week of September; (b) 4 weeks; (c) 1 month. Despite the abnormal increase of Arctic sea ice in 2022, our proposed method still produces reasonable forecasts that preserve the overall shape of Arctic SIE.
  • ...and 2 more figures
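
The multi-granularity input described in the Figure 3 caption (7 days, 8 weekly averages, 6 monthly averages) can be illustrated with a minimal sketch of how such inputs might be derived from a single daily SIC history. This is not the authors' preprocessing code. The function names `block_mean` and `build_granularities`, the fixed 7-day week and 30-day month, and the choice to end all three windows on the same most recent day are assumptions made only for illustration; the paper may use calendar months or a different alignment.

```python
from statistics import fmean


def block_mean(maps):
    """Average equally sized, flattened SIC maps cell by cell."""
    return [fmean(cells) for cells in zip(*maps)]


def build_granularities(daily, n_days=7, n_weeks=8, n_months=6,
                        week_len=7, month_len=30):
    """Derive three temporal granularities from one daily SIC history.

    daily: list of flattened SIC maps (lists of floats), oldest first.
    Returns (days, weeks, months), each ordered oldest first.
    Assumption: fixed-length weeks and months, all windows ending
    on the last available day.
    """
    need = max(n_days, n_weeks * week_len, n_months * month_len)
    if len(daily) < need:
        raise ValueError(f"need at least {need} daily maps, got {len(daily)}")

    def averaged(n_blocks, block_len):
        # Take the most recent n_blocks * block_len days and average
        # each consecutive, non-overlapping block into one map.
        tail = daily[-n_blocks * block_len:]
        return [block_mean(tail[i:i + block_len])
                for i in range(0, len(tail), block_len)]

    days = daily[-n_days:]                 # finest granularity, unchanged
    weeks = averaged(n_weeks, week_len)    # weekly averages
    months = averaged(n_months, month_len)  # monthly averages
    return days, weeks, months


if __name__ == "__main__":
    # Toy history: 180 days on a 4-cell grid, with synthetic values.
    history = [[d / 180.0] * 4 for d in range(180)]
    days, weeks, months = build_granularities(history)
    print(len(days), len(weeks), len(months))
```

Because the weekly and monthly maps are averages of the same daily maps, the three sequences are consistent with each other by construction, which is the property the captions refer to as multi-granularity information naturally derived from the SIC. In practice an array library would replace the nested lists.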