Distillation and Interpretability of Ensemble Forecasts of ENSO Phase using Entropic Learning

Michael Groom, Davide Bassetti, Illia Horenko, Terence J. O'Kane

TL;DR

This work addresses the interpretability gap in state-of-the-art ENSO forecasts produced by ensembles of entropy-optimal eSPA models trained on observational/reanalysis data. By distilling successful ensemble members into lead-time-specific superclusters and associated affiliation/transition structures, the authors retain probabilistic skill while enabling rigorous diagnostics, including spatial importance maps and reconstructed precursor pathways. The distilled models achieve a ranked probability skill score (RPSS) comparable to or better than baselines at long lead times, reveal physically consistent precursors, and trace canonical pathways from extratropical/remote signals to mature ENSO states. The approach provides a practical, diagnostically rich framework that complements real-time forecasts and deepens understanding of long-range ENSO predictability, with potential extensions to operational settings and other fields.
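
For readers unfamiliar with RPSS, the following minimal sketch shows the standard definition for ordered categorical forecasts. It is not the authors' evaluation code. The function names, the three-phase ordering (La Niña, neutral, El Niño) and the equal-thirds climatological reference are illustrative assumptions; the baselines used in the paper may differ.

```python
import numpy as np

def rps(probs, obs_idx):
    """Ranked probability score for each forecast.

    probs   : (T, K) forecast probabilities over K ordered categories
    obs_idx : (T,) integer index of the observed category
    """
    probs = np.asarray(probs, dtype=float)
    obs_idx = np.asarray(obs_idx)
    onehot = np.eye(probs.shape[1])[obs_idx]   # observed category as 0/1 vector
    cum_f = np.cumsum(probs, axis=1)           # cumulative forecast distribution
    cum_o = np.cumsum(onehot, axis=1)          # cumulative observed distribution
    return np.sum((cum_f - cum_o) ** 2, axis=1)

def rpss(probs, obs_idx, ref_probs):
    """Skill relative to a reference forecast: 1 is perfect, 0 matches the reference."""
    ref = np.broadcast_to(ref_probs, np.shape(probs))
    return 1.0 - rps(probs, obs_idx).mean() / rps(ref, obs_idx).mean()

# Hypothetical example: categories 0 = La Niña, 1 = neutral, 2 = El Niño (assumed ordering)
obs = [0, 1, 2]
clim = [1 / 3, 1 / 3, 1 / 3]                   # assumed climatological reference
perfect = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
skill_perfect = rpss(perfect, obs, clim)
skill_clim = rpss([clim, clim, clim], obs, clim)
```

Because the score works on cumulative distributions, a forecast that puts its mass on an adjacent phase is penalised less than one that puts it on the opposite phase, which is why RPSS suits ordered categories such as ENSO phase.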

Abstract

This paper introduces a distillation framework for an ensemble of entropy-optimal Sparse Probabilistic Approximation (eSPA) models, trained exclusively on satellite-era observational and reanalysis data to predict ENSO phase up to 24 months in advance. While eSPA ensembles yield state-of-the-art forecast skill, they are harder to interpret than individual eSPA models. We show how to compress the ensemble into a compact set of "distilled" models by aggregating the structure of only those ensemble members that make correct predictions. This process yields a single, diagnostically tractable model for each forecast lead time that preserves forecast performance while also enabling diagnostics that are impractical to implement on the full ensemble. An analysis of the regime persistence of the distilled model "superclusters", as well as cross-lead clustering consistency, shows that the discretised system accurately captures the spatiotemporal dynamics of ENSO. By considering the effective dimension of the feature importance vectors, the complexity of the input space required for correct ENSO phase prediction is shown to peak when forecasts must cross the boreal spring predictability barrier. Spatial importance maps derived from the feature importance vectors are introduced to identify where predictive information resides in each field and are shown to include known physical precursors at certain lead times. Case studies of key events are also presented, showing how fields reconstructed from distilled model centroids trace the evolution from extratropical and inter-basin precursors to the mature ENSO state. Overall, the distillation framework enables a rigorous investigation of long-range ENSO predictability that complements real-time data-driven operational forecasts.

Paper Structure (17 sections, 1 theorem, 35 equations, 15 figures)

Key Result

Theorem 1

Assume that the distribution of $X_d$ is continuous and symmetric about some $\mu\in\mathbb{R}$ (i.e. $X_d-\mu \overset{d}{=} -(X_d-\mu)$, with equality in distribution). Then

Figures (15)

  • Figure 1: Schematic of the distillation procedure. Starting with an ensemble of eSPA models (a; see Figure 2 of Groom2025 for a complete description), a set of superclusters is fit to the centroids of each eSPA model (b), and the aggregated feature importance is calculated (c). Then, fuzzy affiliations over these superclusters are calculated (d), along with a matrix of conditional probabilities based on these fuzzy affiliations (e). The legend in the bottom left contains the key equations involved in each of these steps. The dashed lines in (b) indicate the decision boundaries.
  • Figure 2: The affiliation probabilities $\Gamma^{(n)}$ vs. target date for a lead time of $n=3$ months over the evaluation period of January 2002 to December 2024 (a), and the 3-month running average of the Niño3.4 index over this same period (b). The background shading in (b) corresponds to target dates where the Niño3.4 index is $\ge 0.5\,^\circ$C (red) or $\le -0.5\,^\circ$C (blue).
  • Figure 3: Visualisations of the lag-0 composites of SST, $\mathrm{d}T/\mathrm{d}z$ and wind stress corresponding to superclusters 12, 6, 5, 3 and 1 at $n=3$ months lead time, as well as the projection of the centroids for these clusters onto the 3 most important feature dimensions at this lead time. The projection of the features $X$ onto these 3 dimensions is also shown, with each point coloured by its corresponding label, i.e. the phase of the Niño3.4 index in 3 months' time.
  • Figure 4: The transition probability matrix $P^{(n)}$ for a lead time of $n=3$ months, including 95% confidence interval estimates (a), and a visualisation of the matrix as a directed cyclic graph (b), where only the probabilities $P^{(n)}_{ij} \ge 0.05$ are shown as edges. The nodes are coloured by the expected value of the conditional probabilities $\Lambda^{(n)}_{:,k}$ for each cluster, and the edges are coloured (and sized) by the transition probabilities $P^{(n)}_{ij}$.
  • Figure 5: (a) The Bayesian network $\mathcal{G}$ of lead $n \rightarrow n-1$ transition probabilities. (b) SST composites for each cluster on the most probable path from node 9 ($n=24$ months) to node 3 ($n=0$ months), highlighted in red.
  • ...and 10 more figures
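
As a rough illustration of how a transition probability matrix such as $P^{(n)}$ in Figure 4 could be estimated from fuzzy affiliations such as $\Gamma^{(n)}$ in Figure 2, the sketch below accumulates soft transition counts between consecutive time steps and row-normalises them. This estimator is an assumption made for illustration, not the paper's formula. The function name `soft_transition_matrix` and the array layout (clusters by time) are hypothetical, and the confidence intervals shown in Figure 4 are not reproduced.

```python
import numpy as np

def soft_transition_matrix(gamma):
    """Estimate P[i, j] ~ Prob(cluster j at t+1 | cluster i at t).

    gamma : (K, T) array of affiliation probabilities; each column sums to 1.
    """
    gamma = np.asarray(gamma, dtype=float)
    # Expected number of i -> j transitions, summed over consecutive time steps.
    counts = gamma[:, :-1] @ gamma[:, 1:].T
    row_tot = counts.sum(axis=1, keepdims=True)
    # Row-normalise; clusters that are never occupied keep an all-zero row.
    return np.divide(counts, row_tot, out=np.zeros_like(counts), where=row_tot > 0)

# Hypothetical example with hard (0/1) affiliations over two clusters.
seq = [0, 0, 1, 1, 0]
gamma_demo = np.eye(2)[:, seq]          # shape (2, 5), one-hot columns
P_demo = soft_transition_matrix(gamma_demo)
```

With hard affiliations this reduces to ordinary transition counting. With fuzzy affiliations each time step contributes fractionally to several entries, and every row with nonzero mass still sums to 1.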

Theorems & Definitions (3)

  • Theorem 1: Large-$\varepsilon$ limit of \ref{eq:deff-threshold}
  • proof
  • Remark 1