Am I Confused or Is This Confusing?: Deep Ensembles for ENSO Uncertainty Quantification
Devin M. McAfee, Elizabeth A. Barnes
TL;DR
The paper addresses uncertainty quantification for ENSO prediction under covariate shift driven by climate change. It trains large deep ensembles of probabilistic networks on CESM2-LE data and separates aleatoric uncertainty (AU) from epistemic uncertainty (EU) across $M$ ensemble members and $K$ classes, with $\text{AU}(\mathbf{x}) = -\frac{1}{M}\sum_{i=1}^{M} \sum_{k=1}^{K} p_{\mathbf{w}_i}(y=k\mid \mathbf{x}) \log p_{\mathbf{w}_i}(y=k\mid \mathbf{x})$ and $\text{EU}(\mathbf{x}) = \frac{1}{M} \sum_{k=1}^{K} \sum_{i=1}^{M} (p_{\mathbf{w}_i}(y=k\mid \mathbf{x}) - p_{\text{ens}}(y=k\mid \mathbf{x}))^2$. The findings show that epistemic uncertainty robustly signals predictive error growth under warming scenarios, whereas aleatoric uncertainty becomes unreliable as the input distribution shifts. The ensemble's improvement over a single model scales with EU and therefore grows with distributional shift, and temperature scaling can correct calibration biases to recover short-lead performance. These results support deep ensembles as a robust, interpretable approach to UQ in climate prediction and highlight the need to account for epistemic uncertainty when forecasting in a nonstationary climate.
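The AU and EU definitions above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the toy probabilities are invented for the example, and $p_{\text{ens}}$ is assumed to be the arithmetic mean of the member predictions, which the summary does not state explicitly.

```python
import math


def ensemble_mean(member_probs):
    """Average class probabilities over the M members (assumed form of p_ens)."""
    m = len(member_probs)
    k = len(member_probs[0])
    return [sum(p[c] for p in member_probs) / m for c in range(k)]


def aleatoric_uncertainty(member_probs):
    """AU: Shannon entropy of each member's prediction, averaged over members."""
    m = len(member_probs)
    total = 0.0
    for p in member_probs:
        # Skip zero probabilities, using the convention 0 * log(0) = 0.
        total += -sum(pk * math.log(pk) for pk in p if pk > 0.0)
    return total / m


def epistemic_uncertainty(member_probs):
    """EU: squared deviation from p_ens, summed over classes, averaged over members."""
    m = len(member_probs)
    p_ens = ensemble_mean(member_probs)
    k = len(p_ens)
    return sum((p[c] - p_ens[c]) ** 2 for p in member_probs for c in range(k)) / m


# Toy two-class cases (illustrative values only).
agree = [[0.5, 0.5], [0.5, 0.5]]      # members agree that the outcome is a coin flip
disagree = [[0.9, 0.1], [0.1, 0.9]]   # members are individually confident but conflict

for name, probs in (("agree", agree), ("disagree", disagree)):
    print(name, aleatoric_uncertainty(probs), epistemic_uncertainty(probs))
```

In the `agree` case every member is maximally unsure, so AU is high and EU is zero. In the `disagree` case each member is confident, so AU is low, but the members contradict one another, so EU is large. Both ensembles have the same mean prediction, which is why the two quantities need to be reported separately.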
Abstract
Faithful uncertainty quantification (UQ) is paramount in high-stakes climate prediction. Deep ensembles, or ensembles of probabilistic neural networks, are state of the art for UQ in machine learning (ML) and are growing increasingly popular for weather and climate prediction. However, detailed analyses of the mechanisms, strengths, and limitations of ensembles in these complex problem settings are lacking. We take a step towards filling this gap by deploying deep ensembles for predictability analysis of the El Niño-Southern Oscillation (ENSO) in the Community Earth System Model 2 Large Ensemble (CESM2-LE). Principally, we show that epistemic uncertainty, modeled by ensemble disagreement, robustly signals predictive error growth associated with shifts in the distributions of monthly sea-surface temperature (SST), ocean heat content (OHC), and zonal surface wind stress ($\tau_x$) anomalies under a climate change scenario. Conversely, we find that aleatoric uncertainty, which remains a popular measure of model confidence, becomes less reliable and behaves counterintuitively under climate-change-induced distributional shift. We highlight that, because ensemble performance improvement relative to the expected single model scales with epistemic uncertainty, ensemble improvement increases with distributional shift from climate change. This work demonstrates the utility of deep ensembles for modeling aleatoric and epistemic uncertainty in ML climate prediction, as well as the growing importance of robustly quantifying these two forms of uncertainty under anthropogenic warming.
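One way to see why ensemble improvement can scale with epistemic uncertainty is a short numerical check. The sketch below assumes the multiclass Brier score as the performance metric and the ensemble prediction as the mean of the member predictions. The abstract specifies neither, so both are assumptions made for illustration, as are the function names and the toy probabilities. Under those assumptions, the gap between the average single-member score and the ensemble score equals the variance-style disagreement term exactly.

```python
def brier(probs, label):
    """Multiclass Brier score of one probability vector against a class index."""
    return sum((p - (1.0 if c == label else 0.0)) ** 2 for c, p in enumerate(probs))


def improvement_and_disagreement(member_probs, label):
    """Return (mean single-member Brier minus ensemble Brier, member disagreement)."""
    m = len(member_probs)
    k = len(member_probs[0])
    # Ensemble prediction, assumed here to be the mean over members.
    p_ens = [sum(p[c] for p in member_probs) / m for c in range(k)]
    mean_single = sum(brier(p, label) for p in member_probs) / m
    improvement = mean_single - brier(p_ens, label)
    # Squared deviation from the ensemble mean, summed over classes, averaged over members.
    disagreement = sum(
        (p[c] - p_ens[c]) ** 2 for p in member_probs for c in range(k)
    ) / m
    return improvement, disagreement


# Hypothetical three-class predictions from three members; the true class is index 0.
members = [[0.7, 0.2, 0.1], [0.4, 0.4, 0.2], [0.1, 0.3, 0.6]]
improvement, disagreement = improvement_and_disagreement(members, label=0)
print(improvement, disagreement)
```

The two printed numbers match up to floating-point rounding whichever class is the true one, because the label cancels out of the difference. For other scoring rules the relationship need not be an exact equality, so this should be read as intuition for the claim rather than a restatement of the paper's result.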
