Robust Calibration For Improved Weather Prediction Under Distributional Shift

Sankalp Gilda; Neel Bhandari; Wendy Mak; Andrea Panizza

TL;DR

The paper tackles robust weather prediction under real-world distributional shift by combining a mixture-density network with beta-likelihoods, Moment Exchange data augmentation for regularization, and post-hoc calibration to improve uncertainty estimates. It demonstrates competitive performance against boosted-tree baselines on a tabular weather dataset, leveraging robust per-domain calibration and ensemble strategies to manage domain shifts. Key findings show that calibration improves predictive reliability and that domain-aware calibration can yield larger gains in predictive uncertainty metrics, highlighting the importance of uncertainty quantification in non-IID settings. The work provides a practical framework for improving out-of-domain weather forecasting and uncertainty estimation with neural models in the presence of distributional shifts.

Abstract

In this paper, we present results on improving out-of-domain weather prediction and uncertainty estimation as part of the \texttt{Shifts Challenge on Robustness and Uncertainty under Real-World Distributional Shift}. We find that by combining a mixture of experts with an advanced data augmentation technique borrowed from the computer vision domain and robust \textit{post-hoc} calibration of predictive uncertainties, we can potentially achieve more accurate and better-calibrated results with deep neural networks than with boosted tree models for tabular data. We evaluate our predictions using several metrics and propose several future lines of inquiry and experimentation to boost performance.
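The augmentation technique referred to here is Moment Exchange (MoEx), named in the TL;DR. As a rough illustration of the idea only, the sketch below normalizes each sample's hidden features with its own mean and standard deviation and then re-injects the moments of a randomly chosen partner sample, with probability `p` per sample. The function name `moex`, the (batch, features) layout, and the choice to compute moments across the feature axis are assumptions made for this example; the paper's actual implementation may differ.

```python
import numpy as np

def moex(h, p=0.5, eps=1e-5, rng=None):
    """Illustrative Moment Exchange on hidden features h of shape (batch, features).

    Returns the augmented features, the partner index of each row, and a
    boolean mask marking which rows were augmented.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = h.shape[0]
    perm = rng.permutation(n)        # partner sample for every row
    mask = rng.random(n) < p         # each row is augmented with probability p

    # Per-sample first and second moments across the feature axis.
    mu = h.mean(axis=1, keepdims=True)
    sigma = np.sqrt(h.var(axis=1, keepdims=True) + eps)

    # Remove a row's own moments, then inject the partner's moments.
    normalized = (h - mu) / sigma
    swapped = normalized * sigma[perm] + mu[perm]

    out = np.where(mask[:, None], swapped, h)
    return out, perm, mask
```

In the original MoEx formulation, the loss for an augmented sample also interpolates between the targets of the two samples involved; that step is omitted here, and `perm` and `mask` are returned so that a training loop could apply it.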

Paper Structure (6 sections, 2 figures, 4 tables)

Introduction
Data
Method
Results
Future Work
Tables

Figures (2)

Figure 1: Impact of varying the probability parameter p in MoEx, where p is the probability that a given sample will be augmented. Left: negative log-likelihood (NLL) of the robust, post-hoc calibrated predictions on TEST as a function of p. Right: mean absolute error (MAE) as a function of p. For both plots, lower is better. Dashed orange and blue lines mark the corresponding metrics for NGBoost and CatBoost, respectively (see the last rows of Tables \ref{table:results_ngb} and \ref{table:results_cb}).
Figure 2: Architecture of the mixture density network \cite{mdn0,mdn1}. PONO is the positional normalization layer \cite{pono}, and the use of LeakyGate is inspired by \cite{simple_mods_tabular}. We model the output variable fact_temperature conditioned on the input variables using a mixture of 5 $\beta$ distributions.
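To make the output layer in Figure 2 concrete, the following sketch computes the negative log-likelihood of a target under a mixture of Beta distributions, which is the kind of loss a mixture density network with Beta components would minimize. It is a minimal NumPy illustration, not the authors' code: the function names, the (n, K) parameter layout, and the assumption that the target has already been rescaled into the open interval (0, 1) are all choices made for this example.

```python
import math
import numpy as np

# Elementwise log-gamma built from the standard library (no SciPy needed).
_lgamma = np.vectorize(math.lgamma)

def beta_log_pdf(y, a, b):
    """Log density of Beta(a, b) at y, with y in (0, 1) and a, b > 0."""
    log_norm = _lgamma(a + b) - _lgamma(a) - _lgamma(b)
    return log_norm + (a - 1.0) * np.log(y) + (b - 1.0) * np.log1p(-y)

def beta_mixture_nll(y, logits, a, b):
    """Mean negative log-likelihood of y under a K-component Beta mixture.

    y:      (n,) targets, assumed rescaled into (0, 1)
    logits: (n, K) unnormalized mixture weights
    a, b:   (n, K) positive Beta shape parameters
    """
    # Log-softmax over components, computed stably.
    log_w = logits - np.logaddexp.reduce(logits, axis=1, keepdims=True)
    log_comp = beta_log_pdf(y[:, None], a, b)
    # log sum_k w_k * Beta(y | a_k, b_k), via log-sum-exp.
    log_lik = np.logaddexp.reduce(log_w + log_comp, axis=1)
    return -log_lik.mean()
```

For the model in Figure 2, K would be 5, and the network would emit `logits`, `a`, and `b` for each input row, with a positivity constraint (for example a softplus) on the shape parameters.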
