Probability calibration for precipitation nowcasting
Lauri Kurki, Yaniel Cabrera, Samu Karanko
TL;DR
Precipitation nowcasting with neural models increasingly requires probabilistic outputs, but conventional calibration metrics fail to capture miscalibration across precipitation thresholds. The authors introduce the expected thresholded calibration error (ETCE) to measure calibration over multiple precipitation thresholds and adapt calibration methods from computer vision, including selective scaling with lead-time conditioning. They find that selective scaling with an MLP or Segformer calibrator reduces ETCE by up to about 23%, while temperature scaling variants offer limited benefit. These results provide a practical path to more reliable probabilistic precipitation nowcasting by conditioning calibrators on lead time and mispredictions.
Abstract
Reliable precipitation nowcasting is critical for weather-sensitive decision-making, yet neural weather models (NWMs) can produce poorly calibrated probabilistic forecasts. Standard calibration metrics such as the expected calibration error (ECE) fail to capture miscalibration across precipitation thresholds. We introduce the expected thresholded calibration error (ETCE), a new metric that better captures miscalibration for ordered classes such as precipitation amounts. We also extend post-processing calibration techniques from computer vision to the forecasting domain. Our results show that selective scaling with lead-time conditioning reduces model miscalibration without reducing forecast quality.
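To make the idea of calibration across thresholds concrete, the following is a minimal sketch of a thresholded calibration measure for ordered classes. It is an illustration only: this section does not give the formal definition of ETCE, so the sketch assumes that each threshold is turned into a binary exceedance event, a standard binned ECE is computed for that event, and the per-threshold errors are averaged with equal weight. The function names (`binary_ece`, `exceedance_probs`, `thresholded_calibration_error`), the equal-width binning, and the uniform averaging are all assumptions and may differ from the authors' metric.

```python
from typing import List, Sequence


def binary_ece(probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10) -> float:
    """Standard binned expected calibration error for a binary event."""
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equally long")
    n = len(probs)
    conf_sum = [0.0] * n_bins   # sum of forecast probabilities per bin
    hit_sum = [0.0] * n_bins    # number of observed events per bin
    count = [0] * n_bins        # number of samples per bin
    for p, y in zip(probs, outcomes):
        # Equal-width bins on [0, 1]; p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        conf_sum[idx] += p
        hit_sum[idx] += y
        count[idx] += 1
    ece = 0.0
    for c, h, k in zip(conf_sum, hit_sum, count):
        if k > 0:
            # Gap between observed frequency and mean forecast, weighted by bin size.
            ece += (k / n) * abs(h / k - c / k)
    return ece


def exceedance_probs(class_probs: Sequence[Sequence[float]], k: int) -> List[float]:
    """P(class index >= k) for ordered classes, e.g. precipitation-rate bins."""
    return [sum(row[k:]) for row in class_probs]


def thresholded_calibration_error(
    class_probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    n_bins: int = 10,
) -> float:
    """Assumed illustration: mean binary ECE over all class thresholds."""
    n_classes = len(class_probs[0])
    errors = []
    for k in range(1, n_classes):
        probs = exceedance_probs(class_probs, k)
        outcomes = [1 if y >= k else 0 for y in labels]
        errors.append(binary_ece(probs, outcomes, n_bins))
    return sum(errors) / len(errors)


# Hypothetical toy data: two pixels, two ordered classes (no rain, rain).
# Both forecasts claim rain with certainty, but it rains in only one pixel.
overconfident = [[0.0, 1.0], [0.0, 1.0]]
observed = [1, 0]
print(thresholded_calibration_error(overconfident, observed))
```

In this toy case the forecast probability of exceeding the single threshold is 1.0 for both pixels while the observed frequency is 0.5, so the sketch reports an error of 0.5. Collapsing the classes into exceedance events is what lets a measure like this respect the ordering of precipitation amounts, which a per-class ECE does not do.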
