A Temporal Stochastic Bias Correction using a Machine Learning Attention model

Omer Nivron, Damon J. Wischik, Mathieu Vrac, Emily Shuckburgh, Alex T. Archibald

TL;DR

This work reframes bias correction (BC) for climate models as a time-indexed probabilistic regression task with stochastic outputs, which makes it possible to learn temporal biases and asynchronicities. By adapting the Taylorformer attention model, the method learns complex temporal dependencies and produces distributional bias-corrected time-series for heatwave statistics. In Abuja and Tokyo, the approach outperforms standard BC methods in heatwave duration counts and log-likelihood, demonstrating improved fidelity to observations under distributional shifts. The framework holds promise for scalable, regionally differentiated BC that could enhance climate impact assessments and policy decisions.

Abstract

Climate models are biased with respect to real-world observations. They usually need to be adjusted before being used in impact studies. The suite of statistical methods that enable such adjustments is called bias correction (BC). However, BC methods currently struggle to adjust temporal biases because they mostly disregard the dependence between consecutive time points. As a result, climate statistics with long-range temporal properties, such as heatwave duration and frequency, cannot be corrected accurately. This makes it more difficult to produce reliable impact studies on such climate statistics. This paper offers a novel BC methodology to correct temporal biases. This is made possible by rethinking the philosophy behind BC. We introduce BC as a time-indexed regression task with stochastic outputs. Rethinking BC enables us to adapt state-of-the-art machine learning (ML) attention models and thereby learn different types of biases, including temporal asynchronicities. With a case study of heatwave duration statistics in Abuja, Nigeria, and Tokyo, Japan, we show more accurate results than current climate model outputs and alternative BC methods.

Paper Structure (72 sections, 19 equations, 31 figures, 1 table, 2 algorithms)

Figures (31)

  • Figure 1: Inference Task for Estimating Future Observations: The top panel outlines the data available to us for estimating the potential continuation of observed values post-1989 (dashed vertical line), based on historical observations (green line) and both past and future climate model outputs (blue line). The bottom panel displays a red line representing one possible continuation, sampled from the forecasting model, which estimates $P_\theta\left(\textbf{O}^{\star} \middle| \textbf{o}, \textbf{g}\right)$.
  • Figure 2: Construction of Training Data for Tokyo, Japan: Column I displays full sequences from 1948 to 1988 for climate models (blue) and observations (green). Dashed orange lines indicate the selected slice for each data row, a process termed "Window Selection." Column II zooms into the selected window, featuring a vertical black line at a randomly selected "prediction index" ($t_j$), from which we aim to estimate the observations until $t_h$. Column III illustrates the "Pruning" operation, where time points before and after $t_j$ are randomly selected from both the climate model and observations. The observed values to be estimated are concealed, and the chosen time points, called "prediction tufts," are highlighted with red tufts. This column adapts the data for use in ML sequential models. Note that we have dropped the example index $i$ for readability, but $k, j, h$ change from one example to the next.
  • Figure 3: Illustration of the inference procedure. Given a trained Taylorformer model with parameters $\theta$, we generate a time-series sequentially. In the top row, we generate a value $o_1^{\star}$, which is then plugged into the conditioning set in the second row, and so forth.
  • Figure 4: Comparative Analysis of 'heatwave duration' Trends in Tokyo, Japan (1989-2008): The number of periods featuring at least three consecutive days with temperatures exceeding 22$^\circ$C is shown. The IPSL climate model predictions are represented by red triangles, which generally underestimate the observations. Actual observations are indicated by a vertical orange line. The Taylorformer temporal BC is depicted using horizontal box plots, with whiskers indicating the 1st and 3rd quartiles. Markers for other BC methods are indicated at the bottom of the figure.
  • Figure 5: Comparative Analysis of 'heatwave duration' Trends in Abuja, Nigeria (1989-2008): The number of periods featuring at least three consecutive days with temperatures exceeding 24$^\circ$C is shown. The IPSL climate model predictions are represented by red triangles, which overestimate the observations. A vertical orange line indicates actual observations. The Taylorformer temporal BC is depicted using horizontal box plots, with whiskers indicating the 1st and 3rd quartiles. Markers for other BC methods are indicated at the bottom of the figure.
  • ...and 26 more figures
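
The 'heatwave duration' statistic in Figures 4 and 5 counts periods of at least three consecutive days above a temperature threshold. The sketch below is one possible reading of that definition, not the authors' implementation: the function name `count_heatwave_periods`, the strict `>` comparison, and the choice to count each maximal run once (rather than every overlapping three-day window) are all assumptions made for illustration. Applying it to each sampled time-series from a stochastic BC model would give a distribution of counts, which is the kind of quantity a box plot can summarize.

```python
from typing import Sequence


def count_heatwave_periods(
    temps: Sequence[float], threshold: float, min_days: int = 3
) -> int:
    """Count runs of at least `min_days` consecutive days above `threshold`.

    Assumption: each maximal run is counted once, however long it lasts,
    and a day qualifies only if its temperature is strictly above the threshold.
    """
    count = 0
    run = 0  # length of the current run of hot days
    for t in temps:
        if t > threshold:
            run += 1
            # Count the run exactly once, at the moment it reaches min_days.
            if run == min_days:
                count += 1
        else:
            run = 0
    return count


# Hypothetical daily temperatures (degrees C), for illustration only.
daily = [25, 25, 25, 20, 25, 25, 20, 25, 25, 25, 25]
print(count_heatwave_periods(daily, threshold=24.0))

# For a stochastic model, the same statistic is computed per sampled series.
samples = [daily, [20] * 11]
print([count_heatwave_periods(s, threshold=24.0) for s in samples])
```

Counting each maximal run once keeps a single long hot spell from inflating the statistic; a definition based on overlapping windows would give larger counts for the same series.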