Conditional Diffusion-Based Retrieval of Atmospheric CO2 from Earth Observing Spectroscopy

William R. Keely, Otto Lamminpää, Steffen Mauceri, Sean M. R. Crowell, Christopher W. O'Dell, Gregory R. McGarragh

TL;DR

This work reframes satellite-based CO$_2$ retrieval as a nonlinear Bayesian inverse problem and introduces a diffusion-based approach to efficiently sample the conditional posterior $p(x|\boldsymbol{\bar{y}})$ using a conditional mean prior. By training on millions of simulated radiances and incorporating collocated TCCON data for bias correction, the method enables fast posterior sampling and improved uncertainty quantification over the current Gaussian-based ACOS retrieval. Empirical results show substantial inference speedups (orders of magnitude), the ability to capture non-Gaussian features such as bimodality, and RMSE improvements (up to ~10% with finetuning) along with better calibration of uncertainties. The approach holds promise for near real-time, global CO$_2$ monitoring and can be extended to other GHGs and next-generation missions, with attention to regional bias control and averaging kernels for data assimilation.

Abstract

Satellite-based estimates of greenhouse gas (GHG) properties from observations of reflected solar spectra are integral to understanding and monitoring complex terrestrial systems and their impact on the carbon cycle, owing to their near-global coverage. Making GHG concentration estimates from these observations, known as retrieval, is a non-linear Bayesian inverse problem. It is solved operationally with a computationally expensive algorithm called Optimal Estimation (OE), which provides a Gaussian approximation to a non-Gaussian posterior. This leads to convergence issues in the solver and to unrealistically confident uncertainty estimates for the retrieved quantities. Upcoming satellite missions will provide orders of magnitude more data than the current constellation of GHG observers, so developing fast and accurate retrieval algorithms with robust uncertainty quantification is critical. Doing so would have substantial climate impact by moving towards near-continuous, real-time global monitoring of carbon sources and sinks, which is essential for policy making. To achieve this goal, we propose a diffusion-based approach that flexibly retrieves a Gaussian or non-Gaussian posterior for NASA's Orbiting Carbon Observatory-2 spectrometer, while providing a substantial computational speed-up over the current operational state of the art.
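
To make the inverse-problem framing concrete, the standard optimal-estimation setup can be written as below. This is the textbook formulation, not an equation reproduced from the paper, and all symbols are assumed notation: $\mathbf{F}$ is the forward model, $\mathbf{x}_a$ and $\mathbf{S}_a$ are the prior mean and covariance, $\mathbf{S}_\epsilon$ is the observation-error covariance, and $\mathbf{K}$ is the Jacobian of $\mathbf{F}$.

```latex
% Measurement model and Bayes' rule (assumed notation)
\mathbf{y} = \mathbf{F}(\mathbf{x}) + \boldsymbol{\epsilon}, \qquad
\boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \mathbf{S}_\epsilon), \qquad
p(\mathbf{x} \mid \mathbf{y}) \propto p(\mathbf{y} \mid \mathbf{x})\, p(\mathbf{x})

% With a Gaussian prior, OE iteratively minimizes the cost function
J(\mathbf{x}) =
  (\mathbf{y} - \mathbf{F}(\mathbf{x}))^{\top} \mathbf{S}_\epsilon^{-1} (\mathbf{y} - \mathbf{F}(\mathbf{x}))
  + (\mathbf{x} - \mathbf{x}_a)^{\top} \mathbf{S}_a^{-1} (\mathbf{x} - \mathbf{x}_a)

% and reports a Gaussian posterior covariance from the linearization at \hat{\mathbf{x}}
\hat{\mathbf{S}} = \left( \mathbf{K}^{\top} \mathbf{S}_\epsilon^{-1} \mathbf{K} + \mathbf{S}_a^{-1} \right)^{-1},
\qquad
\mathbf{K} = \left. \frac{\partial \mathbf{F}}{\partial \mathbf{x}} \right|_{\hat{\mathbf{x}}}
```

Because $\mathbf{F}$ is non-linear, the true posterior need not be Gaussian, so $\hat{\mathbf{S}}$ describes it only locally. A sampling-based method targets this gap.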

Paper Structure

This paper contains 12 sections, 2 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Example of a bimodal distribution recovered from 10 conditional samples of the Diffusion posterior (blue) without finetuning, compared to the operational OE posterior (black) without bias correction, which is derived from the point estimate $\pm$ an estimate of the variance. The Diffusion posterior covers the ground-truth value from TCCON (red) within its first mode, which the Gaussian operational posterior does not capture.
  • Figure 2: Comparison of ACOS and Diffusion with and without bias correction/finetuning for the holdout year of 2022. We compare the point estimate of each method (top row) by evaluating the RMSE. We compare the Gaussian uncertainties of each method (bottom row) by evaluating the Miscalibration Area.
  • Figure 3: Top: an example of an observed radiance (blue) and the forward model (FM) radiance (orange) for a single sounding. Bottom: the mean and variance of the remaining error before and after applying the randomly scaled empirical orthogonal functions (EOFs).
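
Figure 2 scores the Gaussian uncertainties by Miscalibration Area, but this listing does not give the formula. The following is a minimal sketch of one common definition, assuming each retrieval reports a Gaussian mean and standard deviation: for each nominal probability p, it measures how often the truth falls inside the central p-interval, then integrates the absolute gap to the ideal diagonal. The function name, arguments, and synthetic data are illustrative assumptions, not the paper's implementation.

```python
from statistics import NormalDist


def miscalibration_area(means, stds, truths, n_levels=100):
    """Area between the observed calibration curve and the ideal diagonal.

    Assumes each prediction is Gaussian N(mean, std**2). All names here are
    illustrative and not taken from the paper.
    """
    std_normal = NormalDist()  # standard normal, used for interval half-widths
    # Absolute standardized residuals: distance of each truth in predicted sigmas.
    z = [abs(t - m) / s for m, s, t in zip(means, stds, truths)]

    expected = [i / n_levels for i in range(n_levels + 1)]  # 0.0 ... 1.0
    observed = []
    for p in expected:
        # Half-width (in sigmas) of the central interval holding probability p.
        # inv_cdf is undefined at 1.0, so the p = 1 interval is the whole line.
        half_width = float("inf") if p == 1.0 else std_normal.inv_cdf(0.5 + p / 2)
        observed.append(sum(zi < half_width for zi in z) / len(z))

    # Trapezoidal integral of |observed - expected| over p in [0, 1].
    gaps = [abs(o - e) for o, e in zip(observed, expected)]
    return sum((gaps[i] + gaps[i + 1]) / 2 for i in range(n_levels)) / n_levels


# Synthetic check: truths placed at evenly spaced standard-normal quantiles.
n = 1000
truths = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
means = [0.0] * n
well_calibrated = miscalibration_area(means, [1.0] * n, truths)
overconfident = miscalibration_area(means, [0.5] * n, truths)  # sigmas too small
```

Under this definition a perfectly calibrated method scores near 0 and the worst case is 0.5. In the synthetic check, halving the reported standard deviations (overconfident uncertainties, the failure mode the abstract attributes to OE) gives a clearly larger area than the well-calibrated case.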