Conditional Diffusion-Based Retrieval of Atmospheric CO2 from Earth Observing Spectroscopy
William R. Keely, Otto Lamminpää, Steffen Mauceri, Sean M. R. Crowell, Christopher W. O'Dell, Gregory R. McGarragh
TL;DR
This work reframes satellite-based CO$_2$ retrieval as a nonlinear Bayesian inverse problem and introduces a diffusion-based approach to efficiently sample the conditional posterior $p(x|\boldsymbol{\bar{y}})$ using a conditional mean prior. By training on millions of simulated radiances and incorporating collocated TCCON data for bias correction, the method enables fast posterior sampling and improved uncertainty quantification over the current Gaussian-based ACOS retrieval. Empirical results show substantial inference speedups (orders of magnitude), the ability to capture non-Gaussian features such as bimodality, and RMSE improvements (up to ~10% with finetuning) along with better calibration of uncertainties. The approach holds promise for near real-time, global CO$_2$ monitoring and can be extended to other GHGs and next-generation missions, with attention to regional bias control and averaging kernels for data assimilation.
Abstract
Because of their near-global coverage, satellite-based estimates of greenhouse gas (GHG) properties from observations of reflected solar spectra are integral to understanding and monitoring complex terrestrial systems and their impact on the carbon cycle. Estimating GHG concentrations from these observations, known as retrieval, is a non-linear Bayesian inverse problem. It is solved operationally with a computationally expensive algorithm called Optimal Estimation (OE), which provides a Gaussian approximation to a non-Gaussian posterior. This leads to convergence problems in the solver and to unrealistically confident uncertainty estimates for the retrieved quantities. Upcoming satellite missions will provide orders of magnitude more data than the current constellation of GHG observers, so developing fast and accurate retrieval algorithms with robust uncertainty quantification is critical. Such algorithms would have substantial climate impact by moving towards near-continuous, real-time global monitoring of carbon sources and sinks, which is essential for policy making. To achieve this goal, we propose a diffusion-based approach that flexibly retrieves a Gaussian or non-Gaussian posterior for NASA's Orbiting Carbon Observatory-2 spectrometer, while providing a substantial computational speed-up over the current operational state of the art.
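For orientation, the inverse problem described above is often written in the textbook optimal-estimation form sketched below. The notation is an assumption for illustration and is not taken from this abstract: $F$ is the forward (radiative transfer) model, $\boldsymbol{y}$ the observed radiances, $x$ the atmospheric state, $x_a$ and $S_a$ the prior mean and covariance, and $S_\varepsilon$ the measurement-noise covariance.

$$
\boldsymbol{y} = F(x) + \boldsymbol{\varepsilon}, \qquad \boldsymbol{\varepsilon} \sim \mathcal{N}(\boldsymbol{0}, S_\varepsilon), \qquad x \sim \mathcal{N}(x_a, S_a)
$$

$$
p(x \mid \boldsymbol{y}) \propto \exp\!\Big(-\tfrac{1}{2}\big[(\boldsymbol{y}-F(x))^\top S_\varepsilon^{-1}(\boldsymbol{y}-F(x)) + (x-x_a)^\top S_a^{-1}(x-x_a)\big]\Big)
$$

$$
p(x \mid \boldsymbol{y}) \approx \mathcal{N}(\hat{x}, \hat{S}), \qquad \hat{S} = \big(K^\top S_\varepsilon^{-1} K + S_a^{-1}\big)^{-1}, \qquad K = \left.\frac{\partial F}{\partial x}\right|_{x=\hat{x}}
$$

Under these assumptions, the Gaussian approximation in the last line is exact only when $F$ is linear. With a non-linear $F$ the true posterior can be skewed or multimodal even though the prior and noise are Gaussian, which is the gap a sampling-based retrieval aims to close.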
