Integrating Score-Based Diffusion Models with Machine Learning-Enhanced Localization for Advanced Data Assimilation in Geological Carbon Storage

Gabriel Serrão Seabra, Nikolaj T. Mücke, Vinicius Luiz Santos Silva, Alexandre A. Emerick, Denis Voskov, Femke Vossepoel

TL;DR

This work tackles uncertainty quantification for geological carbon storage in highly channelized reservoirs, where traditional ensemble methods struggle with covariance estimation and geologic realism. It introduces a framework that combines score-based diffusion models to generate large, geologically consistent super-ensembles with ML-enhanced localization to produce reliable, channel-respecting covariance estimates within ESMDA. Empirical results in a 2D channelized CO$_2$ storage setting show that ML-based localization preserves up to ~40% more ensemble variance than standard tapers and concentrates updates along high-permeability channels, improving data matching while maintaining geological realism. The approach is computationally efficient and practical for GCS applications, with clear avenues for extending to 3D, incorporating additional data types, and enabling online learning of localization proxies.

Abstract

Accurate characterization of subsurface heterogeneity is important for the safe and effective implementation of geological carbon storage (GCS) projects. This paper explores how machine learning methods can enhance data assimilation for GCS through a framework that integrates score-based diffusion models with machine learning-enhanced localization in channelized reservoirs during CO$_2$ injection. The localization framework uses large ensembles ($N_s = 5000$), with permeabilities generated by the diffusion model and states computed by simple ML algorithms, to improve covariance estimation for the Ensemble Smoother with Multiple Data Assimilation (ESMDA). We apply ML algorithms to a prior ensemble of channelized permeability fields generated with the geostatistical model FLUVSIM. The approach is applied to a CO$_2$ injection scenario simulated using the Delft Advanced Research Terra Simulator (DARTS). Our ML-based localization maintains significantly more ensemble variance than when localization is not applied, while achieving comparable data-matching quality. This framework has practical implications for GCS projects, helping improve the reliability of uncertainty quantification for risk assessment.
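
To make the role of localization in ESMDA concrete, the following is a minimal NumPy sketch of a localized ESMDA loop, not the authors' implementation. It assumes the common Schur-product form, in which a localization matrix multiplies the parameter-data cross-covariance element-wise before the gain is formed. The function name `esmda_step`, the toy linear forward model `G` (a stand-in for a reservoir simulator), and the hand-set matrix `rho` are hypothetical; here `rho` is set by hand rather than derived from a large ensemble.

```python
import numpy as np

def esmda_step(M, D, d_obs, C_e, alpha, rho, rng):
    """One ES-MDA update with Schur-product localization (illustrative sketch).

    M: (n_m, n_e) parameters, D: (n_d, n_e) predicted data, d_obs: (n_d,),
    C_e: (n_d, n_d) observation-error covariance, alpha: inflation factor,
    rho: (n_m, n_d) localization coefficients in [0, 1] (assumed given).
    """
    n_e = M.shape[1]
    dM = M - M.mean(axis=1, keepdims=True)
    dD = D - D.mean(axis=1, keepdims=True)
    C_md = dM @ dD.T / (n_e - 1)          # parameter-data cross-covariance
    C_dd = dD @ dD.T / (n_e - 1)          # data auto-covariance
    # Perturb the observations with noise drawn from the inflated error covariance.
    E = rng.multivariate_normal(np.zeros(d_obs.size), alpha * C_e, size=n_e).T
    innovation = d_obs[:, None] + E - D
    # Localized gain applied to the innovation: (rho o C_md)(C_dd + alpha C_e)^-1
    return M + (rho * C_md) @ np.linalg.solve(C_dd + alpha * C_e, innovation)

rng = np.random.default_rng(0)
n_m, n_d, n_e = 6, 3, 50
G = rng.normal(size=(n_d, n_m))           # hypothetical linear forward model
M_prior = rng.normal(size=(n_m, n_e))     # prior parameter ensemble
d_obs = G @ np.ones(n_m)                  # synthetic observations
C_e = 0.01 * np.eye(n_d)

rho = np.ones((n_m, n_d))
rho[0, :] = 0.0                           # hypothetical: parameter 0 decoupled from all data

alphas = [4.0] * 4                        # ES-MDA requires sum(1 / alpha) == 1
M = M_prior.copy()
for alpha in alphas:
    M = esmda_step(M, G @ M, d_obs, C_e, alpha, rho, rng)

print("prior variance per parameter:    ", M_prior.var(axis=1, ddof=1).round(3))
print("posterior variance per parameter:", M.var(axis=1, ddof=1).round(3))
```

In this toy setup the parameter whose localization row is zero keeps its prior values, and therefore its prior variance, while the remaining parameters are updated. This is the mechanism by which localization limits variance loss from spurious correlations.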

Paper Structure

This paper contains 29 sections, 19 equations, 26 figures, 3 tables, 4 algorithms.

Figures (26)

  • Figure 1: Score-based diffusion process. Forward diffusion (top) transforms channelized permeability to noise via VE-SDE. Reverse process (bottom) uses learned score function $s_\theta(\mathbf{x}_t, t)$ for denoising with SDE (stochastic) or ODE (deterministic) sampling.
  • Figure 2: Representative FLUVSIM-generated channelized permeability fields with bimodal distribution: channels (2000 mD) in background (50 mD).
  • Figure 3: Channelized reservoir model (left) and DARTS-simulated pressure field (right) with injection well (triangle) and monitoring wells (circles). The pressure distribution correlates with the high-permeability channels and the injector–producer configuration: it is strongest along the dominant connected channel, with weaker propagation along the two minor channels.
  • Figure 4: Visualization of UNet architecture with attention.
  • Figure 5: Evolution of channelized structures during generation. ODE (top) and PC (bottom) samplers progressively denoise from Gaussian to realistic permeability fields.
  • ...and 21 more figures