AI enhanced data assimilation and uncertainty quantification applied to Geological Carbon Storage
G. S. Seabra, N. T. Mücke, V. L. S. Silva, D. Voskov, F. Vossepoel
TL;DR
This work addresses uncertainty quantification in Geological Carbon Storage (GCS) by combining machine learning surrogates with data assimilation (DA). It evaluates two surrogate architectures, Fourier Neural Operators (FNO) and Transformer UNet (T-UNet), on CO$_2$ injection simulations in channelized reservoirs run with the high-fidelity DARTS simulator, and introduces two surrogate-based hybrid DA frameworks, SH-ESMDA and SH-RML. SH-ESMDA accelerates the Ensemble Smoother with Multiple Data Assimilation (ESMDA) by replacing intermediate forward-model evaluations with surrogate predictions, making it at least 50% faster, depending on the number of assimilation steps, while preserving posterior fidelity. SH-RML is a variational method based on randomized maximum likelihood (RML): the surrogates supply the gradients needed for optimization, and the full-physics simulator computes the posterior states. In the case study, SH-RML gives better uncertainty quantification and history matching than standard ESMDA, while SH-ESMDA offers a practical trade-off of substantial speedup at comparable accuracy. These results point to the potential of hybrid ML-DA methods for scalable, reliable GCS analysis and for related subsurface problems.
Abstract
This study investigates the integration of machine learning (ML) and data assimilation (DA) techniques, focusing on surrogate models for Geological Carbon Storage (GCS) projects that preserve high-fidelity physical results in the posterior states. We first evaluate the surrogate modeling capability of two machine learning models, Fourier Neural Operators (FNOs) and Transformer UNet (T-UNet), for CO$_2$ injection simulations in channelized reservoirs. We then introduce the Surrogate-based Hybrid ESMDA (SH-ESMDA), an adaptation of the traditional Ensemble Smoother with Multiple Data Assimilation (ESMDA). This method uses FNOs and T-UNet as surrogate models and has the potential to make the standard ESMDA process at least 50% faster, depending on the number of assimilation steps. We also introduce the Surrogate-based Hybrid RML (SH-RML), a variational data assimilation approach based on randomized maximum likelihood (RML), in which the FNO and the T-UNet enable the computation of gradients for optimizing the objective function and a high-fidelity model computes the posterior states. Our comparative analyses show that, for the case study, SH-RML offers better uncertainty quantification than conventional ESMDA.
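
To make the hybrid idea concrete, the following is a minimal NumPy sketch of an ES-MDA loop in which a surrogate replaces the forward model in some assimilation steps. It is an illustration under stated assumptions, not the authors' implementation. The function names `esmda_update` and `hybrid_esmda`, the callables `high_fidelity` and `surrogate`, and the schedule (high-fidelity run for the first step and for the final posterior run, surrogate for the steps in between) are all hypothetical choices made for this sketch; the abstract does not specify which steps use which model.

```python
import numpy as np

def esmda_update(M, D, d_obs, C_e, alpha, rng):
    """One ES-MDA update step.

    M: (n_param, n_ens) parameter ensemble
    D: (n_obs, n_ens) simulated data for each member
    d_obs: (n_obs,) observations, C_e: (n_obs, n_obs) observation-error covariance
    alpha: inflation coefficient for this step
    """
    n_ens = M.shape[1]
    dM = M - M.mean(axis=1, keepdims=True)
    dD = D - D.mean(axis=1, keepdims=True)
    C_md = dM @ dD.T / (n_ens - 1)   # parameter-data cross-covariance
    C_dd = dD @ dD.T / (n_ens - 1)   # data auto-covariance
    # Perturb the observations with noise drawn from the inflated covariance.
    noise = rng.multivariate_normal(np.zeros(d_obs.size), alpha * C_e, size=n_ens).T
    D_pert = d_obs[:, None] + noise
    # Apply the Kalman-type gain via a linear solve instead of an explicit inverse.
    return M + C_md @ np.linalg.solve(C_dd + alpha * C_e, D_pert - D)

def hybrid_esmda(M, d_obs, C_e, alphas, high_fidelity, surrogate, seed=0):
    """ES-MDA with a surrogate in intermediate steps (assumed schedule)."""
    if not np.isclose(sum(1.0 / a for a in alphas), 1.0):
        raise ValueError("ES-MDA requires sum(1/alpha_i) == 1")
    rng = np.random.default_rng(seed)
    n_hf = 0  # number of high-fidelity ensemble runs
    for i, alpha in enumerate(alphas):
        if i == 0:
            # Assumption: the prior ensemble is run with the full-physics model.
            D = high_fidelity(M)
            n_hf += 1
        else:
            # Assumption: intermediate steps use the cheap surrogate.
            D = surrogate(M)
        M = esmda_update(M, D, d_obs, C_e, alpha, rng)
    # Posterior predictions come from the high-fidelity model.
    D_post = high_fidelity(M)
    n_hf += 1
    return M, D_post, n_hf
```

Both `high_fidelity` and `surrogate` are expected to map a parameter ensemble of shape `(n_param, n_ens)` to simulated data of shape `(n_obs, n_ens)`. Under the schedule assumed here, the number of high-fidelity ensemble runs drops from $N_a + 1$ in standard ES-MDA to 2, which is one way a dependence of the speedup on the number of assimilation steps $N_a$ could arise; this count ignores the cost of training and evaluating the surrogate.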
