Machine Learning for Inverse Problems and Data Assimilation

Eviatar Bach, Ricardo Baptista, Daniel Sanz-Alonso, Andrew Stuart

TL;DR

The notes articulate a unified framework for applying machine learning to inverse problems and data assimilation within a Bayesian setting. They develop variational and transport-based methods to learn priors, surrogates, and posterior maps, and prove stability and approximation guarantees, including posterior closeness under forward-model or likelihood perturbations. A central theme is learning and amortizing computation via surrogate forward models, pushforward priors, and transport maps to accelerate inference and sampling, with explicit attention to model error and data dependence. The data-assimilation portion then situates these ideas in sequential settings, detailing classical filters (Kalman variants) and particle methods, along with practical enhancements like inflation and localization. Overall, the work provides both theoretical foundations and algorithmic strategies for integrating ML into Bayesian inverse problems and data assimilation, with broad implications for efficient, robust, and scalable inference in complex systems.

Abstract

The aim of these notes is to demonstrate the potential for ideas in machine learning to impact on the fields of inverse problems and data assimilation. The perspective is one that is primarily aimed at researchers from inverse problems and/or data assimilation who wish to see a mathematical presentation of machine learning as it pertains to their fields. As a by-product, we include a succinct mathematical treatment of various fundamental underpinning topics in machine learning, and adjacent areas of (computational) mathematics.

Paper Structure

This paper contains 214 sections, 71 theorems, 815 equations, 4 figures, 1 table, and 10 algorithms.

Key Result

Theorem 1.3

Let Assumption a:jc1 and Data Assumption da:vb hold, and assume further that [...]. Then $u \mid y \sim \pi^y$, where [...].

When there is no possibility of confusion, we will simply write $\pi$ for the posterior probability density function, rather than $\pi^y$.

Figures (4)

  • Figure 1: Dynamics and observation models underlying data assimilation problems.
  • Figure 2: Prediction and analysis steps combined.
  • Figure 3: Diagram representing relationships between three different ways of quantifying closeness between probability measures. Metrics impose more restrictive conditions than divergences: only some divergences are metrics. Furthermore, a subset of expected scoring rules (those based on strictly proper scoring rules) leads to divergences, and in some cases to metrics.
  • Figure 4: Integration of the difference between cumulative distribution functions (left) and their inverses (right).

Theorems & Definitions (136)

  • Theorem 1.3
  • proof
  • Theorem 1.6
  • proof: Proof of Theorem 1.6
  • Corollary 1.7: Well-Posedness of Posterior
  • Theorem 1.10
  • proof
  • Theorem 2.1
  • proof
  • Proposition 2.2
  • ...and 126 more