Optimising seismic imaging design parameters via bilevel learning

Shaunagh Downing; Silvia Gazzola; Ivan G. Graham; Euan A. Spence

TL;DR

This work addresses the automatic selection of FWI design parameters, specifically the sensor positions $\mathcal{P}$ and the regularisation weight $\alpha$, by casting it as a bilevel learning problem over a set of training models $\mathcal{M}'$. The lower level solves FWI, with the Helmholtz equation as the forward problem, to obtain the reconstruction $\boldsymbol{m}^{\rm FWI}$; the upper level minimises the misfit between the ground-truth training models and their FWI reconstructions, using an adjoint-state gradient and a reduced single-level formulation. Key contributions include explicit upper-level gradient formulas, a Hessian-based preconditioning strategy, a smoothed extraction process that makes the upper-level objective depend smoothly on the sensor positions, and a bilevel frequency-continuation scheme that helps avoid spurious stationary points. The approach is demonstrated on a Marmousi-type problem. The results show that jointly learning the sensor positions and the regularisation weight yields robust improvements on unseen test models, with full cross-validation confirming sizeable gains in SSIM and the improvement factor.

Abstract

Full Waveform Inversion (FWI) is a standard algorithm in seismic imaging. Its implementation requires the a priori choice of a number of "design parameters", such as the positions of sensors for the actual measurements and one (or more) regularisation weights. In this paper we describe a novel algorithm for determining these design parameters automatically from a set of training images, using a (supervised) bilevel learning approach. In our algorithm, the upper level objective function measures the quality of the reconstructions of the training images, where the reconstructions are obtained by solving the lower level optimisation problem -- in this case FWI. Our algorithm employs (variants of) the BFGS quasi-Newton method to perform the optimisation at each level, and thus requires the repeated solution of the forward problem -- here taken to be the Helmholtz equation. This paper focuses on the implementation of the algorithm. The novel contributions are: (i) an adjoint-state method for the efficient computation of the upper-level gradient; (ii) a complexity analysis for the bilevel algorithm, which counts the number of Helmholtz solves needed and shows this number is independent of the number of design parameters optimised; (iii) an effective preconditioning strategy for iteratively solving the linear systems required at each step of the bilevel algorithm; (iv) a smoothed extraction process for point values of the discretised wavefield, necessary for ensuring a smooth upper level objective function. The algorithm also uses an extension to the bilevel setting of classical frequency-continuation strategies, helping avoid convergence to spurious stationary points. The advantage of our algorithm is demonstrated on a problem derived from the standard Marmousi test problem.
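The bilevel structure and the adjoint-style upper-level gradient described in the abstract can be illustrated on a toy problem. The sketch below is not the authors' algorithm. It assumes a diagonal linear forward operator in place of the Helmholtz equation, a Tikhonov regulariser, a single training model, and $\alpha$ as the only design parameter. All names (`lower_level`, `upper_level`, `a`, `d`, `m_true`) and all data values are hypothetical.

```python
# Toy bilevel problem (illustrative assumptions only, not the paper's FWI setup).
# Lower level: m(alpha) = argmin_m 0.5*sum((a_i*m_i - d_i)^2) + 0.5*alpha*sum(m_i^2)
# Upper level: psi(alpha) = 0.5*sum((m_i(alpha) - m_true_i)^2)

def lower_level(a, d, alpha):
    """Closed-form minimiser for a diagonal forward operator diag(a)."""
    return [ai * di / (ai * ai + alpha) for ai, di in zip(a, d)]

def upper_level(a, d, m_true, alpha):
    """Return psi(alpha) and d(psi)/d(alpha) using one adjoint-type solve."""
    m = lower_level(a, d, alpha)
    residual = [mi - ti for mi, ti in zip(m, m_true)]
    psi = 0.5 * sum(r * r for r in residual)
    # Adjoint solve H * lam = residual, where H = diag(a_i^2 + alpha) is the
    # lower-level Hessian (diagonal here, so the solve is a division).
    lam = [r / (ai * ai + alpha) for r, ai in zip(residual, a)]
    # Differentiating H*m = A^T d gives dm/dalpha = -H^{-1} m, hence
    # dpsi/dalpha = residual^T dm/dalpha = -lam^T m.
    grad = -sum(li * mi for li, mi in zip(lam, m))
    return psi, grad

# Hypothetical data: diagonal operator entries, a training model, noisy measurements.
a = [1.0, 0.5, 0.2, 0.05]
m_true = [1.0, -2.0, 0.5, 1.5]
noise = [0.05, -0.03, 0.04, -0.02]
d = [ai * ti + ni for ai, ti, ni in zip(a, m_true, noise)]

alpha = 0.1
psi, grad = upper_level(a, d, m_true, alpha)

# Central finite-difference check of the upper-level gradient.
h = 1e-6
fd = (upper_level(a, d, m_true, alpha + h)[0]
      - upper_level(a, d, m_true, alpha - h)[0]) / (2 * h)
print(psi, grad, fd)
```

In this toy setting the gradient needs only one extra solve with the lower-level Hessian, however the residual is formed. The abstract states a related property for the full algorithm: the number of Helmholtz solves is independent of the number of design parameters optimised. A gradient of this kind could then be passed to a quasi-Newton method such as BFGS for the upper-level optimisation.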

Paper Structure (23 sections, 8 theorems, 78 equations, 11 figures, 3 tables, 3 algorithms)

Introduction
Formulation of the Bilevel Problem
The Wave Equation in the Frequency Domain
The lower-level problem
Training models and the bilevel problem
Solving the Bilevel Problem
Derivative of $\psi$ with respect to position coordinate $p_{j,\ell}$
Derivative of $\psi$ with respect to regularisation parameter $\alpha$
Complexity Analysis in Terms of the Number of PDE Solves
Application to a Marmousi problem
Benefit of jointly optimising over sources' locations and weighting parameter.
Full cross-validation of the results.
Conclusions and outlook
Appendix -- Details of the numerical implementation
Numerical forward model
...and 8 more sections

Key Result

Proposition 2.6

Let $\Re$ denote the real part of a complex number. Then and

Figures (11)

Figure 1: Overall Schematic of the Bilevel Problem.
Figure 2: Smooth Marmousi model.
Figure 3: First row: the smooth Marmousi model divided into individual slices. Second row: FWI reconstructions using the non-optimised design parameters. Third row: FWI reconstructions using the $\mathcal{P}$-$\alpha$ optimised design parameters. Slices 1, 2, 3 and 5 were used for training, while slice 4 was used for testing. Colour maps are scaled to the same range; i.e., the colours correspond to the same values in each slice.
Figure 4: Absolute value of the Mean Relative Percentage Error (MRE) defined in \ref{relerr} for each of the scenarios. Slices 1, 2, 3 and 5 were used for training, while slice 4 was used for testing.
Figure 5: (a) Source positions and random non-optimised sensor positions; (b) source positions and optimised sensor positions after $\mathcal{P}$ optimisation; (c) source positions and optimised sensor positions after $\mathcal{P}$, $\alpha$ optimisation. Slices 1, 2, 3 and 5 were used as training models, and slice 4 as the testing model. Illustrations are presented on slice 2 for convenience.
...and 6 more figures

Theorems & Definitions (23)

Definition 2.1: Solution operator and its adjoint
Remark 2.2
Definition 2.3: The delta function and its derivative
Definition 2.4: General bilevel problem
Definition 2.5: Reduced single level problem
Proposition 2.6: Derivative of $\phi$ with respect to $m_k$
proof
Remark 2.7
Proposition 3.1
proof
...and 13 more
