Ambient Noise Full Waveform Inversion with Neural Operators

Caifeng Zou, Zachary E. Ross, Robert W. Clayton, Fan-Chi Lin, Kamyar Azizzadenesheli

TL;DR

This work tackles the computational bottleneck of ambient-noise full waveform inversion with a Helmholtz Neural Operator (HNO) that learns the forward operator in the frequency domain and uses automatic differentiation for gradient-based inversion. The authors demonstrate the first real-data application of neural operators to ambient-noise tomography in the Los Angeles basins, reporting speedups of about two orders of magnitude over conventional adjoint-based solvers while maintaining accuracy. Key contributions include an HNO architecture that combines Fourier and Graph Neural Operators, training on synthetic SALVUS simulations, and an inversion of the BASIN data that agrees with prior geological models. The approach is expected to extend to 3D and can draw on modern optimization techniques; its practical value depends on the training distribution being representative of the target region.

Abstract

Numerical simulations of seismic wave propagation are crucial for investigating velocity structures and improving seismic hazard assessment. However, standard methods such as finite difference or finite element are computationally expensive. Recent studies have shown that a new class of machine learning models, called neural operators, can solve the elastodynamic wave equation orders of magnitude faster than conventional methods. Full waveform inversion is a prime beneficiary of the accelerated simulations. Neural operators, as end-to-end differentiable operators, combined with automatic differentiation, provide an alternative approach to the adjoint-state method. State-of-the-art optimization techniques built into PyTorch provide neural operators with greater flexibility to improve the optimization dynamics of full waveform inversion, thereby mitigating cycle-skipping problems. In this study, we demonstrate the first application of neural operators for full waveform inversion on a real seismic dataset, which consists of several nodal transects collected across the San Gabriel, Chino, and San Bernardino basins in the Los Angeles metropolitan area.
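
The contrast with the adjoint-state method can be made concrete by writing out the gradient that automatic differentiation provides. This is only a sketch under assumptions: the symbols $\mathcal{G}_\theta$ (the trained neural operator), $\mathbf{m}$ (the velocity model) and $\mathbf{d}$ (the observed data) are introduced here for illustration, the data are treated as real-valued for simplicity, and a squared-error misfit is assumed.

$$
J(\mathbf{m}) = \tfrac{1}{2}\,\bigl\lVert \mathcal{G}_\theta(\mathbf{m}) - \mathbf{d} \bigr\rVert_2^2,
\qquad
\nabla_{\mathbf{m}} J = \left(\frac{\partial \mathcal{G}_\theta}{\partial \mathbf{m}}\right)^{\!\top} \bigl(\mathcal{G}_\theta(\mathbf{m}) - \mathbf{d}\bigr).
$$

Reverse-mode automatic differentiation evaluates the right-hand side as a vector-Jacobian product, so the Jacobian is never formed explicitly and no separate adjoint wave simulation has to be derived or coded by hand.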

Paper Structure

This paper contains 8 sections, 11 equations, 18 figures, 3 tables.

Figures (18)

  • Figure 1: Model architecture. $\mathbf{a}$ is the input, given by $V_P$, $V_S$, the source location, and a constant function indicating the frequency value. $\mathbf{\hat{d}}$ is the predicted data. $\mathcal{P}$ is a point-wise operator used to lift the dimension. $\mathcal{Q}$ is used to project the output to the desired dimension. $\mathcal{B}$ is the inner integral operator, chosen as the FNO, in which $\mathbf{v}$ is the input to the layer, $\mathcal{F}$ and $\mathcal{F}^{-1}$ denote the Fourier and inverse Fourier transforms, respectively, $\mathcal{R}$ is a linear operator, $\mathcal{W}$ acts as a residual connection, and $\sigma$ is a nonlinear activation function. $\Lambda$ is a GNO used to query the waveforms at the free surface. Blue circles denote concatenation along the channel dimension.
  • Figure 2: Comparison of simulations for velocity model CVM-S4.26 from the HNO and the baseline method. The source location is marked by a white star.
  • Figure 3: Sensitivity kernels for the HNO-AD method, defined as the gradients of the misfit with respect to the velocity parameters, which are summed over all source-receiver pairs and averaged horizontally. The misfit is defined as the squared error between the HNO-predicted data with a 1D initial model and the SALVUS-simulated data with the true model (CVM-S4.26). The sensitivity values for $V_P$ and $V_S$ are each normalized by its maximum amplitude.
  • Figure 4: Overall MSE (mean squared error) in the FWI process, where higher frequency data is progressively incorporated. We use an Adam optimizer with a learning rate of $0.05$ and a scheduler that decays the learning rate by half every $5$ epochs. The batch size is $64$.
  • Figure 5: A sample shot of synthetic data perturbed with different levels of noise. The noise is generated from a zero-mean Gaussian distribution with an SD equal to a factor times the SD of the noise-free data.
  • ...and 13 more figures
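
The optimizer settings in the Figure 4 caption and the noise model in the Figure 5 caption are simple enough to work through numerically. The sketch below is not the authors' code. The helper names `step_lr` and `add_gaussian_noise` are hypothetical, the sine-wave "trace" is a stand-in for real data, and two details that the captions do not specify are assumed: the standard deviation is taken as the population SD over the whole array, and the learning-rate decay is applied as a step function of the epoch index (in PyTorch this would presumably correspond to something like `StepLR(step_size=5, gamma=0.5)`, although the scheduler class is not named in the source).

```python
import math
import random
import statistics


def step_lr(epoch, base_lr=0.05, factor=0.5, step=5):
    """Learning rate at a given epoch when it is halved every `step` epochs.

    Defaults follow the Figure 4 caption: initial rate 0.05, halved every 5 epochs.
    """
    return base_lr * factor ** (epoch // step)


def add_gaussian_noise(data, level, rng):
    """Perturb `data` with zero-mean Gaussian noise of SD = level * SD(data).

    Assumption: SD(data) is the population SD of the whole noise-free array.
    """
    sigma = level * statistics.pstdev(data)
    return [x + rng.gauss(0.0, sigma) for x in data]


# Learning rates at epochs 0, 5 and 10.
print([step_lr(e) for e in (0, 5, 10)])  # [0.05, 0.025, 0.0125]

# Hypothetical noise-free trace and a perturbed copy at noise level 0.5.
rng = random.Random(0)
clean = [math.sin(0.01 * i) for i in range(20000)]
noisy = add_gaussian_noise(clean, 0.5, rng)
ratio = statistics.pstdev([n - c for n, c in zip(noisy, clean)]) / statistics.pstdev(clean)
print(f"empirical noise SD / signal SD = {ratio:.3f}")
```

With these settings the learning rate has dropped by a factor of four after ten epochs, and the empirical noise-to-signal SD ratio should land close to the requested noise level.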