On the recovery of two function-valued coefficients in the Helmholtz equation for inverse scattering problems via neural networks

Zehui Zhou

TL;DR

This work tackles recovering two function-valued coefficients $\gamma$ and $\eta$ in the Helmholtz inverse scattering problem using two-frequency data. It decomposes the regularized inverse $F_\alpha^\dag=(F^*F+\alpha I)^{-1}F^*$ into a Fourier-type forward part $F^*$ and a convolution-type inverse, and constructs two combined neural networks (uncompressed and butterfly-compressed) to approximate this inverse, leveraging polar coordinates and butterfly factorization for efficiency. The authors establish approximation bounds for the BFNN components and generalization bounds via Rademacher complexity, and corroborate with numerical experiments that the networks can recover isotropic media and, in some cases, the isotropic representation of certain anisotropic media. The results demonstrate that, with sufficient data and appropriately chosen frequency pairs and resolutions, the proposed networks can effectively approximate the inverse with meaningful generalization, offering a data-driven pathway for solving two-coefficient inverse scattering problems. The work also discusses limitations due to linearization and points to future work on nonlinear extensions and broader anisotropy scenarios.

Abstract

Recently, deep neural networks (DNNs) have become powerful tools for solving inverse scattering problems. However, the approximation and generalization rates of DNNs for solving these problems remain largely under-explored. In this work, we introduce two types of combined DNNs (uncompressed and compressed) to reconstruct two function-valued coefficients in the Helmholtz equation for inverse scattering problems from the scattering data at two different frequencies. An analysis of the approximation and generalization capabilities of the proposed neural networks for simulating the regularized pseudo-inverses of the linearized forward operators in direct scattering problems is provided. The results show that, with sufficient training data and parameters, the proposed neural networks can effectively approximate the inverse process with desirable generalization. Preliminary numerical results show the feasibility of the proposed neural networks for recovering two types of isotropic inhomogeneous media. Furthermore, the trained neural network is capable of reconstructing the isotropic representation of certain types of anisotropic media.
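
To make the regularized pseudo-inverse $F_\alpha^\dag=(F^*F+\alpha I)^{-1}F^*$ concrete, the following is a minimal sketch, not the paper's method: it applies $F_\alpha^\dag$ to data for a small, real-valued dense matrix that merely stands in for a discretized linearized forward operator (so $F^*$ reduces to the transpose). The matrix `F`, the test vector, and all function names are illustrative assumptions; the paper instead approximates $F^*$ and $(F^*F+\alpha I)^{-1}$ with neural networks.

```python
# Illustrative sketch only: a tiny real matrix F stands in for the discretized
# linearized forward operator; all names here are hypothetical.

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        m[k], m[p] = m[p], m[k]
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= f * m[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def regularized_pinv_apply(F, y, alpha):
    """Return (F^T F + alpha I)^{-1} F^T y for a real matrix F."""
    Ft = transpose(F)
    normal = matmul(Ft, F)                  # F^T F
    for i in range(len(normal)):
        normal[i][i] += alpha               # + alpha I
    rhs = [sum(r * v for r, v in zip(row, y)) for row in Ft]  # F^T y
    return solve(normal, rhs)

F = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
x_true = [1.0, -1.0]
y = [sum(f * x for f, x in zip(row, x_true)) for row in F]  # noise-free data

x_small = regularized_pinv_apply(F, y, 1e-10)  # nearly unregularized
x_large = regularized_pinv_apply(F, y, 10.0)   # strongly regularized
```

With a very small $\alpha$ and noise-free data the sketch recovers `x_true` closely, while a larger $\alpha$ shrinks the reconstruction toward zero, which is the usual stability-versus-accuracy trade-off of Tikhonov-type regularization.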

Paper Structure (18 sections, 23 theorems, 192 equations, 12 figures, 5 tables)

Introduction
Properties of the regularized pseudo-inverse $F_\alpha^\dag$
Combined neural networks
Butterfly neural networks (BFNNs) for $F^*$
Convolutional neural networks (CNNs) for $(F^*F+\alpha I)^{-1}$
Generalization of the combined neural networks
Training set, hypothesis space and loss function
Generalization error of $\mathcal{H}^0$
Generalization error of $\mathcal{H}^1$
Numerical experiments
Training setup and neural networks
Approximation and generalization capabilities
Dependence on the wave frequency
Dependence on the resolution
Dependence on the size of training dataset
...and 3 more sections

Key Result

Proposition 2.1

Let $F_1^\omega$, $F_2^\omega$ and $F$ be the operators defined in eqn:F1F2 and eqn:prob_linear_sys respectively. Then their adjoint operators over the scattering data on the polar coordinates $(\theta,\rho)\in [0,2\pi]\times [0,\infty)$ are given by and for $i=1,2$, with the linear radial scaling operator $\mathcal{S}_\omega f(\theta,\rho)=f(\theta,\frac{\omega}{\omega_1}\rho)$, the integral ke

Figures (12)

Figure 1: The combined uncompressed neural network $h=\psi\circ\psi_0\circ \phi^0$, where the intermediate outputs are $O_{j,\omega_1}^{0'}:=(\phi_j^0 \Lambda^{\omega_1}_{\theta_1},\cdots,\phi_j^0 \Lambda^{\omega_1}_{\theta_{n_\theta}})^T\in\mathbb{R}^{ n_\theta\times n_\rho}$ and $O_{j,\omega_2}^{0'}:=(\phi_j^0 \Lambda^{\omega_2}_{\theta_1},\cdots,\phi_j^0 \Lambda^{\omega_2}_{\theta_{n_\theta}})^T\in\mathbb{R}^{ n_\theta\times n_\rho}$ for $j=1,2$. The input $\Lambda\in \mathbb{C}^{2n_\theta\times n_\theta}$ is the discretized scattering data at two distinct wave frequencies $\omega_2>\omega_1$. The $0$-level BFNN $\phi^0$ is applied to the shifted inputs and produces the intermediate outputs $\phi^0(\Lambda)\in\mathbb{R}^{2 n_\theta\times \frac{\omega_1}{\omega_2}n_\rho}$. After the polar coordinates are converted to Cartesian coordinates by $\psi_0:\mathbb{R}^{2 n_\theta\times \frac{\omega_1}{\omega_2}n_\rho}\to \mathbb{R}^{2 n_c\times n_c}$, the $2$-dimensional deep ($8$-layer) CNN $\psi$ with $6$ filters of the same size produces the output used to minimize the loss function $\ell$.
Figure 2: The exact and reconstructed perturbations from two different types of datasets. The first three rows are generated by \ref{eqn:dataset_s} (smooth) and the last two rows are generated by \ref{eqn:dataset_nons} (nonsmooth and discontinuous).
Figure 3: The exact and reconstructed perturbations from the proposed neural networks with different wave frequencies $\omega_1,\omega_2$.
Figure 4: The exact perturbations and the reconstruction errors from the proposed neural networks with different wave frequencies $\omega_1,\omega_2$.
Figure 5: Exact perturbations $\gamma_{11}$, $\gamma_{12}$, $\gamma_{22}$ and $\eta_e$, and the exact and reconstructed scattering data.
...and 7 more figures

Theorems & Definitions (47)

Definition 1.1: Scattering Data and the Inverse Problem
Remark 1.1
Proposition 2.1
proof
Remark 2.1
Theorem 2.1
proof
Proposition 2.2
proof
Remark 2.2
...and 37 more
