Data selection: at the interface of PDE-based inverse problem and randomized linear algebra
Kathrin Hellmuth, Ruhui Jin, Qin Li, Stephen J. Wright
TL;DR
This survey explores data selection for PDE-based inverse problems through the lens of randomized numerical linear algebra (RNLA). It identifies the core challenge, namely that the parameter space and the design space are both infinite-dimensional, and shows how RNLA techniques (such as matrix sketching, randomized SVD, and Hessian/subset strategies) can be tailored to the tensorized sensitivities that arise from PDE linearization. The paper connects PDE-constrained optimization and Bayesian design with RNLA, proposing qualitative data-selection tools that prioritize efficiency while preserving the information essential for reconstruction. It also outlines theoretical results, practical algorithms, and open questions, especially regarding nonlinear extensions and infinite-dimensional formulations, with implications for scalable design in physics and engineering contexts.
Abstract
All inverse problems rely on data to recover unknown parameters, yet not all data are equally informative. This raises the central question of data selection. A distinctive challenge in PDE-based inverse problems is their inherently infinite-dimensional nature: both the parameter space and the design space are infinite-dimensional, which greatly complicates the selection process. Somewhat unexpectedly, randomized numerical linear algebra (RNLA), originally developed in very different contexts, has provided powerful tools for addressing this challenge. These methods are inherently probabilistic, with guarantees typically stating that information is preserved with probability at least 1 - p when using N randomly selected, weighted samples. Here, the notion of information can take different mathematical forms depending on the setting. In this review, we survey the problem of data selection in PDE-based inverse problems, emphasize its unique infinite-dimensional aspects, and highlight how RNLA strategies have been adapted and applied in this context.
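To make the phrase "N randomly selected, weighted samples" concrete, the following is a minimal sketch of one common RNLA scheme, leverage-score row sampling, applied to a finite matrix. It is an illustration under stated assumptions and not necessarily the construction used in the survey: the matrix J is a hypothetical stand-in for a discretized sensitivity (linearized forward) map whose rows are candidate measurements, the helper names leverage_scores and sample_rows are invented here, and "information" is taken to mean the Gram matrix J^T J.

```python
import numpy as np

def leverage_scores(J):
    """Leverage score of each row of J (assumes J has full column rank)."""
    # Thin QR: the leverage score of row i is the squared norm of row i of Q.
    Q, _ = np.linalg.qr(J)
    return np.sum(Q**2, axis=1)

def sample_rows(J, n_samples, rng):
    """Draw n_samples rows with probability proportional to leverage score.

    Each selected row is rescaled by 1/sqrt(n_samples * p_i), so that the
    sampled Gram matrix is an unbiased estimate of J.T @ J.
    """
    scores = leverage_scores(J)
    probs = scores / scores.sum()
    idx = rng.choice(J.shape[0], size=n_samples, replace=True, p=probs)
    weights = 1.0 / np.sqrt(n_samples * probs[idx])
    return idx, weights[:, None] * J[idx]

# Hypothetical test matrix: 2000 candidate measurements, 10 parameters.
rng = np.random.default_rng(0)
J = rng.standard_normal((2000, 10))

idx, J_s = sample_rows(J, 1000, rng)

# Compare the information carried by the selected data with the full data.
G = J.T @ J
G_s = J_s.T @ J_s
rel_err = np.linalg.norm(G_s - G, 2) / np.linalg.norm(G, 2)
print(f"relative spectral error of sampled Gram matrix: {rel_err:.3f}")
```

In this sketch the selected indices idx play the role of the chosen data, and the weights compensate for non-uniform sampling. Guarantees of the kind mentioned in the abstract bound a quantity such as rel_err with probability at least 1 - p once N is large enough; the required N depends on the sampling distribution and on the chosen notion of information.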
