Interpretability and Generalization Bounds for Learning Spatial Physics
Alejandro Francisco Queiruga, Theo Gutman-Solo, Shuai Jiang
TL;DR
This work addresses the challenge of understanding when ML methods can reliably learn and generalize spatial physics, focusing on linear differential equations and Green's-function representations. It develops a theory linking discretization, data function spaces, and learning dynamics, proving bounds on parameter learning and showing that linear operator learning converges to a projection of the true operator onto the training subspace: $W^* = A U U^T + W^0(I - U U^T)$. The paper validates these ideas through extensive experiments across finite-difference, PINN, DeepONet, neural operator, and physics-informed variants, demonstrating a robust subspace-generalization structure and introducing a cross-set validation protocol as a practical benchmark. It further shows that Green's functions can be extracted from well-generalizing black-box models, providing a mechanistic interpretability lens, and reveals that different model classes can exhibit opposing generalization behaviors, which can guide data collection and evaluation in scientific ML.
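The projection formula can be checked numerically with a small sketch. This is an illustration under stated assumptions, not the authors' code. It reads $A$ as the true linear operator, the columns of $U$ as an orthonormal basis of the subspace spanned by the training inputs, and $W^0$ as the weight matrix at initialization. A random matrix stands in for a discretized operator, and plain gradient descent on a least-squares loss stands in for the training procedure. The sizes `n`, `k`, `m` are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, m = 8, 3, 20  # ambient dimension, subspace dimension, number of samples

# Assumed stand-in for the true linear operator (e.g. a discretized Green's function).
A = rng.standard_normal((n, n))

# Orthonormal basis U (n x k) of the training subspace; inputs X live in span(U).
U, _ = np.linalg.qr(rng.standard_normal((n, k)))
X = U @ rng.standard_normal((k, m))
Y = A @ X  # noiseless targets

# Random initialization W0, then gradient descent on 0.5/m * ||W X - Y||_F^2.
W0 = rng.standard_normal((n, n))
W = W0.copy()
lr = 1.0 / np.linalg.norm(X @ X.T / m, 2)  # step size 1 / (largest curvature)
for _ in range(5000):
    grad = (W @ X - Y) @ X.T / m
    W -= lr * grad

# Predicted limit: true operator on span(U), initialization on its complement.
P = U @ U.T
W_star = A @ P + W0 @ (np.eye(n) - P)
gap = np.max(np.abs(W - W_star))

# Generalization check: one input inside the training subspace, one orthogonal to it.
f_in = U @ rng.standard_normal(k)
f_out = (np.eye(n) - P) @ rng.standard_normal(n)
err_in = np.linalg.norm(W @ f_in - A @ f_in)
err_out = np.linalg.norm(W @ f_out - A @ f_out)

print(f"max |W - W*|        = {gap:.2e}")
print(f"error inside span   = {err_in:.2e}")
print(f"error outside span  = {err_out:.2e}")
```

In this setup the gradient only ever updates the part of `W` that acts on `span(U)`, so the component of the initialization acting on the orthogonal complement is never corrected. The learned model reproduces `A` on inputs from the training subspace and returns `W0 @ f` on inputs orthogonal to it.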
Abstract
While there are many applications of ML to scientific problems that look promising, visuals can be deceiving. Using numerical analysis techniques, we rigorously quantify the accuracy, convergence rates, and generalization bounds of certain ML models applied to linear differential equations for parameter discovery or solution finding. Beyond the quantity and discretization of data, we identify that the function space of the data is critical to the generalization of the model. A similar lack of generalization is empirically demonstrated for commonly used models, including physics-specific techniques. Counterintuitively, we find that different classes of models can exhibit opposing generalization behaviors. Based on our theoretical analysis, we also introduce a new mechanistic interpretability lens on scientific models whereby Green's function representations can be extracted from the weights of black-box models. Our results inform a new cross-validation technique for measuring generalization in physical systems, which can serve as a benchmark.
