The False Promise of Zero-Shot Super-Resolution in Machine-Learned Operators
Mansi Sakarvadia, Kareem Hegazy, Amin Totounferoush, Kyle Chard, Yaoqing Yang, Ian Foster, Michael W. Mahoney
TL;DR
The paper investigates whether machine-learned operators can perform zero-shot super-resolution across discretizations, scrutinizing the Fourier Neural Operator on standard PDE benchmarks. It shows that zero-shot interpolation and extrapolation both fail, because of aliasing and because inputs at an unseen discretization are effectively out-of-distribution, and that neither physics-informed constraints nor band-limited learning resolves these issues. As a practical remedy, the authors propose multi-resolution training: training on data from multiple resolutions (preferably mostly low-resolution data with some high-resolution samples) greatly improves cross-resolution generalization at low additional cost. This work clarifies the limitations of zero-shot approaches in scientific ML and provides a scalable training paradigm for robust multi-resolution inference in surrogate PDE models, with potential impact on PDEBench-style workflows and mesh-invariant modeling.
Abstract
A core challenge in scientific machine learning, and scientific computing more generally, is modeling continuous phenomena that, in practice, are represented discretely. Machine-learned operators (MLOs) have been introduced as a means to achieve this modeling goal, as this class of architecture can perform inference at arbitrary resolution. In this work, we evaluate whether this architectural innovation is sufficient to perform "zero-shot super-resolution," namely to enable a model to serve inference on higher-resolution data than that on which it was originally trained. We comprehensively evaluate both zero-shot sub-resolution and super-resolution (i.e., multi-resolution) inference in MLOs. We decouple multi-resolution inference into two key behaviors: 1) extrapolating to varying frequency information; and 2) interpolating across varying resolutions. We empirically demonstrate that MLOs fail at both of these tasks in a zero-shot manner. Consequently, we find that MLOs are not able to perform accurate inference at resolutions different from those on which they were trained; instead, they are brittle and susceptible to aliasing. To address these failure modes, we propose a simple, computationally efficient, and data-driven multi-resolution training protocol that overcomes aliasing and provides robust multi-resolution generalization.
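To make the aliasing mentioned above concrete, the following minimal NumPy sketch samples a single sine mode on two uniform grids and reports which Fourier bin carries the most energy. It illustrates the general sampling effect only; it is not the paper's experimental setup. The grid sizes (64 and 32), the mode number (24), and the helper name `dominant_frequency` are assumptions chosen for illustration.

```python
import numpy as np


def dominant_frequency(n_points: int, true_freq: int) -> int:
    """Sample sin(2*pi*true_freq*x) on n_points uniform points in [0, 1)
    and return the index of the largest-magnitude real-FFT bin."""
    x = np.arange(n_points) / n_points
    signal = np.sin(2 * np.pi * true_freq * x)
    spectrum = np.abs(np.fft.rfft(signal))
    return int(np.argmax(spectrum))


# Illustrative values (assumptions, not taken from the paper).
# On 64 points the Nyquist limit is 32, so mode 24 is resolved correctly.
fine = dominant_frequency(64, 24)
# On 32 points the Nyquist limit is 16, so mode 24 folds back to 32 - 24 = 8.
coarse = dominant_frequency(32, 24)

print(fine, coarse)
```

On the coarser grid the same physical signal is indistinguishable from a lower-frequency mode, which is one way a model evaluated at a resolution it was not trained on can receive misleading frequency content.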
