Breaking Scale Anchoring: Frequency Representation Learning for Accurate High-Resolution Inference from Low-Resolution Training
Wenshuo Wang, Fan Zhang
TL;DR
This work identifies Scale Anchoring as an information-theoretic limitation that prevents true cross-resolution generalization in zero-shot spatiotemporal forecasting. It introduces Frequency Representation Learning (FRL), an architecture-agnostic framework that normalizes frequencies across resolutions and enforces spectral consistency to mitigate high-frequency extrapolation gaps. Theoretical analysis links Nyquist-bound learning to frequency blindness, and experiments on 3D fluid simulations and ERA5-based weather forecasting validate this mechanism. Empirically, FRL yields substantial high-resolution accuracy gains with modest training-time and memory overhead, demonstrating its potential to enable reliable multi-resolution inference in spatiotemporal forecasting while clarifying the boundaries of its applicability.
Abstract
Zero-Shot Super-Resolution Spatiotemporal Forecasting requires a deep learning model to be trained on low-resolution data and deployed for inference on high-resolution data. Existing studies treat similar error across resolutions as evidence of successful multi-resolution generalization. However, deep learning models that serve as alternatives to numerical solvers should reduce error as resolution increases. The fundamental limitation is that the highest frequency of the underlying physical laws that low-resolution data can represent is bounded by its Nyquist frequency, which makes it difficult for models to process signals containing unseen frequency components during high-resolution inference. As a result, errors remain anchored at their low-resolution level, and this is incorrectly interpreted as successful generalization. We define this phenomenon as a new problem distinct from existing issues: Scale Anchoring. To address it, we propose architecture-agnostic Frequency Representation Learning (FRL). It alleviates Scale Anchoring through resolution-aligned frequency representations and spectral consistency training: on grids with higher Nyquist frequencies, the frequency response of FRL-enhanced variants in high-frequency bands is more stable. This allows errors to decrease with resolution and to significantly outperform baselines within our task and resolution range, while incurring only modest computational overhead.
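The Nyquist constraint described above can be illustrated with a small, self-contained sketch. It is not the FRL method itself, only a demonstration of why a coarse grid cannot represent frequencies that a fine grid can. The function names, the unit-length periodic domain, and the grid sizes (16 and 64 points) are assumptions chosen for illustration.

```python
import cmath
import math


def nyquist_frequency(n_points):
    """Highest frequency (cycles per unit length) representable on a
    uniform grid of n_points samples over the periodic domain [0, 1)."""
    return n_points // 2


def dominant_frequency(n_points, k):
    """Sample sin(2*pi*k*x) on n_points uniform samples and return the
    non-negative DFT bin (0 .. n_points // 2) with the largest magnitude."""
    samples = [math.sin(2 * math.pi * k * i / n_points) for i in range(n_points)]
    best_bin, best_mag = 0, -1.0
    for m in range(n_points // 2 + 1):
        # Naive discrete Fourier transform coefficient for bin m.
        coeff = sum(
            s * cmath.exp(-2j * math.pi * m * i / n_points)
            for i, s in enumerate(samples)
        )
        if abs(coeff) > best_mag:
            best_bin, best_mag = m, abs(coeff)
    return best_bin


if __name__ == "__main__":
    k = 12  # frequency of the underlying signal, in cycles per unit length
    for n in (16, 64):
        print(n, nyquist_frequency(n), dominant_frequency(n, k))
```

On the 16-point grid the Nyquist frequency is 8, so the 12-cycle signal aliases and appears as a 4-cycle signal. On the 64-point grid (Nyquist frequency 32) it is recovered at bin 12. A model trained only on the coarse grid therefore never observes this component at its true frequency.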
