On Transferring Transferability: Towards a Theory for Size Generalization
Eitan Levin, Yuxin Ma, Mateo Díaz, Soledad Villar
TL;DR
The paper develops a unifying theory of size generalization by embedding finite-sized problem instances into a common limit space $V_\infty$ with a limit group $\mathsf{G}_\infty$. It proves that transferability is equivalent to continuity of the limit extension $f_\infty$ under a symmetrized metric, which yields size-generalization bounds connecting model Lipschitzness, data geometry, and sampling. The framework is instantiated for sets, graphs, and point clouds, leading to concrete transferable architectures (e.g., GGNN, Continuous GGNN, SVD-DS), refined variants of DeepSet and IGNs, and a principled path to designing new transferable models. Empirical results on size-generalization tasks show that aligning a model's inductive biases with the limit space yields robust performance as input size increases, with trade-offs in computational efficiency depending on the chosen representation. Overall, the work provides both a theoretical foundation and practical tools for reliable size generalization in domains such as graphs, sets, and 3D point clouds.
Abstract
Many modern learning tasks require models that can take inputs of varying sizes. Consequently, dimension-independent architectures have been proposed for domains where the inputs are graphs, sets, and point clouds. Recent work on graph neural networks has explored whether a model trained on low-dimensional data can transfer its performance to higher-dimensional inputs. We extend this body of work by introducing a general framework for transferability across dimensions. We show that transferability corresponds precisely to continuity in a limit space formed by identifying small problem instances with equivalent large ones. This identification is driven by the data and the learning task. We instantiate our framework on existing architectures and implement the changes necessary to ensure their transferability. Finally, we provide principles for designing new transferable models. Numerical experiments support our findings.
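As a rough illustration of what "identifying small problem instances with equivalent large ones" can mean, the sketch below considers sets and assumes (this choice is ours, for illustration only) that a set of size $n$ is identified with the size-$nk$ input obtained by repeating each element $k$ times. The functions `phi`, `rho`, `deepset`, and `duplicate` are hypothetical stand-ins, not the authors' implementation. Under this assumed identification, a DeepSet-style model with mean pooling returns the same value on both inputs, while sum pooling does not.

```python
import math

def phi(x):
    # Hypothetical per-element feature map (stand-in for a learned network).
    return [math.tanh(x), x * x]

def rho(z):
    # Hypothetical readout applied to the pooled features.
    return z[0] + 0.5 * z[1]

def deepset(xs, pool):
    # DeepSet-style model: rho(pool over i of phi(x_i)).
    feats = [phi(x) for x in xs]
    pooled = [sum(col) for col in zip(*feats)]
    if pool == "mean":
        pooled = [s / len(xs) for s in pooled]
    return rho(pooled)

def duplicate(xs, k):
    # Assumed embedding: repeat each element k times (size n -> size n*k).
    return [x for x in xs for _ in range(k)]

small = [0.1, -0.4, 0.7]
large = duplicate(small, 4)

# Output gap between the small instance and its assumed large equivalent.
mean_gap = abs(deepset(small, "mean") - deepset(large, "mean"))
sum_gap = abs(deepset(small, "sum") - deepset(large, "sum"))

print(f"mean pooling gap: {mean_gap:.2e}")
print(f"sum pooling gap:  {sum_gap:.2e}")
```

The mean-pooled gap vanishes up to floating-point rounding, whereas the sum-pooled output grows with the duplication factor. This only checks consistency across sizes under the assumed embedding; it does not verify continuity in the limit space.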
