Invariant Kernels: Rank Stabilization and Generalization Across Dimensions
Mateo Díaz, Dmitriy Drusvyatskiy, Jack Kendrick, Rekha R. Thomas
TL;DR
This paper investigates how symmetry in data, captured by group invariances, determines the rank and generalization properties of fixed-degree polynomial kernels. By linking kernel rank to the dimension of the invariant polynomial space $\mathbb{R}[V]_m^G$, it shows that in several natural settings (permutations, set-permutations, graphs, and point clouds) the invariant dimension, and hence the kernel rank, can be independent of the ambient data dimension, which enables dimension-free learning and efficient computation. The authors develop free invariant bases (elementary symmetric polynomials and polarized variants), show how to express invariant kernels as dimension-stable bilinear forms in these bases, and give a Monte Carlo approach for estimating invariant dimensions together with a cross-dimension (minimax) generalization framework. They connect rank stabilization to representation stability, providing a principled path to learning across varying dimensions with finite-parameter representations, and validate the theory through numerical experiments on set classification, cross-dimension regression, and invariant-dimension estimation. The results offer a rigorous, scalable approach for exploiting invariances in kernel methods, with practical implications for learning across dimensions in domains such as graphs, point clouds, and unordered sets.
Abstract
Symmetry arises often when learning from high-dimensional data. For example, data sets consisting of point clouds, graphs, and unordered sets appear routinely in contemporary applications and exhibit rich underlying symmetries. Understanding how symmetry benefits the statistical and numerical efficiency of learning algorithms is an active area of research. In this work, we show that symmetry has a pronounced impact on the rank of kernel matrices. Specifically, we compute the rank of a polynomial kernel of fixed degree that is invariant under various groups acting independently on its two arguments. In concrete circumstances, including the three aforementioned examples, symmetry dramatically decreases the rank, making it independent of the data dimension. In such settings, we show that a simple regression procedure is minimax optimal for estimating an invariant polynomial from finitely many samples drawn across different dimensions. We complete the paper with numerical experiments that illustrate our findings.
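To make the rank stabilization concrete, the following is a minimal sketch for one illustrative case: the symmetric group acting on $\mathbb{R}^d$ by permuting coordinates. It relies on the standard fact that the homogeneous permutation-invariant polynomials of degree $k$ in $d$ variables have a basis indexed by partitions of $k$ into at most $d$ parts. The function names `partitions_at_most` and `invariant_dim` are hypothetical, and counting invariants of degree at most $m$ (rather than exactly $m$) is an assumption about the convention, not something taken from the paper.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def partitions_at_most(k: int, parts: int) -> int:
    """Number of partitions of k into at most `parts` positive parts."""
    if k == 0:
        return 1  # the empty partition
    if parts == 0:
        return 0  # a positive integer cannot be split into zero parts
    # Case 1: the partition uses fewer than `parts` parts.
    total = partitions_at_most(k, parts - 1)
    # Case 2: it uses exactly `parts` parts; removing 1 from each part
    # leaves a partition of k - parts into at most `parts` parts.
    if k >= parts:
        total += partitions_at_most(k - parts, parts)
    return total


def invariant_dim(d: int, m: int) -> int:
    """Dimension of permutation-invariant polynomials of degree <= m in d variables.

    Assumption (illustrative): the group is S_d permuting coordinates, and
    degrees 0..m are all counted.
    """
    return sum(partitions_at_most(k, d) for k in range(m + 1))


if __name__ == "__main__":
    m = 3
    for d in (1, 2, 3, 10, 100):
        print(f"d = {d:3d}: invariant dimension = {invariant_dim(d, m)}")
```

For fixed degree $m$, the count stops changing once $d \ge m$, since a partition of $k \le m$ never has more than $m$ parts. This is the simplest instance of a rank that is independent of the data dimension.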
