A Galois theorem for machine learning: Functions on symmetric matrices and point clouds via lightweight invariant features

Ben Blum-Smith, Ningyuan Huang, Marco Cuturi, Soledad Villar

TL;DR

The work addresses learning invariant functions on symmetric matrices and point clouds under permutation and Euclidean symmetries by developing a Galois-inspired framework that yields generically separating invariant features. These features, initially $O(n^2)$ in number for both graphs and point clouds, can be reduced to $O(n)$ for point clouds in fixed dimension via low-rank embeddings and permutation-orbit separators, enabling scalable universal approximation when combined with DeepSets. Theoretical results establish generically separating invariants for the relevant group actions, and the practical architectures DS-CI and OI-DS are evaluated on molecule property regression and Gromov–Wasserstein distance prediction for point clouds. The approach offers a scalable, theoretically grounded alternative to full invariant generation, with potential applications in chemistry, geometry, and cosmology.

Abstract

In this work, we present a mathematical formulation for machine learning of (1) functions on symmetric matrices that are invariant with respect to the action of permutations by conjugation, and (2) functions on point clouds that are invariant with respect to rotations, reflections, and permutations of the points. To achieve this, we provide a general construction of generically separating invariant features using ideas inspired by Galois theory. We construct $O(n^2)$ invariant features derived from generators for the field of rational functions on $n\times n$ symmetric matrices that are invariant under joint permutations of rows and columns. We show that these invariant features can separate all distinct orbits of symmetric matrices except for a measure zero set; such features can be used to universally approximate invariant functions on almost all weighted graphs. For point clouds in a fixed dimension, we prove that the number of invariant features can be reduced, generically without losing expressivity, to $O(n)$, where $n$ is the number of points. We combine these invariant features with DeepSets to learn functions on symmetric matrices and point clouds with varying sizes. We empirically demonstrate the feasibility of our approach on molecule property regression and point cloud distance prediction.
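
To make the two symmetries in the abstract concrete, the following is a minimal sketch, not the paper's feature construction. It assumes NumPy, and the names `sorted_entry_features` and `gram_matrix` are hypothetical helpers introduced only for illustration. It checks numerically that the Gram matrix of a point cloud is unchanged by an orthogonal transformation of the points, that permuting the points conjugates the Gram matrix by a permutation, and that the sorted diagonal and sorted off-diagonal entries are unaffected by that conjugation.

```python
import numpy as np


def sorted_entry_features(X):
    """Sorted diagonal entries followed by sorted strictly-upper-triangular
    entries of a symmetric matrix X (hypothetical illustrative features)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    diag = np.sort(np.diag(X))
    off = np.sort(X[np.triu_indices(n, k=1)])
    return np.concatenate([diag, off])


def gram_matrix(P):
    """Gram matrix of a point cloud P of shape (n, d); each row is a point."""
    P = np.asarray(P, dtype=float)
    return P @ P.T


rng = np.random.default_rng(0)
n, d = 5, 3
P = rng.normal(size=(n, d))

# Random orthogonal matrix (rotation or reflection) from a QR factorization.
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
perm = rng.permutation(n)

# Reorder the points and apply the orthogonal map to every point.
P_moved = P[perm] @ Q.T

X = gram_matrix(P)
X_moved = gram_matrix(P_moved)

# The orthogonal map cancels in the Gram matrix; the permutation acts
# by simultaneously permuting rows and columns.
conjugated = np.allclose(X_moved, X[np.ix_(perm, perm)])

feats = sorted_entry_features(X)
feats_moved = sorted_entry_features(X_moved)
invariant = np.allclose(feats, feats_moved)
```

These $n + n(n-1)/2$ sorted entries are invariant, but they discard which entries share a row or column, so on their own they should not be expected to separate orbits. Closing that kind of gap with additional invariants is what the generically separating construction described in the abstract is for.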

Paper Structure (12 sections, 11 theorems, 59 equations, 2 figures, 2 tables)

Key Result

Theorem 3.1

Let $X, \Gamma, G, \mathcal{F}$ be as above. Suppose $f_1,\dots,f_r: X\rightarrow\mathbb{F}$ are $\Gamma$-invariant functions that are generically separating for $\Gamma$. If $f^\star_1,\dots,f^\star_s: X\rightarrow\mathbb{F}$ are $G$-invariant functions belonging to $\mathcal{F}$, such that for $\gamma$ […] then $f_1,\dots,f_r,f^\star_1,\dots,f^\star_s$ are generically separating for $G$.

Figures (2)

  • Figure 1: Example point clouds from ModelNet10: (Top) original; (Bottom) downsampled to 100 randomly sampled points per point cloud.
  • Figure 2: Performance of GW-based distance regression. The cross-class pairwise distances are shown in (a) for the training set and (b) for the test set. The left, middle, and right panels in (a) and (b) correspond to the target distances, the distances predicted by DS-CI, and the distances predicted by OI-DS, respectively.

Theorems & Definitions (30)

  • Theorem 3.1
  • proof
  • Proposition 4.1
  • Remark 4.2
  • Remark 4.3
  • proof: Proof of Proposition 4.1
  • Theorem 4.4
  • proof
  • Proposition 5.1
  • proof
  • ...and 20 more