Permutation Equivariant Neural Networks for Symmetric Tensors

Edward Pearce-Crump

TL;DR

This work addresses learning linear $S_n$-equivariant maps between the symmetric power spaces $S^k(\mathbb{R}^n)$ and $S^l(\mathbb{R}^n)$. It provides two exact characterizations of all such maps, one through an orbit basis and one through a diagram basis, and introduces map label notation that allows a memory-efficient implementation by avoiding explicit storage of large weight matrices. The paper shows that these permutation-equivariant linear functions are highly data-efficient on synthetic tasks and can generalize across tensor sizes, with substantial computational speedups when using the diagram-based map-label approach. Together, the results support exact and transferable learning with symmetric tensors under permutation symmetry, with potential applications across physics, chemistry, and graph-structured data.

Abstract

Incorporating permutation equivariance into neural networks has proven to be useful in ensuring that models respect symmetries that exist in data. Symmetric tensors, which naturally appear in statistics, machine learning, and graph theory, are essential for many applications in physics, chemistry, and materials science, amongst others. However, existing research on permutation equivariant models has not explored symmetric tensors as inputs, and most prior work on learning from these tensors has focused on equivariance to Euclidean groups. In this paper, we present two different characterisations of all linear permutation equivariant functions between symmetric power spaces of $\mathbb{R}^n$. We show on two tasks that these functions are highly data efficient compared to standard MLPs and have potential to generalise well to symmetric tensors of different sizes.

Paper Structure

This paper contains 18 sections, 9 theorems, 138 equations, 3 figures, 3 tables.

Key Result

Proposition 3.3

Linear permutation equivariant scalar-valued and vector-valued functions on symmetric tensors in $(\mathbb{R}^{n})^{\otimes k}$ are elements of $\mathop{\mathrm{Hom}}\nolimits_{S_n}(S^k(\mathbb{R}^{n}),S^0(\mathbb{R}^{n}))$ and $\mathop{\mathrm{Hom}}\nolimits_{S_n}(S^k(\mathbb{R}^{n}),S^1(\mathbb{R}^{n}))$ ...
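To make the statement concrete for $k = 2$, the following is a minimal numerical sketch, not taken from the paper. It assumes that $S_n$ acts on a symmetric matrix by permuting both indices simultaneously and on a vector by permuting its entries. The helper names (`act_on_matrix`, `act_on_vector`, `is_invariant`, `is_equivariant`) and the choice of example maps are illustrative only. The trace and the total sum serve as scalar-valued (invariant) maps, and the diagonal and the row sums as vector-valued (equivariant) maps; taking the first row is included as a map that fails the check.

```python
import numpy as np

def act_on_matrix(perm, t):
    """Permute both indices of a matrix: out[perm[i], perm[j]] = t[i, j]."""
    out = np.empty_like(t)
    out[np.ix_(perm, perm)] = t
    return out

def act_on_vector(perm, v):
    """Permute the entries of a vector: out[perm[i]] = v[i]."""
    out = np.empty_like(v)
    out[perm] = v
    return out

def is_invariant(f, t, perm):
    """Check f(sigma . T) == f(T) for a scalar-valued map f."""
    return bool(np.isclose(f(act_on_matrix(perm, t)), f(t)))

def is_equivariant(f, t, perm):
    """Check f(sigma . T) == sigma . f(T) for a vector-valued map f."""
    return bool(np.allclose(f(act_on_matrix(perm, t)), act_on_vector(perm, f(t))))

rng = np.random.default_rng(0)
n = 5
a = rng.standard_normal((n, n))
t = (a + a.T) / 2                    # a random symmetric 2-tensor
perm = np.roll(np.arange(n), -1)     # the cyclic shift [1, 2, 3, 4, 0]

# Illustrative linear maps (assumed examples, not the paper's bases).
scalar_maps = {"trace": np.trace, "total_sum": np.sum}
vector_maps = {
    "diagonal": lambda m: np.diagonal(m).copy(),
    "row_sums": lambda m: m.sum(axis=1),
}

invariant_ok = {name: is_invariant(f, t, perm) for name, f in scalar_maps.items()}
equivariant_ok = {name: is_equivariant(f, t, perm) for name, f in vector_maps.items()}

# Taking the first row is linear but singles out index 0, so it is not equivariant.
first_row_ok = is_equivariant(lambda m: m[0].copy(), t, perm)

print(invariant_ok, equivariant_ok, first_row_ok)
```

A single random tensor and a single permutation only give evidence of equivariance, not a proof; the check is meant as a sanity test of the definitions rather than a characterisation of the full space of maps.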

Figures (3)

  • Figure 1: For the $(9,7)$-bipartition $\{ [3,1], [2,2], [2,1], [1,3], [1,0] \}$, we show how to obtain the element in the corresponding $S_6$ orbit of $S[6]^{7} \times S[6]^{9}$ that comes from labelling its blocks with the length-$5$ tuple $\{6,2,3,1,4\}$. By reordering the labelled green nodes of the $(9,7)$-orbit bipartition diagram and propagating these values to the ends of the wires, we see that this element is $\binom{1112236}{122334666}$.
  • Figure 2: After labelling the central green nodes of a $(5,4)$-bipartition diagram with the tuple $(1,2,1)$ and reordering the spiders into ascending numerical order, we fuse together any spiders that are labelled with the same value.
  • Figure 3: Data efficiency for the synthetic $S_{12}$-invariant task. The shaded regions depict 95% confidence intervals taken over 3 runs.
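The labelling procedure described in the caption of Figure 1 can be sketched in a few lines. This is an illustrative reading of the caption rather than the paper's implementation: the function name `orbit_element` is hypothetical, and the sketch assumes that a block $[a, b]$ labelled with the value $v$ contributes $a$ copies of $v$ to the length-$9$ row and $b$ copies of $v$ to the length-$7$ row, with each row then sorted into ascending order.

```python
def orbit_element(bipartition, labels):
    """Return (top_row, bottom_row) for a labelled (k, l)-bipartition.

    Assumed convention: block [a, b] with label v puts `a` copies of v
    in the bottom (length-k) row and `b` copies in the top (length-l) row.
    """
    if len(bipartition) != len(labels):
        raise ValueError("need exactly one label per block")
    top, bottom = [], []
    for (a, b), label in zip(bipartition, labels):
        bottom.extend([label] * a)
        top.extend([label] * b)
    # Sorting mirrors "reordering the labelled nodes" in the caption.
    return sorted(top), sorted(bottom)

# The (9,7)-bipartition and block labels from the Figure 1 caption.
blocks = [(3, 1), (2, 2), (2, 1), (1, 3), (1, 0)]
labels = (6, 2, 3, 1, 4)

top_row, bottom_row = orbit_element(blocks, labels)
print("".join(map(str, top_row)))     # 1112236
print("".join(map(str, bottom_row)))  # 122334666
```

Under the stated convention this reproduces the element $\binom{1112236}{122334666}$ given in the caption.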

Theorems & Definitions (53)

  • Example 3.1
  • Definition 3.2
  • Proposition 3.3
  • Proposition 4.1
  • proof
  • Example 4.2
  • Definition 4.3
  • Example 4.4
  • Remark 4.5
  • Example 4.6
  • ...and 43 more