
Connecting Permutation Equivariant Neural Networks and Partition Diagrams

Edward Pearce-Crump

TL;DR

All of the weight matrices that appear in permutation equivariant neural networks whose layer spaces are tensor powers of $\mathbb{R}^{n}$ can be obtained from Schur-Weyl duality between the symmetric group and the partition algebra. Adapting this duality yields a simple diagrammatic method for calculating the weight matrices themselves.

Abstract

Permutation equivariant neural networks are often constructed using tensor powers of $\mathbb{R}^{n}$ as their layer spaces. We show that all of the weight matrices that appear in these neural networks can be obtained from Schur-Weyl duality between the symmetric group and the partition algebra. In particular, we adapt Schur-Weyl duality to derive a simple, diagrammatic method for calculating the weight matrices themselves.

Paper Structure (20 sections, 7 theorems, 55 equations, 2 figures)


Key Result

Proposition 9

The basis elements of $\mathop{\mathrm{Hom}}\nolimits_{S_n}((\mathbb{R}^{n})^{\otimes k}, (\mathbb{R}^{n})^{\otimes l})$ are in bijective correspondence with the orbits coming from the action of $S_n$ on the $(l+k)$-fold Cartesian product set $[n]^{l+k}$.
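
The correspondence can be checked by brute force in small cases. The Python sketch below is illustrative and not taken from the paper; the helper names `orbit_pattern` and `count_orbits` are hypothetical. It assumes the standard fact that two tuples in $[n]^{l+k}$ lie in the same $S_n$-orbit exactly when they have the same pattern of equal entries, so counting orbits amounts to counting set partitions of $l+k$ elements with at most $n$ blocks.

```python
from itertools import product


def orbit_pattern(t):
    """Relabel the entries of t by order of first appearance.

    Tuples with the same pattern of equal entries get the same result,
    so the result serves as a canonical label for the S_n-orbit of t.
    """
    labels = {}
    return tuple(labels.setdefault(x, len(labels)) for x in t)


def count_orbits(n, m):
    """Count the orbits of S_n acting diagonally on [n]^m (here m = l + k)."""
    return len({orbit_pattern(t) for t in product(range(n), repeat=m)})


# Small cases matching the figures on this page.
orbits_n4_m2 = count_orbits(4, 2)  # layer R^4 -> R^4, so l + k = 2
orbits_n2_m4 = count_orbits(2, 4)  # layer (R^2)^{x2} -> (R^2)^{x2}, so l + k = 4
orbits_n4_m4 = count_orbits(4, 4)  # n >= l + k, so every set partition occurs
```

Under these assumptions the three counts are 2, 8 and 15, in line with the two basis matrices of Figure 1, the eight diagrams of Figure 2, and $\mathop{\mathrm{B}}\nolimits(4) = 15$.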

Figures (2)

  • Figure 1: We obtain the two basis matrices whose weighted linear combination gives all of the possible weight matrices that can appear in an $S_4$-equivariant neural network from $\mathbb{R}^4$ to $\mathbb{R}^4$. We obtain these matrices from the orbit basis diagrams in $P_1^1(4)$ that have at most $4$ blocks. For each orbit basis diagram, to calculate the $(I,J)$-entry of its associated basis matrix, we place the $I$-tuple on the top row of the diagram and the $J$-tuple on the bottom row of the diagram and see if they consistently label the diagram's blocks such that no two blocks have the same label. If the labelling is consistent, then we put a $1$ in the $(I,J)$-entry of the matrix, otherwise $0$.
  • Figure 2: We show the eight orbit basis diagrams in $P_2^2(2)$ that have at most $2$ blocks. They are needed to calculate the weight matrix for an $S_2$-equivariant linear layer function $(\mathbb{R}^{2})^{\otimes 2} \rightarrow (\mathbb{R}^{2})^{\otimes 2}$. As the number of orbit basis diagrams in $P_2^2(2)$ is $\mathop{\mathrm{B}}\nolimits(4) = 15$, this example highlights that the number of weights that appear in a permutation equivariant weight matrix depends on the relationship between the degree $n$ of the symmetric group $S_n$ and the sum of the tensor power orders $l + k$ that define the layers of the permutation equivariant neural network.
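
The labelling rule described in the Figure 1 caption can be sketched in code. This is a minimal illustration rather than the paper's own implementation, and `orbit_pattern` and `orbit_basis_matrices` are hypothetical names. It assumes that rows are indexed by tuples $I \in [n]^l$ and columns by tuples $J \in [n]^k$, and that $(I,J)$ consistently labels a diagram with distinct labels on distinct blocks exactly when the equality pattern of the concatenated tuple matches that diagram's set partition.

```python
from itertools import product


def orbit_pattern(t):
    """Relabel entries by order of first appearance (a canonical orbit label)."""
    labels = {}
    return tuple(labels.setdefault(x, len(labels)) for x in t)


def orbit_basis_matrices(n, k, l):
    """Return {pattern: 0/1 matrix} for S_n-equivariant maps (R^n)^{x k} -> (R^n)^{x l}.

    Each pattern stands for one orbit basis diagram with at most n blocks.
    The (I, J)-entry of its matrix is 1 exactly when the concatenated tuple
    I + J has that pattern, and 0 otherwise.
    """
    rows = list(product(range(n), repeat=l))  # I-tuples (top row of the diagram)
    cols = list(product(range(n), repeat=k))  # J-tuples (bottom row of the diagram)
    mats = {}
    for i, I in enumerate(rows):
        for j, J in enumerate(cols):
            key = orbit_pattern(I + J)
            if key not in mats:
                mats[key] = [[0] * len(cols) for _ in rows]
            mats[key][i][j] = 1
    return mats


# Figure 1 setting: S_4-equivariant maps R^4 -> R^4.
fig1 = orbit_basis_matrices(4, 1, 1)
# Figure 2 setting: S_2-equivariant maps (R^2)^{x2} -> (R^2)^{x2}.
fig2 = orbit_basis_matrices(2, 2, 2)
```

In the Figure 1 setting this produces two matrices: the pattern `(0, 0)` gives the $4 \times 4$ identity and the pattern `(0, 1)` gives the matrix with zeros on the diagonal and ones elsewhere. In the Figure 2 setting it produces eight $4 \times 4$ matrices, one per diagram with at most two blocks.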

Theorems & Definitions (26)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Example 5
  • Example 6
  • Remark 7
  • Example 8
  • Proposition 9
  • proof
  • ...and 16 more