Any-Subgroup Equivariant Networks via Symmetry Breaking

Abhinav Goel, Derek Lim, Hannah Lawrence, Stefanie Jegelka, Ningyuan Huang

Abstract

The inclusion of symmetries as an inductive bias, known as equivariance, often improves generalization on geometric data (e.g. grids, sets, and graphs). However, equivariant architectures are usually highly constrained, designed for symmetries chosen a priori, and not applicable to datasets with other symmetries. This precludes the development of flexible, multi-modal foundation models capable of processing diverse data equivariantly. In this work, we build a single model -- the Any-Subgroup Equivariant Network (ASEN) -- that can be simultaneously equivariant to several groups, simply by modulating a certain auxiliary input feature. In particular, we start with a fully permutation-equivariant base model, and then obtain subgroup equivariance by using a symmetry-breaking input whose automorphism group is that subgroup. However, finding an input with the desired automorphism group is computationally hard. We overcome this by relaxing from exact to approximate symmetry breaking, leveraging the notion of 2-closure to derive fast algorithms. Theoretically, we show that our subgroup-equivariant networks can simulate equivariant MLPs, and their universality can be guaranteed if the base model is universal. Empirically, we validate our method on symmetry selection for graph and image tasks, as well as multitask and transfer learning for sequence tasks, showing that a single network equivariant to multiple permutation subgroups outperforms both separate equivariant models and a single non-equivariant model.

Paper Structure

This paper contains 35 sections, 8 theorems, 20 equations, 12 figures, 7 tables, 1 algorithm.

Key Result

Proposition 1

Let $h_\theta: \mathcal{X} \times \mathcal{V} \to \mathcal{Y}$ be $\mathbf{G}$-equivariant, and let $\mathrm{Aut}(\mathbf{v}) = G$. Then $f_\theta(x) := h_\theta(x, \mathbf{v})$ is equivariant to $G$. If additionally $h_\theta$ is injective in the input $\mathbf{v}$, then $f_\theta$ is not equivaria
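The mechanism in Proposition 1 can be checked numerically on a small case. The sketch below is an illustration only, not the ASEN architecture: `base_model` is a hypothetical one-layer permutation-equivariant map, `automorphisms` is a brute-force search that is feasible only for very small $n$ (it is not the paper's 2-closure algorithm), and the symmetry-breaking input $\mathbf{v}$ is assumed to be the adjacency matrix of an undirected 4-node path.

```python
import itertools
import numpy as np

def perm_matrix(perm):
    """Permutation matrix P with (P x)[i] = x[perm[i]]."""
    return np.eye(len(perm))[list(perm)]

def base_model(x, A):
    """Toy fully permutation-equivariant map: h(Px, P A P^T) = P h(x, A)."""
    return np.tanh(x + A @ x)

def automorphisms(A):
    """Brute-force Aut(A) = {P : P A P^T = A}; only feasible for tiny n."""
    n = A.shape[0]
    return [p for p in itertools.permutations(range(n))
            if np.array_equal(perm_matrix(p) @ A @ perm_matrix(p).T, A)]

# Assumed symmetry-breaking input: adjacency of the undirected path 0-1-2-3.
A = np.zeros((4, 4))
for i in range(3):
    A[i, i + 1] = A[i + 1, i] = 1.0

def f(x):
    """Subgroup-equivariant model obtained by fixing the auxiliary input."""
    return base_model(x, A)

x = np.array([[0.1], [0.2], [0.3], [0.4]])
aut = automorphisms(A)

def is_equivariant(perm):
    """Check f(Px) == P f(x) at the single test input x."""
    P = perm_matrix(perm)
    return np.allclose(f(P @ x), P @ f(x))

print(aut)                                    # identity and the path reversal
print(all(is_equivariant(p) for p in aut))    # equivariant on Aut(A)
print(is_equivariant((1, 0, 2, 3)))           # a swap outside Aut(A)
```

For this toy model, $f(Px) = P f(x)$ holds exactly when $AP = PA$, which is why the two automorphisms of the path pass the check while the swap of the first two nodes does not.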

Figures (12)

  • Figure 1: Example symmetry breaking objects as positional features $A^{(1)}$ and edge features $A^{(2)}$ for encoding subgroup symmetries in $4$-node paths. These symmetries are explored further in the experiments section.
  • Figure 2: ASEN Architecture to model any permutation subgroup-equivariant functions.
  • Figure 3: Pairwise distances of learned positional encodings of Transformer on Pathfinder-64, with test accuracy shown at the top: imposing local permutation symmetry improves performance.
  • Figure 4: ASEN with the correct group ("Equivariant") converges faster and to a lower loss than its trivial symmetry counterpart ("Non-equivariant").
  • Figure 5: Initial edge weights (left) and trained weights (right): ASEN learns more symmetries from data.
  • ...and 7 more figures

Theorems & Definitions (12)

  • Proposition 1
  • Lemma 1
  • Theorem 1
  • Theorem 2
  • Proposition 1
  • proof
  • Lemma 1
  • proof
  • Theorem 2
  • proof
  • ...and 2 more