Improving Equivariant Networks with Probabilistic Symmetry Breaking

Hannah Lawrence, Vasco Portilheiro, Yan Zhang, Sékou-Oumar Kaba

TL;DR

This work tackles a core limitation of equivariant networks, namely that their outputs must retain the self-symmetries of their inputs, by reframing prediction as sampling from equivariant conditional distributions ${\mathbb{P}(Y|X)}$ and introducing randomized canonicalization to enable symmetry breaking. The authors prove a representation theorem using an inversion kernel and propose SymPE, a symmetry-breaking positional encoding that injects a sampled group element into inputs in an equivariant manner. They also show that equivariant noise injection sits within the same representational class, derive generalization benefits for symmetry breaking, and connect to existing relaxed-equivariance frameworks. Empirically, SymPE improves performance across graph autoencoding, diffusion-based graph generation, and Ising-model ground-state prediction, breaking symmetries while preserving the inductive bias of symmetry. Overall, the approach provides a principled framework for expanding the expressive power of equivariant networks without discarding their symmetry-driven generalization advantages.

Abstract

Equivariance encodes known symmetries into neural networks, often enhancing generalization. However, equivariant networks cannot break symmetries: the output of an equivariant network must, by definition, have at least the same self-symmetries as the input. This poses an important problem, both (1) for prediction tasks on domains where self-symmetries are common, and (2) for generative models, which must break symmetries in order to reconstruct from highly symmetric latent spaces. This fundamental limitation can be addressed by considering equivariant conditional distributions, instead of equivariant functions. We present novel theoretical results that establish necessary and sufficient conditions for representing such distributions. Concretely, this representation provides a practical framework for breaking symmetries in any equivariant network via randomized canonicalization. Our method, SymPE (Symmetry-breaking Positional Encodings), admits a simple interpretation in terms of positional encodings. This approach expands the representational power of equivariant networks while retaining the inductive bias of symmetry, which we justify through generalization bounds. Experimental results demonstrate that SymPE significantly improves performance of group-equivariant and graph neural networks across diffusion models for graphs, graph autoencoders, and lattice spin system modeling.
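
The claim that an equivariant function cannot break symmetries follows in one line from the definition. The short derivation below is a sketch; the stabilizer notation $G_x$ is introduced here for illustration and is not taken from the text above.

```latex
% Assumption: G acts on both \mathcal{X} and \mathcal{Y}, and
% f : \mathcal{X} \to \mathcal{Y} satisfies f(gx) = g f(x) for all g \in G.
% Notation (ours): G_x = \{ g \in G : gx = x \} is the set of self-symmetries of x.
\begin{align*}
  g \in G_x
  \;\Longrightarrow\;
  g\, f(x) = f(gx) = f(x)
  \;\Longrightarrow\;
  g \in G_{f(x)},
  \qquad\text{hence}\qquad
  G_x \subseteq G_{f(x)} .
\end{align*}
```

Every self-symmetry of the input is therefore a self-symmetry of the output, which is exactly the restriction that sampling from an equivariant conditional distribution is meant to lift.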

Paper Structure

This paper contains 44 sections, 9 theorems, 36 equations, 7 figures, 5 tables, 1 algorithm.

Key Result

Theorem 3.4

${\mathbb{P}\left(Y|X\right)}$ is equivariant if and only if $Y \overset{\text{a.s.}}{=} f(X,\tilde{g},\epsilon)$ for a function $f:\mathcal{X}\times G \times(0,1)\to\mathcal{Y}$ jointly equivariant in its first two inputs (i.e. $f(hx,hg,\epsilon)=hf(x,g,\epsilon)$), noise $\epsilon\sim\mathop{\mathrm{Unif}}\nolimits(0,1)$, and $\tilde{g}|X$ distributed according to some inversion kernel.
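
To make the roles of $f$ and $\tilde{g}$ concrete, here is a minimal NumPy sketch for the cyclic group $C_n$ acting on length-$n$ vectors by cyclic shifts. Everything in it is an illustrative assumption rather than the authors' implementation: the toy layer `equivariant_layer`, the one-hot encoding of the group element, the lexicographic choice of canonical form in `sample_inversion_kernel`, and the omission of the noise $\epsilon$.

```python
import numpy as np

def equivariant_layer(v, w_self=1.0, w_nbr=0.5):
    """Toy shift-equivariant map: a circular two-tap filter followed by tanh."""
    return np.tanh(w_self * v + w_nbr * np.roll(v, 1))

def sample_inversion_kernel(x, rng):
    """Sample g uniformly among the shifts whose inverse maps x to its canonical form.

    The canonical form is taken (by assumption) to be the lexicographically
    smallest cyclic shift of x. If x has self-symmetries, several g qualify.
    """
    n = len(x)
    undone = [tuple(np.roll(x, -g)) for g in range(n)]  # g^{-1} x for each g
    canonical = min(undone)
    candidates = [g for g in range(n) if undone[g] == canonical]
    return int(rng.choice(candidates))

def f(x, g, w_pe=2.0):
    """Jointly equivariant f(x, g): add a one-hot encoding of g, then apply the layer."""
    pe = np.zeros(len(x))
    pe[g % len(x)] = 1.0  # shifting x by h moves this marker from g to g + h
    return equivariant_layer(x + w_pe * pe)

rng = np.random.default_rng(0)
x = np.ones(4)                              # fully symmetric input
y_det = equivariant_layer(x)                # deterministic equivariant output
g_tilde = sample_inversion_kernel(x, rng)   # uniform over C_4 for this x
y_broken = f(x, g_tilde)                    # output with the symmetry broken
```

For the constant input, `y_det` is again constant, while `y_broken` singles out the position marked by `g_tilde`. Because `g_tilde` is drawn uniformly over the shifts that stabilize the input, the distribution of `y_broken` stays shift-symmetric even though each individual sample is not.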

Figures (7)

  • Figure 1: Example applications requiring symmetry breaking. Top: A rotation-equivariant network for molecules cannot transform benzene into dichlorobenzene due to benzene's sixfold symmetry. Middle: A permutation-equivariant graph decoder cannot break latent space symmetries (see the appendix on the graph autoencoder experiment). Bottom: A rotation-equivariant network for point clouds cannot transform a table into a chair, as the table's legs are rotationally indistinguishable, unlike the chair's.
  • Figure 2: Example of how a group acts on distributions. Top: An element $g \in SO(2)$, the group of two-dimensional rotations, can act on a unit vector in $\mathbb{R}^2$ by standard rotation. Middle: This induces an action of $g$ on a distribution (blue) over unit vectors (orange). Under this action, $g$ rotates the entire distribution. Bottom: When $g=60^\circ$ acts on a distribution which is already $60^\circ$ self-symmetric, the distribution remains unchanged, even though a vector sampled from the distribution has no $60^\circ$ self-symmetry.
  • Figure 3: When the input $x$ is rotated by $30^\circ$, as shown from the top to the bottom row, the equivariant conditional distribution ${\mathbb{P}\left(Y|X=x\right)}$ (middle column) also rotates by $30^\circ$. The distribution thus has the same self-symmetry as $x$, which is sixfold rotational symmetry. However, individual samples from ${\mathbb{P}\left(Y|X=x\right)}$ are free to break this self-symmetry, as shown in the rightmost column.
  • Figure 4: Illustration of our symmetry-breaking method. Here, a die indicates randomness, which is used in the canonicalization method (shown in the dotted box) to sample $\tilde{g}$. (Optionally, a random variable $\epsilon$ can also be input to the equivariant network $f$, to capture randomness unrelated to symmetry-breaking; we do not include this variable in our experiments.) Ultimately, the input $x$ and the sampled group element $\tilde{g}$ are input to an equivariant network $f$ as $f(x,\tilde{g})$.
  • Figure 5: Phase diagrams predicted by the different methods. For each configuration predicted by the neural network on a test set Hamiltonian, we compute the values of the three order parameters, which correspond to the ferromagnetic phase (red), the antiferromagnetic phase (green), and the stripes phase (blue). Brighter colors indicate larger values of the order parameter, and black indicates the disordered phase.
  • ...and 2 more figures
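
The point made in Figures 2 and 3, that a distribution can be self-symmetric while its samples are not, can be checked numerically. The sketch below is only an illustration under stated assumptions: it replaces the distributions drawn in the figures with a uniform distribution over six unit vectors spaced $60^\circ$ apart, and the helper names `rotation` and `same_point_set` are hypothetical.

```python
import numpy as np

def rotation(theta_deg):
    """2x2 matrix rotating the plane counterclockwise by theta_deg degrees."""
    t = np.deg2rad(theta_deg)
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

def same_point_set(a, b, tol=1e-9):
    """True if every point of a has a match in b and vice versa."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return bool((d.min(axis=1) < tol).all() and (d.min(axis=0) < tol).all())

# Support of a uniform distribution over six unit vectors, 60 degrees apart.
angles = np.deg2rad(np.arange(6) * 60.0)
support = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # shape (6, 2)

R = rotation(60.0)
rotated_support = support @ R.T  # rotate every support point by 60 degrees

# With uniform weights, the distribution is unchanged iff the support set is.
distribution_unchanged = same_point_set(support, rotated_support)

# A single sample, however, is moved by the same rotation.
rng = np.random.default_rng(0)
sample = support[rng.integers(6)]
sample_unchanged = bool(np.allclose(R @ sample, sample))
```

Here `distribution_unchanged` is true and `sample_unchanged` is false: the rotation permutes the six support points among themselves but fixes none of them.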

Theorems & Definitions (26)

  • Definition 3.1: Canonicalization function
  • Definition 3.2: Inversion kernel
  • Example 3.3
  • Theorem 3.4
  • Proposition 5.1: Noise injection
  • Theorem 6.1
  • Remark 6.2
  • Theorem A.1: Randomized canonicalization
  • Definition A.2
  • Corollary A.3: Relaxed equivariance
  • ...and 16 more