Equivariance via Minimal Frame Averaging for More Symmetries and Efficiency

Yuchao Lin, Jacob Helwig, Shurui Gui, Shuiwang Ji

TL;DR

This work addresses the challenge of encoding symmetries in ML models with exact equivariance while maintaining computational efficiency. It introduces Minimal Frame Averaging (MFA), a theory that constructs provably minimal frames yielding exact $G$-equivariance, and extends to broad groups including the Lorentz group $O(1,d-1)$ and the unitary group $U(d)$ via a generalized QR framework. The paper develops induced-$G$-set canonicalization, generalized QR decomposition, and a spectrum of minimal frames for linear algebraic and permutation groups, with extensive experiments across $n$-body dynamics, collider top tagging, OC20 energy prediction, graph separation, and convex hull problems. MFA demonstrates exact equivariance with a single backbone call per forward pass and outperforms sampling-based frame averaging in invariance quality and efficiency, offering a versatile tool for symmetry-aware ML on unstructured data and beyond.

Abstract

We consider achieving equivariance in machine learning systems via frame averaging. Current frame averaging methods involve a costly sum over large frames or rely on sampling-based approaches that only yield approximate equivariance. Here, we propose Minimal Frame Averaging (MFA), a mathematical framework for constructing provably minimal frames that are exactly equivariant. The general foundations of MFA also allow us to extend frame averaging to more groups than previously considered, including the Lorentz group for describing symmetries in space-time, and the unitary group for complex-valued domains. Results demonstrate the efficiency and effectiveness of encoding symmetries via MFA across a diverse range of tasks, including $n$-body simulation, top tagging in collider physics, and relaxed energy prediction. Our code is available at https://github.com/divelab/MFA.
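The idea of a frame of size one can be illustrated with a toy sketch for $\mathrm O(d)$, assuming generic input in which the first $d$ points are linearly independent. The sketch uses classical Gram-Schmidt (the textbook QR construction), not the paper's generalized QR algorithm, and it does not handle the degenerate cases that MFA addresses. The names `gram_schmidt_frame`, `backbone`, `equivariant_forward`, and `apply` are hypothetical and do not come from the MFA code base.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt_frame(points):
    """Orthonormalize the first d points (modified Gram-Schmidt).

    Returns (basis, coords): basis[i] is the unit vector q_i, and
    coords[j][i] = <q_i, x_j> is point j expressed in that basis.
    Assumption: the first d points are linearly independent.
    """
    d = len(points[0])
    basis = []
    for x in points[:d]:
        v = list(x)
        for q in basis:
            c = dot(q, v)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(dot(v, v))
        if norm < 1e-9:
            raise ValueError("first d points are (nearly) linearly dependent")
        basis.append([vi / norm for vi in v])
    coords = [[dot(q, x) for q in basis] for x in points]
    return basis, coords


def backbone(coords):
    """Arbitrary non-equivariant map from canonical coordinates to R^d."""
    d = len(coords[0])
    return [sum(math.tanh(c[i] + 0.5 * c[(i + 1) % d] ** 2) for c in coords)
            for i in range(d)]


def equivariant_forward(points):
    """Canonicalize, call the backbone once, then map the output back."""
    basis, coords = gram_schmidt_frame(points)
    h = backbone(coords)  # the only backbone call
    d = len(basis)
    return [sum(h[i] * basis[i][k] for i in range(d)) for k in range(d)]


def apply(g, x):
    """Apply a matrix g (given as a list of rows) to a vector x."""
    return [dot(row, x) for row in g]


rng = random.Random(0)
points = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(5)]
# Orthonormalizing random vectors gives a random orthogonal matrix g.
g, _ = gram_schmidt_frame([[rng.gauss(0, 1) for _ in range(3)] for _ in range(3)])
moved = [apply(g, x) for x in points]

out_moved = equivariant_forward(moved)
out_then_moved = apply(g, equivariant_forward(points))
gap = max(abs(a - b) for a, b in zip(out_moved, out_then_moved))
print("equivariance gap:", gap)
```

Because Gram-Schmidt commutes with orthogonal maps, the canonical coordinates are unchanged when the input is transformed, so the backbone sees the same input and only the basis rotates. This is why a single backbone call suffices in this simplified setting.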

Paper Structure

This paper contains 81 sections, 23 theorems, 94 equations, 4 figures, 10 tables, 3 algorithms.

Key Result

Lemma 3.1

Given a frame $\mathcal{F}:\mathcal{S}\rightarrow \mathcal{P}(G) \setminus \{\emptyset\}$, for all $x\in \mathcal{S}$, there exists $x_0\in \text{Orb}_G(x)$ such that $\text{Stab}_G(x_0)\subseteq \mathcal{F} (x_0)$.
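A short argument for why such an $x_0$ exists can be sketched as follows, assuming the frame is $G$-equivariant in the sense $\mathcal{F}(g\cdot x) = g\,\mathcal{F}(x)$ (the property named in Definition 2.1). This is an illustrative derivation and may differ from the proof given in the paper.

```latex
\begin{align*}
&\text{Pick any } g \in \mathcal{F}(x) \ (\text{possible since } \mathcal{F}(x) \neq \emptyset)
  \text{ and set } x_0 = g^{-1}\cdot x \in \text{Orb}_G(x). \\
&\text{By equivariance, } \mathcal{F}(x_0) = g^{-1}\mathcal{F}(x) \ni g^{-1}g = e. \\
&\text{For any } h \in \text{Stab}_G(x_0):\quad
  \mathcal{F}(x_0) = \mathcal{F}(h\cdot x_0) = h\,\mathcal{F}(x_0) \ni h\,e = h. \\
&\text{Hence } \text{Stab}_G(x_0) \subseteq \mathcal{F}(x_0),
  \text{ and in particular } |\mathcal{F}(x_0)| \ge |\text{Stab}_G(x_0)|.
\end{align*}
```

Read this way, the stabilizer size is a lower bound on the frame size at $x_0$, which is the bound a minimal frame aims to attain.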

Figures (4)

  • Figure 1: Equivariance tests on the groups $\mathrm O(3), \mathrm O(5), \mathrm{SO}(3), \mathrm{SO}(5), \mathrm E(3), \mathrm E(5), \mathrm{SE}(3)$, and $\mathrm{SE}(5)$. Panels (a) and (b) show the results for $\mathrm O(3)$ and $\mathrm O(5)$, (c) and (d) for $\mathrm{SO}(3)$ and $\mathrm{SO}(5)$, (e) and (f) for $\mathrm{E}(3)$ and $\mathrm{E}(5)$, and (g) and (h) for $\mathrm{SE}(3)$ and $\mathrm{SE}(5)$. Data in the left column is randomly sampled with shape $100\times 3$, and data in the right column is randomly sampled with shape $20\times 5$. All panels show the effectiveness of both our frame averaging method and the original frame averaging method (Puny et al., 2021).
  • Figure 2: Equivariance tests on the groups $\mathrm O(5), \mathrm{SO}(5), \mathrm{E}(5), \mathrm{SE}(5)$ with degenerate singular values. All data is randomly sampled with shape $20\times 5$ and has $3$ repeated singular values. The original frame averaging method (Puny et al., 2021) fails in the degenerate cases because the repeated eigenvalues make the frame size infinite, whereas our method is unaffected by the repeated eigenvalues and our minimal frame averaging remains equivariant to these groups.
  • Figure 3: Equivariance tests on the groups $\mathrm O(1,3), \mathrm{SO}(1,3), \mathrm{GL}(3,\mathbb{R}), \mathrm{SL}(3,\mathbb{R})$. Because the original frame averaging method (Puny et al., 2021) does not give a frame construction for these groups, we only compare with the model without our method. The data for $\mathrm O(1,3)$ and $\mathrm{SO}(1,3)$ is randomly sampled with shape $100\times 4$, and the data for $\mathrm{GL}(3,\mathbb{R})$ and $\mathrm{SL}(3,\mathbb{R})$ is randomly sampled with shape $100\times 3$. In all panels the error of our method is below $10^{-3}$, showing that our method is indeed equivariant with respect to these groups.
  • Figure 4: Equivariance tests on the groups $\mathrm U(3), \mathrm{SU}(3), \mathrm{S}_n, \mathrm{S}_n \times \mathrm{O}(3), \mathrm{S}_n \times \mathrm{O}(1,3)$. In (d), the first two rows correspond to the error test for $\mathrm{S}_n \times \mathrm{O}(3)$ and the last two rows to that for $\mathrm{S}_n \times \mathrm{O}(1,3)$. Because the original frame averaging method (Puny et al., 2021) does not give a frame construction for these groups, we only compare with the model without our method. The data for $\mathrm U(3)$ and $\mathrm{SU}(3)$ is randomly sampled with shape $100\times 3$, the data for $\mathrm{S}_n\times \mathrm{O}(3)$ with shape $32\times 3$, and the data for $\mathrm{S}_n\times \mathrm{O}(1, 3)$ with shape $32\times 4$. Note that the networks used for these groups differ from those used for the previous groups, to accommodate the properties of these groups. The MLP models used for both $\mathrm{U}(3)$ and $\mathrm{SU}(3)$ are complex-valued networks. The MLP models used for $\mathrm{S}_n\times\mathrm{O}(3)$ and $\mathrm{S}_n\times\mathrm{O}(1,3)$ transform both the node dimension and the feature dimension, and are neither $\mathrm{S}_n\times \mathrm{O}(3)$- nor $\mathrm{S}_n\times\mathrm{O}(1,3)$-equivariant. Specifically, the $\mathrm{S}_n\times \mathrm{O}(3)$- or $\mathrm{S}_n\times\mathrm{O}(1,3)$-equivariant frame is created by applying an $\mathrm{S}_n$-equivariant and $\mathrm{O}(3)$- or $\mathrm{O}(1,3)$-invariant frame to the MLP equipped with an $\mathrm{O}(3)$- or $\mathrm{O}(1,3)$-equivariant frame, corresponding to Section \ref{sec:pcloud}.
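The kind of comparison described in these captions (a model with a frame versus the same model without one) can be sketched as an equivariance-error measurement. The example below is a 2D toy for $\mathrm{SO}(2)$, chosen only for brevity; it is not one of the groups in the figures and is not the paper's test harness. The canonicalization rotates the first point onto the positive $x$-axis, which assumes that point is not the origin, and all function names are hypothetical.

```python
import math
import random


def rotate(theta, p):
    """Rotate a 2D point p counter-clockwise by angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])


def backbone(points):
    """Arbitrary non-equivariant map: point cloud -> one 2D vector."""
    return (sum(math.tanh(x + 0.3 * y * y) for x, y in points),
            sum(math.sin(x * y) + y for x, y in points))


def framed(points):
    """Wrap the backbone with a size-one frame for SO(2).

    Assumption: the first point is not the origin, so its polar
    angle is well defined.
    """
    theta = math.atan2(points[0][1], points[0][0])
    canonical = [rotate(-theta, p) for p in points]
    return rotate(theta, backbone(canonical))


def equivariance_error(f, points, theta):
    """Norm of f(g.X) - g.f(X) for the rotation g by angle theta."""
    lhs = f([rotate(theta, p) for p in points])
    rhs = rotate(theta, f(points))
    return math.hypot(lhs[0] - rhs[0], lhs[1] - rhs[1])


def max_error(f, trials=100, n=20, seed=0):
    """Worst equivariance error over random clouds and random rotations."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        pts = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        theta = rng.uniform(0, 2 * math.pi)
        worst = max(worst, equivariance_error(f, pts, theta))
    return worst


print("without frame:", max_error(backbone))
print("with frame:   ", max_error(framed))
```

In this toy setting the framed model's error should sit at floating-point round-off level, while the plain backbone's error is of the order of its output magnitude.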

Theorems & Definitions (42)

  • Definition 2.1: $G$-Equivariant Frame (Puny et al., 2021)
  • Definition 3.1: Minimal Frame
  • Lemma 3.1
  • Theorem 3.2
  • Definition 3.3: Canonical Form
  • Definition 3.4: Induced $G$-set
  • Theorem 3.5
  • Theorem 4.1
  • Theorem 4.2
  • Theorem 5.1 (Puny et al., 2021)
  • ...and 32 more