A Canonicalization Perspective on Invariant and Equivariant Learning

George Ma, Yifei Wang, Derek Lim, Stefanie Jegelka, Yisen Wang

TL;DR

This paper introduces a canonicalization perspective that gives an essential and complete view of the design of frames. The results suggest that canonicalization provides a fundamental understanding of existing frame-averaging methods and unifies existing equivariant and invariant learning methods.

Abstract

In many applications, we desire neural networks to exhibit invariance or equivariance to certain groups due to symmetries inherent in the data. Recently, frame-averaging methods have emerged as a unified framework for attaining symmetries efficiently by averaging over input-dependent subsets of the group, i.e., frames. What we currently lack is a principled understanding of the design of frames. In this work, we introduce a canonicalization perspective that provides an essential and complete view of the design of frames. Canonicalization is a classic approach for attaining invariance by mapping inputs to their canonical forms. We show that there exists an inherent connection between frames and canonical forms. Leveraging this connection, we can efficiently compare the complexity of frames as well as determine the optimality of certain frames. Guided by this principle, we design novel frames for eigenvectors that are strictly superior to existing methods -- some are even optimal -- both theoretically and empirically. The reduction to the canonicalization perspective further uncovers equivalences between previous methods. These observations suggest that canonicalization provides a fundamental understanding of existing frame-averaging methods and unifies existing equivariant and invariant learning methods. Code is available at https://github.com/PKU-ML/canonicalization.
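The connection between frames and canonical forms described in the abstract can be illustrated on a toy case. The sketch below is not taken from the paper or its repository: it assumes the permutation group acting on a 1-D array, takes the sorted array as the canonical form, and uses hypothetical helper names (`frame`, `frame_average`, `canonical_average`, `phi`). When the input has repeated entries, several permutations sort it, so the frame has more than one element, yet all of them lead to the same canonical form and the backbone only needs to be evaluated once.

```python
import itertools
import numpy as np

def frame(x):
    """Toy frame (assumption): all permutations that bring x to sorted order.

    Ties in x give several such permutations, so the frame can be larger than 1.
    """
    target = np.sort(x)
    return [p for p in itertools.permutations(range(len(x)))
            if np.array_equal(x[list(p)], target)]

def frame_average(phi, x):
    """Average the backbone over every frame element applied to x."""
    outs = [phi(x[list(p)]) for p in frame(x)]
    return np.mean(outs, axis=0)

def canonical_average(phi, x):
    """Evaluate the backbone once on the canonical form (the sorted array)."""
    return phi(np.sort(x))

def phi(z):
    """An arbitrary backbone that is NOT permutation invariant by itself."""
    w = np.arange(1, len(z) + 1)
    return float(np.dot(w, z))

x = np.array([2.0, 1.0, 2.0])
print(len(frame(x)))                 # number of frame elements for this input
print(frame_average(phi, x))         # averaged over the frame
print(canonical_average(phi, x))     # single evaluation on the canonical form
```

In this toy setting the two averages agree for every input, while the canonicalization route calls `phi` once instead of once per frame element. Enumerating all permutations is only viable for very short arrays and is used here purely for clarity.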

Paper Structure (58 sections, 21 theorems, 38 equations, 2 figures, 11 tables, 8 algorithms)

Key Result

Theorem 3.1

For any frame $\mathcal{F}$ there exists an orbit canonicalization $\mathcal{C}_\mathcal{F}$ s.t. for all $X\in V$, $g\in G$, and backbone $\phi$, we have $\varPhi_\mathrm{FA}(X;\mathcal{F},\phi)=\varPhi_\mathrm{CA}(X;\mathcal{C}_\mathcal{F},\phi)$ and … In turn, for any orbit canonicalization $\mathcal{C}$ there exists a frame $\mathcal{F}_\mathcal{C}$ s.t. for all $X\in V$, $g\in G$, and backbone …

Figures (2)

  • Figure 2: Using PCA-frame methods to achieve orthogonal equivariance, described as a neural network model $f({\bm{X}})=h({\bm{X}}{\bm{R}}_{\bm{X}}){\bm{R}}_{\bm{X}}^\top$, where ${\bm{R}}_{\bm{X}}$ is a choice of principal components for the point cloud ${\bm{X}}\in{\mathbb{R}}^{n\times k}$. We first transform ${\bm{X}}$ via ${\bm{R}}_{\bm{X}}$ into an orientation that is unique up to sign flips, then process ${\bm{X}}{\bm{R}}_{\bm{X}}$ using a network $h$, and finally reintegrate orientation information back into the output via ${\bm{R}}_{\bm{X}}^\top$. Figure reproduced with permission from sign-equivariant.
  • Figure 3: The training time and test MSE of models in the $n$-body problem. Results are averaged over 4 runs with different seeds.
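The PCA-frame construction in the Figure 2 caption, $f(X)=h(XR_X)R_X^\top$, can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: $R_X$ is taken from the eigenvectors of $X^\top X$ (no centering), the eigenvalues are assumed distinct, the remaining sign ambiguity is resolved by averaging over all $2^k$ sign flips, and the names `pca_frame`, `frame_averaged`, and `h` are hypothetical.

```python
import itertools
import numpy as np

def pca_frame(X):
    """Return all sign-flipped choices of principal directions for X (n x k).

    Assumes X^T X has distinct eigenvalues, so eigenvectors are unique up to sign.
    """
    _, R = np.linalg.eigh(X.T @ X)           # columns of R are eigenvectors
    k = X.shape[1]
    # Multiplying R by a +-1 vector flips the sign of the matching columns.
    return [R * np.array(s) for s in itertools.product([1.0, -1.0], repeat=k)]

def frame_averaged(h, X):
    """Average h(X R) R^T over the PCA frame to obtain orthogonal equivariance."""
    outs = [h(X @ R) @ R.T for R in pca_frame(X)]
    return np.mean(outs, axis=0)

# A fixed, arbitrary backbone with no built-in equivariance (illustration only).
_rng = np.random.default_rng(0)
_W = _rng.normal(size=(3, 3))

def h(Z):
    return np.tanh(Z @ _W)

X = _rng.normal(size=(6, 3))                  # toy point cloud, n=6, k=3
Q, _ = np.linalg.qr(_rng.normal(size=(3, 3)))  # a random orthogonal matrix
lhs = frame_averaged(h, X @ Q)                # transform the input, then apply f
rhs = frame_averaged(h, X) @ Q                # apply f, then transform the output
print(np.allclose(lhs, rhs, atol=1e-6))
```

Averaging over all $2^k$ sign flips costs $2^k$ backbone evaluations per input, which is the kind of frame complexity that the canonicalization perspective is meant to compare and reduce.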

Theorems & Definitions (40)

  • Theorem 3.1
  • Theorem 3.2
  • Theorem 3.3
  • Theorem 3.4
  • Corollary 3.5
  • Theorem 4.1
  • Theorem 4.2
  • Theorem 4.3
  • Corollary 4.4
  • Theorem B.1
  • ...and 30 more