A Generative Model of Symmetry Transformations

James Urquhart Allingham, Bruno Kacper Mlodozeniec, Shreyas Padhy, Javier Antorán, David Krueger, Richard E. Turner, Eric Nalisnick, José Miguel Hernández-Lobato

TL;DR

The paper introduces the Symmetry-aware Generative Model (SGM), a two-stage generative framework that separates an invariant prototype $\hat{{\mathbf{x}}}$ from an equivariant transformation latent ${\bm{\upeta}}$ so that ${\mathbf{x}} = {\mathcal{T}}_{\bm{\upeta}}(\hat{{\mathbf{x}}})$. By modeling a group of transformations and learning $p({\bm{\upeta}}|\hat{{\mathbf{x}}})$ alongside a self-supervised invariant mapping $f_{\omega}$, the method captures the distribution of naturally occurring symmetries without requiring a distribution over prototypes. The approach yields interpretable symmetry representations, enables natural data augmentation, and improves marginal log-likelihoods and data efficiency when integrated with VAEs, as demonstrated on datasets with affine and color transformations (e.g., dSprites, MNIST, GalaxyMNIST). However, it requires specifying a superset of possible symmetries and exhibits limitations in boundary-content datasets, motivating future work to relax symmetry sets and handle boundary effects. Overall, SGM offers a principled, group-theoretic avenue for uncovering and leveraging data symmetries in generative modeling and beyond.

Abstract

Correctly capturing the symmetry transformations of data can lead to efficient models with strong generalization capabilities, though methods incorporating symmetries often require prior knowledge. While recent advancements have been made in learning those symmetries directly from the dataset, most of this work has focused on the discriminative setting. In this paper, we take inspiration from group theoretic ideas to construct a generative model that explicitly aims to capture the data's approximate symmetries. This results in a model that, given a prespecified but broad set of possible symmetries, learns to what extent, if at all, those symmetries are actually present. Our model can be seen as a generative process for data augmentation. We provide a simple algorithm for learning our generative model and empirically demonstrate its ability to capture symmetries under affine and color transformations, in an interpretable way. Combining our symmetry model with standard generative models results in higher marginal test-log-likelihoods and improved data efficiency.
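
As a rough illustration of the "generative process for data augmentation" reading, the sketch below draws a transformation parameter and applies it to a fixed prototype, mirroring ${\mathbf{x}} = {\mathcal{T}}_{\bm{\upeta}}(\hat{{\mathbf{x}}})$. Everything concrete here is an assumption made for illustration and is not taken from the paper: the prototype is a small 2D point set, the transformation is a planar rotation, and a Gaussian over the angle stands in for the learned $p({\bm{\upeta}}|\hat{{\mathbf{x}}})$. The function names `rotate` and `sample_augmentations` are hypothetical.

```python
import numpy as np


def rotate(points, eta):
    """Apply T_eta: rotate an (N, 2) array of points by angle eta (radians)."""
    c, s = np.cos(eta), np.sin(eta)
    rot = np.array([[c, -s], [s, c]])
    return points @ rot.T


def sample_augmentations(prototype, mean, std, num_samples, rng):
    """Draw eta ~ N(mean, std^2), an assumed stand-in for p(eta | x_hat),
    and return the sampled angles together with T_eta(x_hat) for each."""
    etas = rng.normal(mean, std, size=num_samples)
    samples = np.stack([rotate(prototype, eta) for eta in etas])
    return etas, samples


rng = np.random.default_rng(0)

# Hypothetical prototype x_hat: three points in the plane.
prototype = np.array([[1.0, 0.0], [0.0, 2.0], [-1.0, 0.0]])

# Five "augmentations" of the prototype, angles spread around 0 with std 30 degrees.
etas, samples = sample_augmentations(prototype, 0.0, np.pi / 6, 5, rng)

# Because rotations form a group, T_eta is invertible: applying the inverse
# rotation to an observation recovers the prototype.
recovered = rotate(samples[0], -etas[0])
```

The last line reflects the relation $\hat{{\mathbf{x}}} = {\mathcal{T}}_{\bm{\upeta}}^{-1}({\mathbf{x}})$: once the transformation parameter is known, the prototype follows deterministically from the observation.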

Paper Structure (69 sections, 14 equations, 23 figures, 1 algorithm)

Figures (23)

  • Figure 1: SGM graphical model. The implicit edges denote that $\hat{{\mathbf{x}}}$ is fully specified by ${\bm{\upeta}}$ and ${\mathbf{x}}$, since $\hat{{\mathbf{x}}} = {\mathcal{T}}_{\bm{\upeta}}^{-1}({\mathbf{x}})$, and thus only ${\bm{\upeta}}$ needs to be inferred given an observation ${\mathbf{x}}$.
  • Figure 2: Orbits due to horizontal shift transformations. Each point $(x_1, x_2)$ is transformed via ${\mathcal{T}}_\eta: (x_1, x_2) \mapsto (x_1, x_2) + (\eta, 0)$. Thus, horizontal lines form disjoint orbits in which any point can be transformed into any other point on the same line but not on another line. For each line, we can choose an arbitrary prototype from which all other points on the line can be reached via ${\mathcal{T}}_\eta$.
  • Figure 3: Self-supervised symmetry learning. We encourage $f_{\bm{\upomega}}({\mathbf{x}})$ to be equivariant by mapping ${\bm{x}}$ and a randomly transformed ${\bm{x}}$ to the same prototype $\hat{{\bm{x}}}$. Gray text shows examples for each variable in the graph. Note that $\hat{{\bm{x}}}$ and ${\bm{x}}_\text{rnd}$ may not appear in the dataset; see \ref{fig:sym_gen_model}.
  • Figure 4: Idealized examples of simple and flexible learned distributions over angles $p_{\bm{\uppsi}}({\upeta}\,|\,\hat{{\mathbf{x}}})$, given the true distribution $p({\upeta}\,|\,\hat{{\mathbf{x}}})=\sum_{{\mathbf{x}} \in \{\textcolor{Color1}{\rotatebox[origin=c]{30}{8}}, \ldots, \textcolor{Color0}{\rotatebox[origin=c]{0}{8}}, \ldots, \textcolor{Color2}{\rotatebox[origin=c]{-30}{8}}\}} p({\upeta}\,|\,{\mathbf{x}},\,\hat{{\mathbf{x}}})$.
  • Figure 5: Examples of learned distributions over angles $p_{\bm{\uppsi}}(\cdot)$, with and without dependence on $\hat{{\mathbf{x}}}$, given the true distribution $p(\cdot)$.
  • ...and 18 more figures
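
The orbit picture in the Figure 2 caption and the consistency idea in the Figure 3 caption can be made concrete with a toy sketch. It uses the horizontal shift ${\mathcal{T}}_\eta: (x_1, x_2) \mapsto (x_1, x_2) + (\eta, 0)$ from the caption. The rest is assumed for illustration only: the prototype of each horizontal line is arbitrarily taken to be its point with $x_1 = 0$, and `infer_eta` is a hand-written stand-in for the learned $f_{\bm{\upomega}}$, not the paper's network or training procedure.

```python
import numpy as np


def shift(x, eta):
    """T_eta from the Figure 2 caption: (x1, x2) -> (x1 + eta, x2)."""
    return x + np.array([eta, 0.0])


def infer_eta(x):
    """Hand-written stand-in for f_omega: the shift that carries the
    (arbitrarily chosen) prototype (0, x2) onto x."""
    return x[0]


def prototype(x):
    """x_hat = T_eta^{-1}(x) with eta = infer_eta(x)."""
    return shift(x, -infer_eta(x))


rng = np.random.default_rng(0)
x = np.array([1.5, -0.7])

# Randomly transform x along its orbit, as in the self-supervised setup.
eta_rnd = rng.uniform(-3.0, 3.0)
x_rnd = shift(x, eta_rnd)

# Points on the same horizontal line share a prototype ...
same_orbit = np.allclose(prototype(x), prototype(x_rnd))

# ... while a point on a different line maps to a different prototype.
other = np.array([1.5, 0.4])
different_orbit = not np.allclose(prototype(x), prototype(other))

# Equivariance of the inference function: shifting the input by eta_rnd
# shifts the inferred parameter by the same amount.
equivariant = np.isclose(infer_eta(x_rnd), infer_eta(x) + eta_rnd)
```

In this toy setting the inferred parameter is equivariant and the resulting prototype is invariant by construction; in the paper's setting the analogous property is what the self-supervised objective encourages in the learned function.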