Explorations in Homeomorphic Variational Auto-Encoding

Luca Falorsi, Pim de Haan, Tim R. Davidson, Nicola De Cao, Maurice Weiler, Patrick Forré, Taco S. Cohen

TL;DR

The experiments show that choosing manifold-valued latent variables that match the topology of the latent data manifold is crucial to preserve the topological structure and learn a well-behaved latent space.

Abstract

The manifold hypothesis states that many kinds of high-dimensional data are concentrated near a low-dimensional manifold. If the topology of this data manifold is non-trivial, a continuous encoder network cannot embed it in a one-to-one manner without creating holes of low density in the latent space. This is at odds with the Gaussian prior assumption typically made in Variational Auto-Encoders (VAEs), because the density of a Gaussian concentrates near a blob-like manifold. In this paper we investigate the use of manifold-valued latent variables. Specifically, we focus on the important case of continuously differentiable symmetry groups (Lie groups), such as the group of 3D rotations $\operatorname{SO}(3)$. We show how a VAE with $\operatorname{SO}(3)$-valued latent variables can be constructed by extending the reparameterization trick to compact connected Lie groups. Our experiments show that choosing manifold-valued latent variables that match the topology of the latent data manifold is crucial to preserve the topological structure and learn a well-behaved latent space.
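As a rough illustration of what a reparameterized sample on $\operatorname{SO}(3)$ can look like, the sketch below draws Gaussian noise in $\mathbb{R}^3$, identifies it with an element of the Lie algebra, pushes it through the exponential map (Rodrigues formula), and left-multiplies by a mean rotation. This construction and every name in it (`hat`, `exp_so3`, `sample_so3`, `R_mu`, `sigma`) are assumptions made for illustration, not the authors' implementation. The sketch also leaves out the density of the resulting distribution, which a full VAE objective would need.

```python
import math
import random

def matmul(A, B):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def hat(v):
    """Map a vector in R^3 to the matching skew-symmetric matrix in so(3)."""
    x, y, z = v
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def exp_so3(v):
    """Exponential map so(3) -> SO(3) via the Rodrigues formula."""
    theta = math.sqrt(sum(c * c for c in v))
    K = hat(v)
    K2 = matmul(K, K)
    if theta < 1e-8:
        # Leading Taylor coefficients near the identity.
        a, b = 1.0, 0.5
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]

def sample_so3(R_mu, sigma, rng):
    """Reparameterized sample: noise -> Lie algebra -> group -> shift by R_mu."""
    eps = [rng.gauss(0.0, 1.0) for _ in range(3)]   # parameter-free noise
    v = [s * e for s, e in zip(sigma, eps)]         # scaled Lie-algebra element
    return matmul(R_mu, exp_so3(v))                 # left-multiply by the mean

rng = random.Random(0)
R_mu = exp_so3([0.0, 0.0, math.pi / 2])  # quarter turn about the z-axis
R = sample_so3(R_mu, [0.1, 0.1, 0.1], rng)
```

Because the randomness enters only through `eps`, the sample `R` is a deterministic function of `R_mu` and `sigma`. In an automatic-differentiation framework this is what would let gradients flow to the encoder outputs.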

Paper Structure

This paper contains 29 sections, 1 theorem, 35 equations, 8 figures, 4 tables.

Key Result

Theorem 1

Let $(\mathbb{R}^3,\lambda, \mathcal{B}[\mathbb{R}^3] )$ be the real space, provided with the Lebesgue measure on the Borel algebra on $\mathbb{R}^3$. Let $(\operatorname{SO}(3),\nu,\mathcal{B}[\operatorname{SO}(3)])$ be the group of 3-dimensional rotations, provided with the normalized Haar measure $\nu$

Figures (8)

  • Figure 1.1: An example of problems that arise in mapping manifolds not diffeomorphic to each other. Notice that in the illustrated example the 'holes' in the first manifold prevent a smooth mapping to the second.
  • Figure 1.1: Reconstructions of an $S^1$ trajectory in the Toy data set. The $\mathbb{R}^{64}$ elements are mapped to 3D by Principal Component Analysis. See Section \ref{sec:toy} for details.
  • Figure 1.2: Discontinuities in the latent space along an $S^1$ trajectory in the Toy data set. Shown is $\|f(x_{i+1})-f(x_i)\|^2$ for encoder $f$ along the trajectory. See Section \ref{sec:toy} for details.
  • Figure 2.1: Illustration of our extended reparameterization trick in comparison to the classic reparameterization trick.
  • Figure 4.1: The encoder infers the $R \in \operatorname{SO}(3)$ and Fourier modes $\hat{f}$ if working with multiple objects, otherwise $\hat{f}$ is a parameter. Shown is the commutative diagram between taking the Inverse Fourier Transform, rotating the result and taking the Fourier Transform, and acting with the group representation $W$ on $\hat{f}$. The decoder maps the transformed $\hat{f}'$ to pixels.
  • ...and 3 more figures

Theorems & Definitions (3)

  • Theorem 1
  • proof
  • proof