Manifold-Matching Autoencoders

Laurent Cheret, Vincent Létourneau, Isar Nejadgholi, Chris Drummond, Hussein Al Osman, Maia Fraser

Abstract

We study a simple unsupervised regularization scheme for autoencoders called Manifold-Matching (MMAE): we align the pairwise distances in the latent space to those of the input data space by minimizing mean squared error. Because alignment occurs on pairwise distances rather than coordinates, it can also be extended to a lower-dimensional representation of the data, adding flexibility to the method. We find that this regularization outperforms similar methods on metrics based on preservation of nearest-neighbor distances and persistent homology-based measures. We also observe that MMAE provides a scalable approximation of Multi-Dimensional Scaling (MDS).
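The regularizer described in the abstract can be illustrated with a minimal sketch. It assumes Euclidean distances and a plain mean over the distinct pairs in a batch; the abstract does not specify the distance, any normalization, or the weight of the term in the full objective, so those choices and the function names `pairwise_distances` and `mm_regularizer` are hypothetical. The `reference` argument stands for either the raw inputs or a lower-dimensional representation of them, since only distances are compared and the two spaces need not share a dimension.

```python
import math


def pairwise_distances(points):
    """Return the n x n matrix of Euclidean distances between the rows of `points`."""
    return [[math.dist(p, q) for q in points] for p in points]


def mm_regularizer(reference, latent):
    """Mean squared error between reference and latent pairwise distances.

    Assumption: the average runs over the n * (n - 1) / 2 distinct pairs of a
    batch; the source does not state which normalization it uses.
    """
    n = len(reference)
    if n != len(latent):
        raise ValueError("reference and latent must contain the same number of points")
    if n < 2:
        raise ValueError("at least two points are needed to form a pair")

    d_ref = pairwise_distances(reference)
    d_lat = pairwise_distances(latent)

    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += (d_ref[i][j] - d_lat[i][j]) ** 2
    return total / (n * (n - 1) / 2)


# Toy batch: three collinear 3D points and two candidate 2D latent codes.
reference = [[0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [6.0, 8.0, 0.0]]
latent_isometric = [[0.0, 0.0], [5.0, 0.0], [10.0, 0.0]]  # preserves every distance
latent_collapsed = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]   # maps every point to one code

penalty_isometric = mm_regularizer(reference, latent_isometric)
penalty_collapsed = mm_regularizer(reference, latent_collapsed)
```

In this toy batch the isometric code incurs no penalty while the collapsed code is penalized, which is the behavior the regularization is meant to encourage. In a real autoencoder this term would be added to the reconstruction loss and computed with a differentiable tensor library rather than Python lists.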

Paper Structure

This paper contains 33 sections, 2 theorems, 11 equations, 3 figures, 1 table.

Key Result

Theorem 2.1

for all homology dimensions $p \geq 0$, where $d_B$ is the bottleneck distance and $d_{GH}$ the Gromov-Hausdorff distance.

Figures (3)

  • Figure 1: Left: Overview of the current approach. The Manifold-Matching regularization (MM-reg) is added to the objective function of the standard AE, forming MMAEs. Top Right: 2D latent spaces of the Nested Spheres dataset [moor2021TAE], for a standard AE (Vanilla) with no MM-reg and 9 MMAE models that use different numbers of PCA components in their regularization ($1\rightarrow100$). Bottom Right: MMAE 2D latent spaces "copying" 2D embeddings from UMAP, t-SNE, and PCA across the MNIST, F-MNIST, and CIFAR10 datasets.
  • Figure 2: 2D latent spaces of synthetic shapes. Left: Quantitative metrics, averaged over 5 runs (optimized for metric $\text{KL}_{0.1}$). Right: 2D latent representations: a) Standard AE (Vanilla); b) MMAE; c) TopoAE; d) RTD-AE; e) GeomAE; f) GGAE; g) SPAE.
  • Figure 3: Training time versus batch size. MMAE scales similarly to the standard AE (Vanilla). RTD-AE is limited to a batch size of 80.

Theorems & Definitions (2)

  • Theorem 2.1: Stability [cohen2007stability, chazal2014structure]
  • Corollary 2.2: Distance Preservation Implies Topology Preservation