Canonical normalizing flows for manifold learning

Kyriakos Flouris, Ender Konukoglu

TL;DR

Canonical manifold learning flows (CMF) address latent-space degeneracy in manifold learning flows by regularizing the learning objective to promote a sparse and near-orthogonal canonical intrinsic basis. By minimizing the off-diagonal elements of the Riemannian metric tensor $G_{ij}$ with an $\ell_1$ penalty and integrating this with the standard manifold-flow likelihood and reconstruction losses, CMF encourages the network to use a compact set of latents that capture non-degenerate directions on the learned manifold. The approach yields improved density estimation and generation quality across simulated, image, and tabular data, evidenced by qualitative latent-basis analyses and quantitative metrics such as FID. While providing notable gains, CMF remains computationally intensive for high-dimensional data and relies on approximate gradient techniques for scalability, leaving room for future efficiency optimizations and intrinsic-dimension estimation.

Abstract

Manifold learning flows are a class of generative modelling techniques that assume a low-dimensional manifold description of the data. The embedding of such a manifold into the high-dimensional space of the data is achieved via learnable invertible transformations. Therefore, once the manifold is properly aligned via a reconstruction loss, the probability density is tractable on the manifold and maximum likelihood can be used to optimize the network parameters. Naturally, the lower-dimensional representation of the data requires an injective mapping. Recent approaches were able to enforce that the density aligns with the modelled manifold, while efficiently calculating the density volume-change term when embedding to the higher-dimensional space. However, unless the injective mapping is analytically predefined, the learned manifold is not necessarily an efficient representation of the data. Namely, the latent dimensions of such models frequently learn an entangled intrinsic basis, with degenerate information being stored in each dimension. Alternatively, if a locally orthogonal and/or sparse basis, here coined a canonical intrinsic basis, is learned, it can serve in learning a more compact latent space representation. Toward this end, we propose a canonical manifold learning flow method, where a novel optimization objective enforces the transformation matrix to have few prominent and non-degenerate basis functions. We demonstrate that by minimizing the $\ell_1$-norm of the off-diagonal manifold metric elements, we can achieve such a basis, which is simultaneously sparse and/or orthogonal. Canonical manifold flow yields a more efficient use of the latent space, automatically generating fewer prominent and distinct dimensions to represent data, and a better approximation of target distributions than other manifold flow methods in most experiments we conducted, resulting in lower FID scores.
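
As a rough illustration of the regularizer described above, the following JAX sketch computes the induced metric $G = J^\top J$ of a toy injective decoder and the $\ell_1$-norm of its off-diagonal entries. The decoder, the weight matrix W, and all function names are hypothetical stand-ins chosen for this example, not the authors' implementation. The likelihood and reconstruction terms that the method combines with this penalty are omitted.

```python
import jax
import jax.numpy as jnp


def decoder(z, W):
    """Hypothetical toy injective map g: R^d -> R^D (stand-in for a learned flow).

    u -> tanh(u) + u is strictly increasing elementwise, so the map is
    injective whenever W has full column rank.
    """
    u = W @ z
    return jnp.tanh(u) + u


def metric_tensor(z, W):
    """Induced metric G = J^T J, with J the (D, d) Jacobian of the decoder at z."""
    J = jax.jacfwd(decoder)(z, W)  # differentiates w.r.t. z (argument 0)
    return J.T @ J                 # shape (d, d)


def off_diagonal_l1(G):
    """l1-norm of the off-diagonal entries of G (the assumed penalty term)."""
    return jnp.sum(jnp.abs(G)) - jnp.sum(jnp.abs(jnp.diag(G)))


def prominence(G):
    """Diagonal entries G_kk, usable to rank latent dimensions (assumed convention)."""
    return jnp.diag(G)


if __name__ == "__main__":
    z = jnp.array([0.3, -0.7])

    # Columns of W are orthogonal and axis-aligned: the metric is diagonal.
    W_orth = jnp.eye(3, 2)
    print(off_diagonal_l1(metric_tensor(z, W_orth)))

    # Columns of W overlap: the latent directions are entangled.
    W_ent = jnp.array([[1.0, 1.0], [0.0, 1.0], [0.0, 0.0]])
    G_ent = metric_tensor(z, W_ent)
    print(off_diagonal_l1(G_ent), prominence(G_ent))
```

In a training loop, such a penalty would typically be averaged over a batch of latent points and added to the other loss terms with a weighting coefficient. The choice of that coefficient is not specified here and would be an additional assumption.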

Paper Structure (26 sections, 21 equations, 28 figures, 4 tables)

Figures (28)

  • Figure 7: Generated samples for increasing prominent dimensions used for sampling from top to bottom along the columns. The model is trained on Fashion-MNIST with $\mathbb{R}^{d=40}$, (a) RNF, (b) CMF. Rows represent different samples. The CMF captures the generations almost fully, even with the first prominent dimensions.
  • Figure 8: Generated samples for 10 different hierarchical subgroups of prominent dimensions $(\{1,2\},\{3,4\},\{5,6\} \dots, \{19,20\})$ along the columns from bottom to top, while setting all other dimensions to zero. Models were trained on Fashion-MNIST with $\mathbb{R}^{d=20}$ with (a) RNF, (b) CMF. Rows represent different samples. Sparse learning is evident for the CMF method.
  • Figure 9: Sampled manifold for the two most prominent dimensions as calculated from $G_{kk}$, (a) RNF, (b) CMF.
  • Figure 10: Out-of-distribution (OoD) detection with CMF, trained on Fashion-MNIST.
  • Figure : (a) Density plot for a fuzzy line learned with RNF, via Equation \ref{eq:rnflangragian}.
  • ...and 23 more figures

Theorems & Definitions (1)

  • Definition 4.1