A Geometric Unification of Generative AI with Manifold-Probabilistic Projection Models

Leah Bar, Liron Mor Yosef, Shai Zucker, Neta Shoham, Inbar Seroussi, Nir Sochen

TL;DR

This work unifies geometric and probabilistic viewpoints in generative imaging by modeling images as lying on a low-dimensional manifold and interpreting diffusion as a projection toward the manifold. It introduces the Manifold-Probabilistic Projection Model (MPPM) and its Latent variant (LMPPM), coupling a distance-to-manifold function with an autoencoder and kernel-based probability on ambient and latent spaces. A diffusion-like flow is guided by the score of the manifold-based probability, enabling iterative restoration and generation that respect both geometry and data distribution. Empirical results on MNIST and SCUT-FBP5500 show that LMPPM can outperform Latent Diffusion Models, particularly under severe distortions, thanks to operating in the latent space and leveraging a learned distance function. Key contributions include a concrete distance-to-manifold formulation, a kernel-based nonuniform probability model, and two training losses that enforce geometric consistency and probabilistic alignment across pixel and latent domains, with demonstrated improvements in image restoration and generation tasks.

Abstract

The foundational premise of generative AI for images is the assumption that images are inherently low-dimensional objects embedded within a high-dimensional space. Additionally, it is often implicitly assumed that thematic image datasets form smooth or piecewise smooth manifolds. Common approaches overlook the geometric structure and focus solely on probabilistic methods, approximating the probability distribution through universal approximation techniques such as the kernel method. In some generative models, the low-dimensional nature of the data manifests itself through the introduction of a lower-dimensional latent space. Yet the probability distribution in the latent or manifold coordinate space is considered uninteresting and is either predefined or assumed to be uniform. This study unifies the geometric and probabilistic perspectives by providing a geometric framework and a kernel-based probabilistic method simultaneously. The resulting framework demystifies diffusion models by interpreting them as a projection mechanism onto the manifold of "good images". This interpretation leads to the construction of a new deterministic model, the Manifold-Probabilistic Projection Model (MPPM), which operates in both the representation (pixel) space and the latent space. We demonstrate that the Latent MPPM (LMPPM) outperforms the Latent Diffusion Model (LDM) across various datasets, achieving superior results in terms of image restoration and generation.

Paper Structure

This paper contains 28 sections, 28 equations, 14 figures, 10 tables, 2 algorithms.

Figures (14)

  • Figure 1: Illustration of our manifold-aware restoration approach. The blue path shows direct projection onto manifold $\mathcal{M}$ using distance function $\mathcal{D}_\mathcal{M}(x)$, while the red-green path represents encoding-decoding through latent space $\mathbb{R}^d$ via functions $F$ and $G$. Ideally, both paths converge to the same manifold point, ensuring geometrically consistent restoration.
  • Figure 2: An illustration of the kernel approximation $P_{\text{ker}}(z)$ of the probability distribution $P(z)$ in the latent space.
  • Figure 3: The manifold $\mathcal{M}$ is illustrated as the curved line. $x_i^*$ is the closest point to $x$ on the manifold. $\bar{G}(x)$ is depicted as well and is not necessarily a point on the manifold.
  • Figure 4: The manifold $\mathcal{M}$ is the unit circle lying in the $xy$-plane and is parametrized by the azimuth angle $\theta$. It is sampled according to a normal distribution centered at $\theta_0$, indicated by the red line. The reconstruction trajectory is shown in dark red. Note that the final result of the iterations on $x$ does not converge to $x^*$, which is the closest point on the circle. Instead, it is influenced by the data distribution on the manifold through the effect of $\bar{G}(x)$.
  • Figure 5: Comparison between the DAE and our proposed MPPM, using the same setup as in Fig. 4. The error was computed as the deviation from the unit circle in 2D. In regions of the circle with lower probability density, the DAE is more prone to error than the proposed MPPM method.
  • ...and 9 more figures
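
To make the toy setting of Figures 2 and 4 concrete, here is a minimal sketch under stated assumptions. It is not the authors' implementation: it uses the closed-form distance to the unit circle in place of a learned $\mathcal{D}_\mathcal{M}(x)$, a plain Gaussian kernel estimate in the angle coordinate as a stand-in for $P_{\text{ker}}(z)$, and a purely geometric gradient flow that omits the autoencoder term $\bar{G}(x)$. All function names, the bandwidth, the step size, and the sample parameters are hypothetical choices made for illustration.

```python
import math
import random

def distance_to_circle(x, y):
    """Closed-form distance from a 2D point to the unit circle (assumed toy manifold)."""
    return abs(math.hypot(x, y) - 1.0)

def project_to_circle(x, y):
    """Closest point on the unit circle; undefined at the origin."""
    r = math.hypot(x, y)
    return x / r, y / r

def kernel_density(theta, samples, bandwidth=0.2):
    """Gaussian kernel estimate of the density at angle theta (angle wrap-around ignored)."""
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((theta - s) / bandwidth) ** 2) for s in samples)
    return total / (len(samples) * norm)

def projection_flow(x, y, step=0.5, n_iter=60):
    """Gradient descent on 0.5 * D(x)^2, whose gradient is (r - 1) * x / r with r = ||x||."""
    for _ in range(n_iter):
        r = math.hypot(x, y)
        scale = 1.0 - step * (r - 1.0) / r
        x, y = x * scale, y * scale
    return x, y

# Angles sampled from a normal distribution centered at theta0 (hypothetical parameters).
random.seed(0)
theta0 = 0.0
samples = [random.gauss(theta0, 0.3) for _ in range(500)]

# A point off the manifold is pulled back onto the circle by the flow.
x_noisy = (1.6, 0.8)
x_flow = projection_flow(*x_noisy)
```

Because this sketch uses only the geometric term, the flow ends at the closest point on the circle. Figure 4 describes the opposite behavior for the paper's model, whose result is shifted by the data distribution through $\bar{G}(x)$, so the kernel density here is shown only as the quantity that would supply that shift.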