A Geometric Unification of Generative AI with Manifold-Probabilistic Projection Models
Leah Bar, Liron Mor Yosef, Shai Zucker, Neta Shoham, Inbar Seroussi, Nir Sochen
TL;DR
This work unifies geometric and probabilistic viewpoints in generative imaging by modeling images as lying on a low-dimensional manifold and interpreting diffusion as a projection toward the manifold. It introduces the Manifold-Probabilistic Projection Model (MPPM) and its Latent variant (LMPPM), coupling a distance-to-manifold function with an autoencoder and kernel-based probability on ambient and latent spaces. A diffusion-like flow is guided by the score of the manifold-based probability, enabling iterative restoration and generation that respect both geometry and data distribution. Empirical results on MNIST and SCUT-FBP5500 show that LMPPM can outperform Latent Diffusion Models, particularly under severe distortions, thanks to operating in the latent space and leveraging a learned distance function. Key contributions include a concrete distance-to-manifold formulation, a kernel-based nonuniform probability model, and two training losses that enforce geometric consistency and probabilistic alignment across pixel and latent domains, with demonstrated improvements in image restoration and generation tasks.
Abstract
A foundational premise of generative AI for images is that images are inherently low-dimensional objects embedded in a high-dimensional space. It is also often implicitly assumed that thematic image datasets form smooth or piecewise-smooth manifolds. Common approaches overlook this geometric structure and focus solely on probabilistic methods, approximating the probability distribution through universal approximation techniques such as the kernel method. In some generative models, the low-dimensional nature of the data manifests itself through the introduction of a lower-dimensional latent space. Yet the probability distribution in the latent or manifold coordinate space is treated as uninteresting and is either predefined or assumed uniform. This study unifies the geometric and probabilistic perspectives by providing a geometric framework and a kernel-based probabilistic method simultaneously. The resulting framework demystifies diffusion models by interpreting them as a projection mechanism onto the manifold of "good images". This interpretation leads to the construction of a new deterministic model, the Manifold-Probabilistic Projection Model (MPPM), which operates in both the representation (pixel) space and the latent space. We demonstrate that the Latent MPPM (LMPPM) outperforms the Latent Diffusion Model (LDM) across various datasets, achieving superior results in image restoration and generation.
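To make the projection interpretation concrete, the following toy sketch may help. It is not the authors' implementation. It assumes a hypothetical manifold (the unit circle in the plane) whose distance function is known in closed form, whereas the paper learns the distance function alongside an autoencoder. The function names, the step size, the bandwidth, and the sample points are all illustrative assumptions. The flow repeatedly steps against the gradient of half the squared distance to the manifold and can optionally add the score of a Gaussian kernel density estimate built from samples on the manifold.

```python
import math


def grad_half_sq_distance(x, y, radius=1.0):
    """Gradient of 0.5 * d(x, y)^2, where d is the distance to a circle.

    The circle is a hypothetical stand-in for the learned image manifold.
    """
    r = math.hypot(x, y)
    if r == 0.0:
        return 0.0, 0.0  # direction is undefined at the centre; take no step
    scale = (r - radius) / r
    return scale * x, scale * y


def kernel_score(x, y, samples, bandwidth=0.5):
    """Score (gradient of the log-density) of a Gaussian kernel density estimate."""
    h2 = bandwidth ** 2
    weights = [
        math.exp(-((x - sx) ** 2 + (y - sy) ** 2) / (2.0 * h2))
        for sx, sy in samples
    ]
    total = sum(weights)
    if total == 0.0:
        return 0.0, 0.0  # all kernels underflowed; no usable direction
    gx = sum(w * (sx - x) for w, (sx, _) in zip(weights, samples)) / (total * h2)
    gy = sum(w * (sy - y) for w, (_, sy) in zip(weights, samples)) / (total * h2)
    return gx, gy


def projection_flow(x, y, samples, steps=60, step_size=0.5, score_weight=0.0):
    """Deterministic flow: descend the squared distance, optionally follow the score."""
    for _ in range(steps):
        dx, dy = grad_half_sq_distance(x, y)
        sx, sy = kernel_score(x, y, samples)
        x = x - step_size * dx + score_weight * sx
        y = y - step_size * dy + score_weight * sy
    return x, y


# Assumed "training data": nine points on the upper half of the circle.
samples = [(math.cos(k * math.pi / 8), math.sin(k * math.pi / 8)) for k in range(9)]

# Start from a "distorted" point off the manifold and project it back.
x_end, y_end = projection_flow(2.0, 1.5, samples)
final_distance = abs(math.hypot(x_end, y_end) - 1.0)
```

With `score_weight=0.0` the flow is a pure geometric projection onto the circle. Setting `score_weight` to a small positive value additionally pulls the point toward regions where the samples are dense, which is the role the abstract assigns to the kernel-based probability, at the cost of no longer landing exactly on the manifold in this toy setting.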
