Generative Assignment Flows for Representing and Learning Joint Distributions of Discrete Data
Bastian Boll, Daniel Gonzalez-Alvarado, Stefania Petra, Christoph Schnörr
TL;DR
This work tackles the challenge of modeling joint distributions over large discrete alphabets by embedding factorizing distributions onto a metrical manifold and transporting a simple reference measure through randomized assignment flows. The core idea is to represent any discrete distribution as an expectation of factorizing components via the embedding $T(W)$ and a learned flow on the assignment manifold, trained with simulation-free Riemannian flow matching on geodesics. Key contributions include an infinite-time transport that mitigates boundary issues, a principled way to couple many discrete variables through the submanifold geometry, and empirical demonstrations on class-scaling and image-segmentation tasks that show improved scalability and interpolation capability. The framework yields efficient sampling and likelihood evaluation for discrete data while leveraging information-geometric structure, making it suitable for structured prediction in large-class settings.
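To make the "expectation of factorizing components" concrete, the following minimal sketch builds a correlated joint distribution as a mixture of two factorizing ones. It assumes that $T(W)$ maps a row-stochastic matrix $W$ (one row per variable, one column per class) to the product distribution with entries $\prod_i W_{i,\gamma_i}$; this definition is not spelled out here. The function name `embed_factorizing` and the toy matrices are hypothetical, and the two-point mixture stands in for the learned transported measure.

```python
import itertools
import numpy as np

def embed_factorizing(W):
    """Assumed embedding T(W): joint distribution over all c**n labelings
    for a row-stochastic matrix W of shape (n, c)."""
    n, c = W.shape
    joint = np.empty((c,) * n)
    for labels in itertools.product(range(c), repeat=n):
        # Product of the per-variable probabilities of the chosen labels.
        joint[labels] = np.prod(W[np.arange(n), list(labels)])
    return joint

# Two factorizing components for n = 2 binary variables (toy values).
W_a = np.array([[1.0, 0.0], [1.0, 0.0]])  # both variables take label 0
W_b = np.array([[0.0, 1.0], [0.0, 1.0]])  # both variables take label 1

# Expectation over a two-point measure on W: a perfectly correlated joint.
mixture = 0.5 * embed_factorizing(W_a) + 0.5 * embed_factorizing(W_b)

# The mixture does not factorize: it differs from the product of its marginals.
marginal_0 = mixture.sum(axis=1)
marginal_1 = mixture.sum(axis=0)
product_of_marginals = np.outer(marginal_0, marginal_1)
```

Each component is a product distribution, yet their average places all mass on the two labelings in which both variables agree, which no single factorizing distribution can do. Enumerating all $c^n$ labelings is feasible only for toy sizes and serves purely as an illustration.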
Abstract
We introduce a novel generative model for representing joint probability distributions of a possibly large number of discrete random variables. The approach uses measure transport by randomized assignment flows on the statistical submanifold of factorizing distributions, which makes it possible to represent and efficiently sample from any target distribution and to assess the likelihood of unseen data points. The complexity of the target distribution depends only on the parametrization of the affinity function of the dynamical assignment flow system. Our model can be trained in a simulation-free manner by conditional Riemannian flow matching, using the training data encoded as geodesics on the assignment manifold in closed form, with respect to the e-connection of information geometry. Numerical experiments devoted to distributions of structured image labelings demonstrate the applicability to large-scale problems, which may include discrete distributions in other application areas. Performance measures show that our approach scales better with an increasing number of classes than recent related work.
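As an illustration of why geodesics of the e-connection admit a closed form, the sketch below evaluates the e-geodesic between two points of the open probability simplex. It relies on the standard fact that such geodesics are straight lines in logarithmic coordinates, followed by normalization. The function name `e_geodesic` and the example vectors are hypothetical, and how training labels are actually encoded as geodesics is not specified in this abstract.

```python
import numpy as np

def e_geodesic(p, q, t):
    """Point at time t on the e-geodesic from p (t = 0) to q (t = 1).

    Both p and q must lie in the open simplex (strictly positive entries).
    """
    # Linear interpolation in log coordinates.
    log_mix = (1.0 - t) * np.log(p) + t * np.log(q)
    log_mix -= log_mix.max()  # shift for numerical stability
    w = np.exp(log_mix)
    return w / w.sum()        # project back onto the simplex

p = np.array([1.0, 1.0, 1.0]) / 3.0   # barycenter of the simplex
q = np.array([0.7, 0.2, 0.1])         # toy target point
mid = e_geodesic(p, q, 0.5)
```

For a matrix with one simplex-valued row per variable, the same formula would be applied row by row. A one-hot label lies on the boundary of the simplex, where the logarithm is undefined, so such a point can only be approached in the limit along a curve of this kind.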
