Generative Assignment Flows for Representing and Learning Joint Distributions of Discrete Data

Bastian Boll; Daniel Gonzalez-Alvarado; Stefania Petra; Christoph Schnörr

TL;DR

This work addresses the problem of modeling joint distributions of many discrete random variables by embedding the factorizing distributions as a statistical submanifold and transporting a simple reference measure through randomized assignment flows. The core idea is to represent any discrete distribution as an expectation of factorizing components via the embedding $T(W)$ and a learned flow on the assignment manifold, trained with simulation-free Riemannian flow matching on geodesics. Key contributions include an infinite-time transport that mitigates boundary issues, a principled way to couple many discrete variables through the submanifold geometry, and empirical demonstrations on class-scaling and image-segmentation tasks that show improved scalability and interpolation capability. The framework yields efficient sampling and likelihood evaluation for discrete data while exploiting information-geometric structure, which makes it suitable for structured prediction with many classes.
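
The representation of a general distribution as an expectation of factorizing components can be illustrated with a small sketch. It assumes that the embedding takes the product form $T(W)_{\alpha} = \prod_{i} W_{i,\alpha_{i}}$, which this summary does not spell out; the helper names `embed` and `mixture` are hypothetical and chosen for illustration only.

```python
import itertools
import math


def embed(W):
    """ASSUMED form of the embedding T: map a row-stochastic n x c matrix W
    to the product distribution on all c**n joint label configurations."""
    n, c = len(W), len(W[0])
    return {
        alpha: math.prod(W[i][a] for i, a in enumerate(alpha))
        for alpha in itertools.product(range(c), repeat=n)
    }


def mixture(weights, components):
    """Convex combination of embedded factorizing distributions."""
    keys = components[0].keys()
    return {k: sum(w * comp[k] for w, comp in zip(weights, components)) for k in keys}


# Two variables with two labels each, hence N = 4 joint configurations.
W_a = [[1.0, 0.0], [1.0, 0.0]]  # vertex: both variables take label 0
W_b = [[0.0, 1.0], [0.0, 1.0]]  # vertex: both variables take label 1

# A perfectly correlated target. It is not a product distribution itself,
# but it is a convex combination of two embedded factorizing distributions.
p = mixture([0.5, 0.5], [embed(W_a), embed(W_b)])

# A generic factorizing distribution for comparison.
q = embed([[0.7, 0.3], [0.4, 0.6]])
```

Here `p` puts mass 0.5 on each of the configurations `(0, 0)` and `(1, 1)`, although both of its marginals are uniform, so no single product distribution could reproduce it.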

Abstract

We introduce a novel generative model for the representation of joint probability distributions of a possibly large number of discrete random variables. The approach uses measure transport by randomized assignment flows on the statistical submanifold of factorizing distributions, which makes it possible to represent and efficiently sample from any target distribution and to assess the likelihood of unseen data points. The complexity of the target distribution depends only on the parametrization of the affinity function of the dynamical assignment flow system. Our model can be trained in a simulation-free manner by conditional Riemannian flow matching, using the training data encoded as geodesics on the assignment manifold in closed form, with respect to the e-connection of information geometry. Numerical experiments devoted to distributions of structured image labelings demonstrate the applicability to large-scale problems, which may include discrete distributions in other application areas. Performance measures show that our approach scales better with an increasing number of classes than recent related work.
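
To make the closed-form geodesics concrete, the following minimal sketch evaluates an e-geodesic on a single probability simplex. It relies on the standard information-geometric fact that e-geodesics are straight lines in logarithmic coordinates, so that $\gamma(t) \propto p\, e^{t v}$. The function names and the choice of direction `v` are illustrative assumptions and do not reproduce the paper's exact parametrization of the training paths.

```python
import math


def softmax(x):
    """Numerically stable softmax of a list of floats."""
    m = max(x)
    e = [math.exp(xi - m) for xi in x]
    z = sum(e)
    return [ei / z for ei in e]


def e_geodesic(p, v, t):
    """Point at time t on the e-geodesic through p with direction v,
    i.e. softmax(log p + t * v)."""
    return softmax([math.log(pi) + t * vi for pi, vi in zip(p, v)])


c = 4
barycenter = [1.0 / c] * c

# Illustrative zero-sum direction pointing toward the vertex of label 0.
target = 0
v = [(1.0 if j == target else 0.0) - 1.0 / c for j in range(c)]

# The curve starts at the barycenter and approaches the vertex as t grows,
# but stays in the interior of the simplex for every finite t.
path = [e_geodesic(barycenter, v, t) for t in (0.0, 5.0, 20.0)]
```

Because the vertex is only reached in the limit $t\to\infty$, a data point encoded this way corresponds to a limit of the curve rather than to a point on it.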

Paper Structure

This paper contains 42 sections, 6 theorems, 108 equations, 10 figures, 1 table.

Introduction
Overview, Motivation
Related Work
Statistics
Own Prior Work
Machine Learning
Organization
Basic Notation, List of Main Symbols
Background
Assignment Flows
Meta-Simplex, Flow Embedding
Approach
Generative Model
Goal
Representation of General Distributions
...and 27 more sections

Key Result

Proposition 2.2

For every $W\in\mathcal{W}_{c}$, the distribution $T(W)\in\mathcal{S}_{N}$ has maximum entropy among all $p\in\mathcal{S}_{N}$ subject to the marginal constraint
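
The maximum-entropy property can be checked numerically in the smallest case. The sketch below assumes that the marginal constraint, whose formula is not reproduced in this summary, requires each node-wise marginal of $p$ to equal the corresponding row of $W$, and that $T(W)$ is the product distribution of these rows. It uses two binary variables and all names are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (natural logarithm) of a list of probabilities."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


# Rows of W: marginals of two binary variables (illustrative values).
W = [[0.7, 0.3], [0.4, 0.6]]

# ASSUMED embedding: product distribution over configurations
# ordered as (0,0), (0,1), (1,0), (1,1).
product = [W[0][a] * W[1][b] for a in range(2) for b in range(2)]


def perturb(p, eps):
    """Shift mass between configurations without changing either marginal:
    (0,0) and (1,1) gain eps, (0,1) and (1,0) lose eps."""
    return [p[0] + eps, p[1] - eps, p[2] - eps, p[3] + eps]


def marginals(p):
    """Marginals of the first and second variable."""
    return [p[0] + p[1], p[2] + p[3]], [p[0] + p[2], p[1] + p[3]]


h_product = entropy(product)
h_plus = entropy(perturb(product, 0.05))
h_minus = entropy(perturb(product, -0.05))
```

In this example both perturbed distributions keep the marginals of `product` but have strictly smaller entropy, which agrees with the statement under the stated assumptions.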

Figures (10)

Figure 1.1: (a) The simplex $\Delta_{N}$ \eqref{def:Delta-N}, for $N=4$, depicted in local coordinates, and the submanifold of factorizing discrete distributions which connects all extreme points of $\Delta_{4}$. (b) Visualization of 1000 samples from the target distribution $p(\alpha_{1},\alpha_{2})$ given by \eqref{eq:py1y2}, corresponding to the blue point $p\in\Delta_{4}$. Each sample corresponds to an integral curve of a flow which evolves on the submanifold and can be computed efficiently by geometric integration. The parametrized vector field of the dynamical system which generates the flow has been trained by matching the flow to geodesics on the submanifold which encode given training data. As a result, each component $p_{\alpha}$ of the target distribution corresponds to the relative frequency of integral curves converging to the vertex $e_{\alpha}$, such that the entire distribution $p$ is represented by the convex combination $\sum_{\alpha} p_{\alpha} e_{\alpha} = p$. In this way, the flow realizes the pushforward of a simple reference distribution, centered at $0$ in the tangent space at the barycenter (red point), to the discrete target distribution $p$. Figure 3.1 provides a more detailed illustration of the approach.
Figure 3.1: Overview of the approach: The standard Gaussian reference measure $\mathcal{N}(0,I)$ is pushed forward by the lifting map $\exp_{W}$ from the flat tangent product space $\mathcal{T}_{0}$ to the assignment manifold $\mathcal{W}_{c}$, and further to the meta-simplex $\mathcal{S}_{N}$ via the embedding map $T$ \eqref{eq:def-T-embedding}, by geometrically integrating the assignment flow equation \eqref{eq:AF-general}. Since the assignment flow converges to the extreme points of $\overline{\mathcal{W}_{c}}$, which after embedding agree with the extreme points of $\Delta_{N}=\overline{\mathcal{S}_{N}}$, an approximation $\widetilde{p}(\alpha)$ of a general discrete target measure $p(\alpha)$ can be learned in terms of a corresponding convex combination of extreme points. This is achieved by matching the generating assignment flow to the flow of e-geodesics which encode the given training samples, using an empirical expectation, and by learning the parameters of the affinity function $F_{\theta}$ \eqref{eq:def-affinity-function}. Since only factorizing distributions $T(W),\, W\in\mathcal{W}_{c}$, are required, the approach remains computationally feasible in high dimensions.
Figure 3.2: Influence of the parameter $\lambda$, which controls in \eqref{eq:tangent_gaussian_path} and \eqref{eq:condvectorfield-model}, respectively, the rate at which mass of the pushforward probability measure \eqref{eq:nu_cond_lifted_gauss} is assigned to a target label, depending on the number $c$ of labels (classes, categories).
Figure 3.3: Norms $\|v(s)\|$ of the tangent vectors $v(s) = \exp_{\mathbb{1}_{\mathcal{S}}}^{-1}(p(s))$ with $p(s)=(\frac{s-1}{s},\frac{1}{(c-1) s},\dotsc,\frac{1}{(c-1) s}) \to e_{1}\in\mathbb{R}^{c}$ as $s\to\infty$, for numbers of labels $c\in\{3,10,100,1000\}$. Since $\|e_{1}-p(s)\|=(\frac{c}{c-1})^{1/2}\frac{1}{s}\approx \frac{1}{s}$, the simplex $\Delta_{c}$ is covered, up to a very small distance to its boundary, by $\exp_{\mathbb{1}_{\mathcal{S}}}(B_{0}(r)) \subset \mathcal{S}_{c}$, that is, by the images of tangent vectors $v\in B_{0}(r)\subset T_{0}$ within a ball $B_{0}(r)$ centered at $0\in T_{0}$ with radius $r = 15$.
Figure 4.1: Relative entropy between learned models (histogram of 512k samples) and a known, factorizing target distribution on $n=4$ simplices with varying number of classes $c$. By leveraging information geometry and gradual decision-making over time, our proposed approach (red) outperforms both our earlier method [Boll:2024ab] and Dirichlet flow matching [Stark:2024aa] in terms of scaling to many classes $c$.
...and 5 more figures
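
The quantities in the caption of Figure 3.3 can be reproduced with a short sketch. The distance formula is taken from the caption. The tangent vector, however, is computed under the assumption that the lifting map at the barycenter acts as a softmax on zero-sum vectors, so that its inverse is the centered logarithm $v = \log p - \operatorname{mean}(\log p)$; this definition is not given in this summary and the function names are hypothetical.

```python
import math


def p_of_s(s, c):
    """The point p(s) from the caption, approaching the vertex e_1."""
    return [(s - 1.0) / s] + [1.0 / ((c - 1.0) * s)] * (c - 1)


def distance_to_vertex(s, c):
    """Euclidean distance between e_1 and p(s)."""
    p = p_of_s(s, c)
    e1 = [1.0] + [0.0] * (c - 1)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(e1, p)))


def tangent_norm(s, c):
    """Norm of v(s), ASSUMING the inverse lifting map at the barycenter
    is the centered logarithm v = log p - mean(log p)."""
    logp = [math.log(x) for x in p_of_s(s, c)]
    mean = sum(logp) / c
    return math.sqrt(sum((l - mean) ** 2 for l in logp))


# Tangent norms for the numbers of labels used in the figure, at one value of s.
norms = {c: tangent_norm(100.0, c) for c in (3, 10, 100, 1000)}
```

Under this assumption the norm equals $\log((s-1)(c-1))\,\sqrt{(c-1)/c}$ and therefore grows only logarithmically in $s$, while the distance to the vertex decays like $1/s$.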

Theorems & Definitions (14)

Example 2.1
Proposition 2.2: Boll:2024aa
Lemma 3.1: convex combination of embedded nodewise measures
proof
Proposition 3.2: conditional vector fields
proof
Proposition 3.3: conditional path constraints
proof
Lemma 3.4: orthogonal projection onto $\mathop{\mathrm{img}}\nolimits(Q)\cap \mathcal{T}_0\mathcal{S}_N$
Theorem 3.5: projected flow matching on $\mathcal{S}_N$
...and 4 more
