Categorical Flow Maps

Daan Roos, Oscar Davis, Floor Eijkelboom, Michael Bronstein, Max Welling, İsmail İlkan Ceylan, Luca Ambrogioni, Jan-Willem van de Meent

TL;DR

Categorical Flow Maps (CFM) introduce a variational, endpoint-driven flow-matching framework to accelerate generation for discrete data by transporting a continuous prior to a discrete target on the probability simplex. By enforcing simplex-constrained endpoint predictions and leveraging Lagrangian self-distillation with endpoint consistency (ECLD), CFMs enable effective self-distillation in discrete domains and support test-time guidance via flow-map lookahead. Empirically, CFMs achieve state-of-the-art or competitive few-step generation across graphs (QM9, ZINC), images (binary MNIST), and text (Text8, LM1B), often matching or approaching multi-step baselines with far fewer steps. The approach unifies continuous-domain consistency techniques with discrete data, enabling efficient, steerable generation and practical deployment in diverse modalities.

Abstract

We introduce Categorical Flow Maps, a flow-matching method for accelerated few-step generation of categorical data via self-distillation. Building on recent variational formulations of flow matching and the broader trend towards accelerated inference in diffusion and flow-based models, we define a flow map towards the simplex that transports probability mass toward a predicted endpoint, yielding a parametrisation that naturally constrains model predictions. Since our trajectories are continuous rather than discrete, Categorical Flow Maps can be trained with existing distillation techniques, as well as a new objective based on endpoint consistency. This continuous formulation also automatically unlocks test-time inference: we can directly reuse existing guidance and reweighting techniques in the categorical setting to steer sampling toward downstream objectives. Empirically, we achieve state-of-the-art few-step results on images, molecular graphs, and text, with strong performance even in single-step generation.
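The idea of transporting probability mass toward a predicted endpoint can be illustrated with a small sketch. It is not the authors' implementation. It assumes the flow map takes the straight-line form $X_{s,t}(x) = (1-\gamma_{s,t})\,x + \gamma_{s,t}\,\pi_{s,t}(x)$ with $\gamma_{s,t} = (t-s)/(1-s)$, that the starting point already lies on the simplex, and it uses a hypothetical `toy_endpoint` function in place of the learned endpoint predictor $\pi_{s,t}$. All function names are illustrative.

```python
import math
import random


def gamma(s, t):
    """Interpolation weight gamma_{s,t} = (t - s) / (1 - s), for 0 <= s <= t <= 1 and s < 1."""
    return (t - s) / (1.0 - s)


def softmax(logits):
    """Numerically stable softmax; the output is a point on the probability simplex."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_endpoint(x, s, t):
    """Hypothetical stand-in for the learned endpoint predictor pi_{s,t}.

    It sharpens x toward its largest coordinate and ignores s and t.
    """
    return softmax([8.0 * v for v in x])


def flow_map_step(x_s, s, t, endpoint_fn):
    """Assumed endpoint parametrisation: a convex combination of x_s and the predicted endpoint."""
    g = gamma(s, t)
    pi = endpoint_fn(x_s, s, t)
    return [(1.0 - g) * a + g * b for a, b in zip(x_s, pi)]


def sample(x0, num_steps, endpoint_fn):
    """Few-step sampling: chain flow-map jumps over a uniform time grid on [0, 1]."""
    x = list(x0)
    for k in range(num_steps):
        s, t = k / num_steps, (k + 1) / num_steps
        x = flow_map_step(x, s, t, endpoint_fn)
    return x


random.seed(0)
raw = [random.random() for _ in range(4)]
x0 = [v / sum(raw) for v in raw]  # assumption: the start point is normalised onto the simplex

x1_one = sample(x0, 1, toy_endpoint)   # single-step generation
x1_four = sample(x0, 4, toy_endpoint)  # four-step generation
```

Because each jump is a convex combination of two simplex points, every intermediate state stays on the simplex under these assumptions, and the final jump (where $t = 1$, so $\gamma_{s,t} = 1$) returns the endpoint prediction itself. With a fixed endpoint, two chained jumps agree with a single direct jump, which is the kind of consistency that self-distillation objectives exploit.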

Paper Structure

This paper contains 53 sections, 5 theorems, 66 equations, 10 figures, 2 tables, 4 algorithms.

Key Result

Proposition 3.1

Under the linear VFM decoder and the induced flow map from eq:flowmap_endpoint_param, define the endpoint consistency loss with $w_t = (1-t)^{-2}$, and the temporal drift term where $\gamma_{s,t} = (t-s)/(1-s)$. Then

Figures (10)

  • Figure 1: Overview of our method: CFM instantaneous velocity sampling (left), CFM flow-map sampling (middle), and Lagrangian training (right), plus their corresponding paths on a probability simplex. The red arrows denote the instantaneous velocity induced by $\pi_{t, t}$, the green arrows denote the flow map induced by $\pi_{s, t}$, and the yellow arrow denotes the time derivative of the flow map.
  • Figure 2: Sample quality versus NFEs on QM9 (two leftmost panels) and ZINC (two rightmost panels), showing the fractions of valid and unique molecules ($\uparrow$), and FCD ($\downarrow$). We compare our methods (CSD, ECLD) against the unconstrained Flow Map baseline (Naive).
  • Figure 3: Samples from our categorical flow map on binary MNIST: (left) unconditional and (right) tilted by the classifier reward towards zeroes.
  • Figure 4: Text8: NFE against NLL as measured by GPT-J-6B. The numbers above the points are the corresponding entropies on the token space of the GPT model.
  • Figure 5: LM1B: NFE against Gen-PPL as measured by GPT-2. The numbers above are the corresponding entropies on the token space of the GPT model.
  • ...and 5 more figures

Theorems & Definitions (9)

  • Proposition 3.1: ECLD controls the Lagrangian residual
  • Corollary 3.2: Cross-entropy ECLD
  • Remark 3.3
  • Proposition 2.1
  • proof
  • Proposition 2.2: ECLD controls the Lagrangian residual (linear VFM decoder), reverse KL
  • proof
  • Proposition 3.1: Geometric Confinement
  • proof