Categorical Flow Maps
Daan Roos, Oscar Davis, Floor Eijkelboom, Michael Bronstein, Max Welling, İsmail İlkan Ceylan, Luca Ambrogioni, Jan-Willem van de Meent
TL;DR
Categorical Flow Maps (CFMs) are a variational, endpoint-driven flow-matching framework that accelerates generation of discrete data by transporting a continuous prior to a discrete target on the probability simplex. By constraining endpoint predictions to the simplex and using Lagrangian self-distillation with endpoint consistency (ECLD), CFMs make self-distillation effective in discrete domains and support test-time guidance via flow-map lookahead. Empirically, CFMs achieve state-of-the-art or competitive few-step generation on graphs (QM9, ZINC), images (binary MNIST), and text (Text8, LM1B), often matching or approaching multi-step baselines with far fewer steps. The approach brings continuous-domain consistency techniques to discrete data, yielding efficient, steerable generation across these modalities.
Abstract
We introduce Categorical Flow Maps, a flow-matching method for accelerated few-step generation of categorical data via self-distillation. Building on recent variational formulations of flow matching and the broader trend towards accelerated inference in diffusion and flow-based models, we define a flow map towards the simplex that transports probability mass toward a predicted endpoint, yielding a parametrisation that naturally constrains model predictions. Since our trajectories are continuous rather than discrete, Categorical Flow Maps can be trained with existing distillation techniques, as well as a new objective based on endpoint consistency. This continuous formulation also automatically unlocks test-time inference: we can directly reuse existing guidance and reweighting techniques in the categorical setting to steer sampling toward downstream objectives. Empirically, we achieve state-of-the-art few-step results on images, molecular graphs, and text, with strong performance even in single-step generation.
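To make the endpoint parametrisation concrete, here is a minimal sketch of a flow map that moves a state toward a predicted endpoint on the simplex. It is an illustration under stated assumptions, not the paper's implementation: the interpolation weight (t - s) / (1 - s), the Gaussian prior, and the helper `toy_endpoint_logits` (a stand-in for a trained network) are all hypothetical choices that the abstract does not specify.

```python
import math
import random


def softmax(logits):
    """Map real-valued logits to a point on the probability simplex."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_endpoint_logits(x, s, t):
    """Hypothetical stand-in for a learned network; not from the paper."""
    scale = 1.0 + 4.0 * s
    return [scale * xi for xi in x]


def flow_map(x, s, t, endpoint_logits=toy_endpoint_logits):
    """Move x from time s to time t by stepping toward a predicted endpoint.

    Assumed form: X_{s,t}(x) = (1 - w) * x + w * endpoint, w = (t - s) / (1 - s).
    """
    if not (0.0 <= s <= t <= 1.0 and s < 1.0):
        raise ValueError("require 0 <= s <= t <= 1 and s < 1")
    endpoint = softmax(endpoint_logits(x, s, t))  # constrained to the simplex
    w = (t - s) / (1.0 - s)  # 0 when t == s, 1 when t == 1
    return [(1.0 - w) * xi + w * ei for xi, ei in zip(x, endpoint)]


def sample_one_step(num_classes, rng):
    """Single-step generation: draw noise, jump to t = 1, return the argmax."""
    x0 = [rng.gauss(0.0, 1.0) for _ in range(num_classes)]  # assumed prior
    probs = flow_map(x0, 0.0, 1.0)
    label = max(range(num_classes), key=lambda k: probs[k])
    return label, probs
```

With this form, the map is the identity at t = s and returns the predicted endpoint itself at t = 1, so a one-step sample is always a valid categorical distribution regardless of where the noise started. For example, `sample_one_step(4, random.Random(0))` returns a class index together with a probability vector that sums to one.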
