$\alpha$-Flow: A Unified Framework for Continuous-State Discrete Flow Matching Models

Chaoran Cheng; Jiahan Li; Jiajun Fan; Ge Liu

TL;DR

This work presents a unified framework for Continuous-State Discrete Flow Matching (CS-DFM) models, under which the existing variants can be understood as operating on different $\alpha$-representations of probabilities. It also introduces $\alpha$-Flow, a family of CS-DFM models that adheres to the canonical $\alpha$-geometry of the statistical manifold, and demonstrates its optimality in minimizing the generalized kinetic energy.

Abstract

Recent efforts have extended the flow-matching framework to discrete generative modeling. One strand of models works directly with continuous probabilities instead of discrete tokens; we colloquially refer to it as Continuous-State Discrete Flow Matching (CS-DFM). Existing CS-DFM models differ significantly in their representations and geometric assumptions. This work presents a unified framework for CS-DFM models, under which the existing variants can be understood as operating on different $\alpha$-representations of probabilities. Building upon the theory of information geometry, we introduce $\alpha$-Flow, a family of CS-DFM models that adheres to the canonical $\alpha$-geometry of the statistical manifold, and demonstrate its optimality in minimizing the generalized kinetic energy. Theoretically, we show that the flow matching loss for $\alpha$-Flow establishes a unified variational bound for the discrete negative log-likelihood. We comprehensively evaluate different instantiations of $\alpha$-Flow, including intermediate values of $\alpha$ whose geometries have never been explored before, on various discrete generation domains to demonstrate their effectiveness in discrete generative modeling. $\alpha$-Flow significantly outperforms its discrete-state counterpart in image and protein sequence generation and better captures the entropy in language modeling.
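To make the notion of an $\alpha$-representation concrete, here is a minimal sketch assuming the standard convention from information geometry (Amari), in which a probability $p$ is mapped to $\frac{2}{1-\alpha} p^{(1-\alpha)/2}$ for $\alpha \neq 1$ and to $\log p$ for $\alpha = 1$. The abstract does not state the exact parameterization, so the paper's own convention may differ by scaling or sign; the function name `alpha_representation` is hypothetical.

```python
import math


def alpha_representation(p, alpha):
    """Map a strictly positive categorical distribution p to its
    alpha-representation (standard information-geometry convention,
    assumed here; not taken from the paper itself)."""
    if any(x <= 0 for x in p):
        raise ValueError("probabilities must be strictly positive")
    if alpha == 1:
        # Limiting case: the logarithmic (exponential-family) representation.
        return [math.log(x) for x in p]
    exponent = (1 - alpha) / 2
    # x**exponent / exponent equals (2 / (1 - alpha)) * x**((1 - alpha) / 2).
    return [x ** exponent / exponent for x in p]


p = [0.5, 0.25, 0.25]
mixture = alpha_representation(p, -1)  # probabilities themselves
sphere = alpha_representation(p, 0)    # 2 * sqrt(p), a point on a sphere
log_rep = alpha_representation(p, 1)   # log-probabilities
```

Under this convention, $\alpha = -1$ leaves the probabilities unchanged, $\alpha = 0$ sends them to a sphere of radius 2 via the square-root map, and $\alpha = 1$ gives log-probabilities; intermediate values of $\alpha$ interpolate between these representations.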

