Discrete Markov Bridge

Hengli Li, Yuxuan Wang, Song-Chun Zhu, Ying Nian Wu, Zilong Zheng

TL;DR

The paper introduces the Discrete Markov Bridge (DMB), a variational framework for learning discrete representations by unifying discrete diffusion with latent-variable learning. It decomposes learning into forward Matrix-learning, which learns an adaptive, diagonalizable rate-transition matrix $Q_\alpha$, and backward Score-learning, which trains a neural score to construct the inverse dynamics, optimizing a continuous-time ELBO. The authors provide formal guarantees for the forward process (validity and accessibility) and convergence of the overall CTDMB algorithm, plus practical strategies for efficient matrix exponentiation and space usage. Empirically, the method achieves an ELBO of 1.38 on Text8 and delivers competitive results on CIFAR-10, illustrating its effectiveness and versatility for discrete data modalities. Overall, DMB offers a principled, scalable approach to discrete representation learning with strong theoretical foundations and broad applicability.
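To illustrate why a diagonalizable rate matrix makes matrix exponentiation cheap, here is a minimal sketch. It does not use the paper's parameterization of $Q_\alpha$: the symmetric rate matrix, the column-vector convention (columns sum to zero), and the helper names `make_symmetric_rate_matrix` and `transition_matrix` are all assumptions chosen for illustration. Once $Q = V \Lambda V^\top$ is known, every transition matrix $e^{tQ} = V e^{t\Lambda} V^\top$ costs only an elementwise exponential and two matrix products.

```python
import numpy as np


def make_symmetric_rate_matrix(n, seed=0):
    """Build a hypothetical symmetric rate matrix (illustration only).

    Off-diagonal entries are positive rates; the diagonal is set so that
    every column sums to zero, which is what a valid generator needs in
    the column-vector convention dp/dt = Q p.
    """
    rng = np.random.default_rng(seed)
    a = rng.uniform(0.1, 1.0, size=(n, n))
    q = (a + a.T) / 2.0                   # symmetric, hence diagonalizable
    np.fill_diagonal(q, 0.0)
    np.fill_diagonal(q, -q.sum(axis=0))   # columns now sum to zero
    return q


def transition_matrix(eigvals, eigvecs, t):
    """Return exp(t Q) from a precomputed eigendecomposition of Q."""
    # eigvecs * exp(t * eigvals) scales column k by exp(t * lambda_k),
    # i.e. it computes V @ diag(exp(t * lambda)).
    return (eigvecs * np.exp(t * eigvals)) @ eigvecs.T


q = make_symmetric_rate_matrix(5)
eigvals, eigvecs = np.linalg.eigh(q)      # one-off decomposition
p_half = transition_matrix(eigvals, eigvecs, 0.5)
```

The decomposition is computed once and reused for every time $t$, which is the point of restricting the matrix to a diagonalizable family. A learned, non-symmetric matrix would need `V` and its inverse instead of `V` and its transpose.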

Abstract

Discrete diffusion has recently emerged as a promising paradigm in discrete data modeling. However, existing methods typically rely on a fixed rate transition matrix during training, which not only limits the expressiveness of latent representations, a fundamental strength of variational methods, but also constrains the overall design space. To address these limitations, we propose Discrete Markov Bridge, a novel framework specifically designed for discrete representation learning. Our approach is built upon two key components: Matrix Learning and Score Learning. We conduct a rigorous theoretical analysis, establishing formal performance guarantees for Matrix Learning and proving the convergence of the overall framework. Furthermore, we analyze the space complexity of our method, addressing practical constraints identified in prior studies. Extensive empirical evaluations validate the effectiveness of the proposed Discrete Markov Bridge, which achieves an Evidence Lower Bound (ELBO) of 1.38 on the Text8 dataset, outperforming established baselines. Moreover, the proposed model demonstrates competitive performance on the CIFAR-10 dataset, achieving results comparable to those obtained by image-specific generation approaches.

Paper Structure

This paper contains 42 sections, 20 theorems, 76 equations, 1 figure, 2 tables, 1 algorithm.

Key Result

Theorem 3.1

Given the Forward Kolmogorov Equation of a CTDMC, there exists a reverse CTDMC that satisfies its own Forward Kolmogorov Equation.
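For orientation, this reversibility result is commonly written as below, following the column-vector convention of Lou et al. (2024), one of the works the theorem cites. This is an assumed rendering: the symbols $p_t$, $Q_t$, $\bar{Q}_t$ and the horizon $T$ are not taken from the paper's own statement, and its notation may differ.

$$
\frac{dp_t}{dt} = Q_t\, p_t
\qquad\Longrightarrow\qquad
\frac{dp_{T-t}}{dt} = \bar{Q}_{T-t}\, p_{T-t},
$$

$$
\bar{Q}_t(y, x) = \frac{p_t(y)}{p_t(x)}\, Q_t(x, y) \quad (y \neq x),
\qquad
\bar{Q}_t(x, x) = -\sum_{y \neq x} \bar{Q}_t(y, x).
$$

In this form the reverse rate matrix depends on the data only through the ratios $p_t(y)/p_t(x)$, which is the quantity a score model is trained to approximate.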

Figures (1)

  • Figure 1: Overview of the CTDMB framework. CTDMB consists of two components: Matrix-learning and Score-learning. The Matrix-learning process is designed to learn an adaptive transition rate matrix, which facilitates the estimation of an adapted latent distribution. Concurrently, the Score-learning process focuses on estimating the probability ratio necessary for constructing the inverse transition rate matrix, thereby enabling the reconstruction of the original data distribution.
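The role of the probability ratio in the caption can be sketched numerically. The example below is an assumption-laden illustration, not the paper's implementation. It uses the standard reverse-rate construction from the discrete diffusion literature in the column-vector convention, with off-diagonal entries `q_bar[y, x] = p[y] / p[x] * q[x, y]`. It also uses exact ratios where a trained score network would supply estimates, and the helper name `reverse_rate_matrix` is hypothetical.

```python
import numpy as np


def reverse_rate_matrix(q, p):
    """Build a reverse-time rate matrix from a forward one (sketch).

    q : forward rate matrix, columns sum to zero (dp/dt = q @ p).
    p : strictly positive marginal distribution at the current time.
    In a learned model, the ratio p[y] / p[x] would be replaced by the
    output of a score network; exact ratios are used here instead.
    """
    ratio = p[:, None] / p[None, :]       # ratio[y, x] = p[y] / p[x]
    q_bar = ratio * q.T                   # q_bar[y, x] = ratio[y, x] * q[x, y]
    np.fill_diagonal(q_bar, 0.0)
    np.fill_diagonal(q_bar, -q_bar.sum(axis=0))  # columns sum to zero
    return q_bar


rng = np.random.default_rng(0)
n = 4

# Hypothetical forward generator with positive off-diagonal rates.
q = rng.uniform(0.1, 1.0, size=(n, n))
np.fill_diagonal(q, 0.0)
np.fill_diagonal(q, -q.sum(axis=0))

p = rng.dirichlet(np.ones(n))             # an arbitrary positive marginal
q_bar = reverse_rate_matrix(q, p)

# Running time backwards flips the sign of the forward derivative:
# q_bar @ p should equal -(q @ p).
residual = np.abs(q_bar @ p + q @ p).max()
```

The check on `residual` reflects the defining property of the reverse process: applying `q_bar` to the marginal reproduces the forward dynamics with the opposite sign.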

Theorems & Definitions (30)

  • Theorem 3.1: Reversibility (Campbell et al., 2022; Lou et al., 2024)
  • Proposition 4.1: Conservation of the Sum
  • Theorem 4.2: Accessibility
  • Lemma 4.3
  • Lemma 4.4
  • Lemma 4.5
  • Proposition 4.6: Supervision of Score-learning
  • Theorem 4.7: Convergence of the algorithm
  • Proposition 5.1
  • Proposition 5.2
  • ...and 20 more