Aligned Diffusion Schrödinger Bridges

Vignesh Ram Somnath, Matteo Pariset, Ya-Ping Hsieh, Maria Rodriguez Martinez, Andreas Krause, Charlotte Bunne

TL;DR

SBalign tackles the interpolation problem in diffusion Schrödinger bridges when data is aligned, by integrating Schrödinger bridge theory with Doob's $h$-transform to honor pairings $(\mathbf{x}_0^i, \mathbf{x}_1^i)$. It derives a loss that bypasses the traditional IPF procedure, stabilizes training through $h$-transform regularization, and represents the target process as a mixture of scaled Brownian bridges guided by the optimal coupling $\pi^{\star}$. The framework also uses paired Schrödinger bridges as priors to improve classical SB when pairings are scarce. Empirical results across synthetic datasets, single-cell differentiation, and protein docking show substantial improvements over unaligned baselines, highlighting the practical value of leveraging alignment in diffusion-based trajectory inference.
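To make the "mixture of scaled Brownian bridges" concrete, here is a minimal sketch of how a point on such a mixture could be sampled from aligned pairs. It rests on assumptions that are not taken from the paper: a driftless reference process with constant diffusion coefficient `g`, a unit time horizon, and the observed pairs standing in for the coupling. The function names (`sample_bridge`, `bridge_drift`, `sample_mixture_point`) are hypothetical, and this is not the authors' implementation or loss.

```python
import math
import random


def sample_bridge(x0, x1, t, g=1.0, rng=random):
    """Sample x_t from a Brownian bridge pinned at x0 (t=0) and x1 (t=1).

    Assumes a constant diffusion coefficient g (illustrative assumption), so
    x_t ~ N((1 - t) * x0 + t * x1, g^2 * t * (1 - t) * I).
    """
    std = g * math.sqrt(t * (1.0 - t))
    return [(1.0 - t) * a + t * b + std * rng.gauss(0.0, 1.0)
            for a, b in zip(x0, x1)]


def bridge_drift(x_t, x1, t):
    """Drift of the same bridge, (x1 - x_t) / (1 - t); defined for t < 1."""
    return [(b - x) / (1.0 - t) for x, b in zip(x_t, x1)]


def sample_mixture_point(pairs, g=1.0, rng=random):
    """Pick one aligned pair uniformly, then a time and a point on its bridge.

    Using the observed pairs as the coupling is an assumption of this sketch.
    """
    x0, x1 = rng.choice(pairs)
    t = rng.uniform(0.0, 0.99)  # stay away from t = 1, where the drift blows up
    x_t = sample_bridge(x0, x1, t, g=g, rng=rng)
    return t, x_t, bridge_drift(x_t, x1, t)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy aligned data: each x1 is its x0 shifted by 3 along the first axis.
    pairs = [([0.0, 0.0], [3.0, 0.0]), ([1.0, 2.0], [4.0, 2.0])]
    t, x_t, drift = sample_mixture_point(pairs, g=0.5, rng=rng)
    print(t, x_t, drift)
```

Triples `(t, x_t, drift)` of this kind could serve as regression targets for a learned drift, but how the paper actually constructs its objective is not specified in this summary.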

Abstract

Diffusion Schrödinger bridges (DSB) have recently emerged as a powerful framework for recovering stochastic dynamics via their marginal observations at different time points. Despite numerous successful applications, existing algorithms for solving DSBs have so far failed to utilize the structure of aligned data, which naturally arises in many biological phenomena. In this paper, we propose a novel algorithmic framework that, for the first time, solves DSBs while respecting the data alignment. Our approach hinges on a combination of two decades-old ideas: the classical Schrödinger bridge theory and Doob's $h$-transform. Compared to prior methods, our approach leads to a simpler training procedure with lower variance, which we further augment with principled regularization schemes. This ultimately leads to sizeable improvements across experiments on synthetic and real data, including the tasks of predicting conformational changes in proteins and temporal evolution of cellular differentiation processes.
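
As a hedged illustration of how Doob's $h$-transform and Brownian bridges fit together, consider the simplest case of a driftless reference process with constant diffusion coefficient $g$ on the time interval $[0, 1]$. Both choices are assumptions made here for illustration; the paper's setting may be more general. Conditioning the reference process on its endpoint $\mathbf{x}_1$ adds the drift $g^2 \nabla_{\mathbf{x}} \log h$, which in this case is the familiar Brownian bridge drift:

```latex
\begin{align*}
\mathrm{d}\mathbf{X}_t &= g\,\mathrm{d}\mathbf{W}_t, \qquad \mathbf{X}_0 = \mathbf{x}_0
  && \text{(reference process)} \\
h(\mathbf{x}, t) &= p(\mathbf{X}_1 = \mathbf{x}_1 \mid \mathbf{X}_t = \mathbf{x})
  \propto \exp\!\left(-\frac{\lVert \mathbf{x}_1 - \mathbf{x} \rVert^2}{2 g^2 (1 - t)}\right)
  && \text{(as a function of } \mathbf{x}\text{)} \\
\mathrm{d}\mathbf{X}_t &= g^2\, \nabla_{\mathbf{x}} \log h(\mathbf{X}_t, t)\,\mathrm{d}t + g\,\mathrm{d}\mathbf{W}_t
  = \frac{\mathbf{x}_1 - \mathbf{X}_t}{1 - t}\,\mathrm{d}t + g\,\mathrm{d}\mathbf{W}_t
  && \text{(conditioned process)} \\
\mathbf{X}_t \mid \mathbf{x}_0, \mathbf{x}_1 &\sim
  \mathcal{N}\!\big((1 - t)\,\mathbf{x}_0 + t\,\mathbf{x}_1,\; g^2\, t\,(1 - t)\,\mathbf{I}\big)
  && \text{(bridge marginal)}
\end{align*}
```

The conditioned process is pinned at both $\mathbf{x}_0$ and $\mathbf{x}_1$, which is what makes it a natural building block when each $\mathbf{x}_0^i$ comes with a known partner $\mathbf{x}_1^i$.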

Paper Structure (59 sections, 17 equations, 7 figures, 3 tables, 1 algorithm)


Figures (7)

  • Figure 1: Overview of SBalign: In biological tasks such as protein docking, one is naturally provided with aligned data in the form of unbound and bound structures of the participating proteins. Our goal is therefore to recover a stochastic trajectory from the unbound ($\mathbf{x}_0$) to the bound ($\mathbf{x}_1$) structure. To achieve this, we connect the characterization of an SDE conditioned on $\mathbf{x}_0$ and $\mathbf{x}_1$ (using Doob's $h$-transform) with that of a Brownian bridge between $\mathbf{x}_0$ and $\mathbf{x}_1$ (classical Schrödinger bridge theory). We show that this leads to a simpler training procedure with lower variance and strong empirical results.
  • Figure 2: Experimental results on the Moon dataset (a-c) and T-dataset (d-f). The top row shows the trajectory sampled using the learned drift, and the bottom row shows the matching based on the learned drift. Compared to other baselines, SBalign is able to learn an appropriate drift respecting the true alignment. (f) further showcases the utility of SBalign's learned drift as a suitable reference process to improve other training methods.
  • Figure 3: Cell differentiation trajectories based on (a) the ground truth and (b-d) learned drifts. SBalign is able to learn an appropriate drift underlying the true differentiation process while respecting the alignment. (d) Using the learned drift from SBalign as a reference process helps improve the drift learned by other training methods.
  • Figure 4: Cell type prediction on the differentiation dataset. All distributions are plotted on the first two principal components. a-b: Ground truth cell types on day 2 and day 4, respectively. c-d: fbSB and SBalign cell type predictions on day 4. SBalign is able to better model the underlying differentiation processes and capture the diversity in cell types.
  • Figure 5: Initial (blue) and final (red) marginals for the two toy datasets (a) moon and (b) T, together with arrows indicating a few alignments.
  • ...and 2 more figures