A Unified Density Operator View of Flow Control and Merging

Riccardo De Santi, Malte Franke, Ya-Ping Hsieh, Andreas Krause

TL;DR

The paper addresses the challenge of combining and adapting pre-trained flow models for downstream tasks by introducing a unifying probability-space framework of implicit density operators that express intersection, union, and interpolation. It defines a reward-guided objective $\mathcal{G}(p_1^{\pi})=\mathbb{E}_{x\sim p_1^{\pi}}[f(x)]-\sum_{i=1}^n \alpha_i D_i(p_1^{\pi}\|p_1^{pre,i})$ and shows that classic flow merging and reward-tuning are limit cases. The core method, Reward-Guided Flow Merging (RFM), uses a mirror-descent scheme to reduce complex density-operator optimizations to sequential fine-tuning steps, with convergence guarantees for both reward-guided and pure merging. The authors demonstrate the approach on illustrative settings and real-world tasks including high-dimensional molecular design, conformer generation, and image-model merging, achieving controlled trade-offs between safety, diversity, and discovery.
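
As a concrete reading of the objective above, the following sketch evaluates $\mathcal{G}$ on a finite state space. It assumes every $D_i$ is the KL divergence $\mathrm{KL}(p\|p_1^{pre,i})$, which is only one possible choice and not necessarily the one used in the paper. Under that assumption the maximizer has a known closed form, a reward-tilted geometric mixture of the priors. The function names, the toy priors, and the reward values are all hypothetical.

```python
import math

def kl(p, q):
    """KL(p || q) for discrete distributions given as lists of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def objective(p, f, priors, alphas):
    """G(p) = E_p[f] - sum_i alpha_i * KL(p || prior_i) (KL is an assumption)."""
    reward = sum(pi * fi for pi, fi in zip(p, f))
    penalty = sum(a * kl(p, q) for a, q in zip(alphas, priors))
    return reward - penalty

def tilted_geometric_mixture(f, priors, alphas):
    """Maximizer of G when every D_i is KL(p || prior_i):
    p*(x) is proportional to exp(f(x)/A) * prod_i prior_i(x)^(alpha_i/A),
    with A = sum_i alpha_i. Priors must be strictly positive."""
    total = sum(alphas)
    logits = [
        (fx + sum(a * math.log(q[x]) for a, q in zip(alphas, priors))) / total
        for x, fx in enumerate(f)
    ]
    m = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - m) for l in logits]
    z = sum(weights)
    return [w / z for w in weights]

# Hypothetical toy setup: two priors over three states, reward on the middle state.
priors = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
alphas = [1.0, 1.0]
f = [0.0, 1.0, 0.0]

p_star = tilted_geometric_mixture(f, priors, alphas)
uniform = [1.0 / 3.0] * 3
gap_to_uniform = objective(p_star, f, priors, alphas) - objective(uniform, f, priors, alphas)
```

In this KL setting the objective satisfies $\mathcal{G}(p) = \mathcal{G}(p^*) - (\sum_i \alpha_i)\,\mathrm{KL}(p\|p^*)$, so `gap_to_uniform` equals the weighted KL divergence from the uniform distribution to `p_star`.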

Abstract

Recent progress in large-scale flow and diffusion models has raised two fundamental algorithmic challenges: (i) control-based reward adaptation of pre-trained flows, and (ii) integration of multiple models, i.e., flow merging. While current approaches address them separately, we introduce a unifying probability-space framework that subsumes both as limit cases, and enables reward-guided flow merging, allowing principled, task-aware combination of multiple pre-trained flows (e.g., merging priors while maximizing drug-discovery utilities). Our formulation makes it possible to express a rich family of operators over generative model densities, including intersection (e.g., to enforce safety), union (e.g., to compose diverse models), interpolation (e.g., for discovery), their reward-guided counterparts, as well as complex logical expressions via generative circuits. Next, we introduce Reward-Guided Flow Merging (RFM), a mirror-descent scheme that reduces reward-guided flow merging to a sequence of standard fine-tuning problems. Then, we provide first-of-their-kind theoretical guarantees for reward-guided and pure flow merging via RFM. Finally, we showcase the capabilities of the proposed method in illustrative settings that provide visually interpretable insights, and apply our method to high-dimensional de novo molecular design and low-energy conformer generation.
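
To illustrate how a mirror-descent scheme can turn one hard density optimization into a sequence of simpler steps, here is a minimal discrete sketch. It is not the RFM algorithm itself: it assumes a finite state space, KL divergences $\mathrm{KL}(p\|p_1^{pre,i})$ as the regularizers, and exact updates, and all names and numbers are hypothetical. Each iteration solves $\max_q \mathbb{E}_q[g_k] - \tfrac{1}{\eta}\mathrm{KL}(q\|p_k)$, which has the shape of a standard KL-regularized fine-tuning problem with reward $g_k$ and reference model $p_k$.

```python
import math

def normalize_from_logits(logits):
    """Softmax over a list of log-weights, stabilized by subtracting the max."""
    m = max(logits)
    w = [math.exp(l - m) for l in logits]
    z = sum(w)
    return [wi / z for wi in w]

def first_variation(p, f, priors, alphas):
    """First variation of G(p) = E_p[f] - sum_i alpha_i KL(p || prior_i),
    up to an additive constant that normalization removes."""
    return [
        fx - sum(a * (math.log(p[x]) - math.log(q[x])) for a, q in zip(alphas, priors))
        for x, fx in enumerate(f)
    ]

def mirror_descent(f, priors, alphas, step=0.2, iters=200):
    """Entropic mirror ascent on the probability simplex, started from uniform."""
    n = len(f)
    p = [1.0 / n] * n
    for _ in range(iters):
        g = first_variation(p, f, priors, alphas)
        # Closed-form solution of max_q E_q[g] - (1/step) * KL(q || p):
        # q(x) is proportional to p(x) * exp(step * g(x)).
        p = normalize_from_logits([math.log(px) + step * gx for px, gx in zip(p, g)])
    return p

# Hypothetical toy setup: two strictly positive priors over three states.
toy_priors = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
toy_alphas = [1.0, 1.0]
toy_reward = [0.0, 1.0, 0.0]

p_md = mirror_descent(toy_reward, toy_priors, toy_alphas)
```

In this KL-only setting, a step size of `1 / sum(toy_alphas)` reaches the fixed point in a single update, while smaller steps contract toward it geometrically.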

Paper Structure (36 sections, 7 theorems, 48 equations, 8 figures, 3 tables, 3 algorithms)


Key Result

proposition 1

Given $\overline{p}_1^{pre} = \sum_{i=1}^n \alpha_i p_1^{pre,i}/\sum_{i=1}^n \alpha_i$, i.e., the $\alpha$-weighted mixture density of pre-trained models, the following hold:

Figures (8)

  • Figure 1: (\ref{fig:process_drawing}) Pre-trained and fine-tuned policies inducing $\{p_1^{pre,i}\}_{i=1}^n$ and optimal density $p_1^*$ computed via flow merging, i.e., the subcase of Problem \ref{eq:reward_guided_flow_merging_problem} where $f$ is disregarded. (\ref{fig:prob_opt_viewpoint}) Probability-space optimization viewpoint on reward-guided flow merging, as in Problem \ref{eq:reward_guided_flow_merging_problem}.
  • Figure 2: Illustrative settings with visually interpretable results. (top) Balanced pure intersection of flow models (\ref{fig:toy_top_b}) and reward-guided intersection (\ref{fig:toy_top_c}); (mid) balanced and unbalanced flow union; (bottom) pure and reward-guided flow interpolation. Crucially, RFM correctly implements these practically relevant and diverse operators with a high degree of expressivity (e.g., through $\alpha$ and reward guidance).
  • Figure 3: Drug-like molecules generated by $\pi_{AND}^*$ flow via RFM.
  • Figure 4: (top) RFM implements a generative circuit (\ref{fig:toy2_top_d}) describing a complex logical expression ($\pi^* = (\pi_1 \land \pi_2) \lor (\pi_3 \land \pi_4)$) by computing sequential operators (\ref{fig:toy2_top_a}-\ref{fig:toy2_top_c}). (bottom) RFM computes a flow intersection $\pi^*$ that generates drug molecules with desired energy levels.
  • Figure 5: RFM can perform balanced (B), unbalanced (UB), and reward-guided (RG) intersections, as well as unions (UNION), of ETFlow \cite{hassan2024etflow} conformer generation models. We evaluate the resulting flows in terms of median absolute errors of energy (\ref{fig:conf_energy}), dipole moment (\ref{fig:conf_dipole}), HOMO–LUMO gap (\ref{fig:conf_eps}), and minimum energy (\ref{fig:conf_min_energy}). These results demonstrate the ability of RFM to compute new flow models whose properties predictably interpolate those of the available pre-trained flows.
  • ...and 3 more figures

Theorems & Definitions (10)

  • proposition 1: Union operator via Pre-trained Mixture Density Representation
  • lemma 5.0: First Variation of Flow Process Functional
  • theorem 6.0: SOC Retains Score Information
  • theorem 6.0: Asymptotic convergence under inexact updates (Informal)
  • theorem B.0: SOC Retains Score Information
  • proof
  • theorem B.0: Convergence guarantee in the trajectory setting
  • proof
  • proposition 1: Union operator via Pre-trained Mixture Density Representation
  • proof