A Unified Density Operator View of Flow Control and Merging
Riccardo De Santi, Malte Franke, Ya-Ping Hsieh, Andreas Krause
TL;DR
Addresses the challenge of combining and adapting pre-trained flow models for downstream tasks by introducing a unifying probability-space framework of implicit density operators that express intersection, union, and interpolation. It defines a reward-guided objective $\mathcal{G}(p_1^{\pi})=\mathbb{E}_{x\sim p_1^{\pi}}[f(x)]-\sum_{i=1}^n \alpha_i D_i(p_1^{\pi}\|p_1^{pre,i})$ and shows how classic flow merging and reward-tuning are limit cases. The core method, Reward-Guided Flow Merging (RFM), uses a mirror-descent scheme to reduce complex density-operator optimizations to sequential fine-tuning steps, with convergence guarantees for both reward-guided and pure merging. The authors demonstrate the approach on illustrative settings and real-world tasks including high-dimensional molecular design, conformer generation, and image-model merging, achieving controlled trade-offs between safety, diversity, and discovery.
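To make the objective concrete, consider a minimal finite-state sketch. It assumes every $D_i$ is the KL divergence $\mathrm{KL}(p\|p^{pre,i})$, which is only one possible choice and is not confirmed by the text above as the paper's setting. Under that assumption the maximizer of $\mathcal{G}$ has the closed form $p^\star(x)\propto \exp\big(f(x)/A\big)\prod_i p^{pre,i}(x)^{\alpha_i/A}$ with $A=\sum_i\alpha_i$: a reward-tilted geometric mean of the priors, which behaves like an intersection because any state that one prior considers unlikely is suppressed. The rewards, priors, weights, and function names below are illustrative and do not come from the paper.

```python
import math

def kl(p, q):
    """KL(p || q) for two discrete distributions with positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def objective(p, f, priors, alphas):
    """G(p) = E_p[f] - sum_i alpha_i * KL(p || prior_i) (KL choice is an assumption)."""
    reward = sum(pi * fi for pi, fi in zip(p, f))
    penalty = sum(a * kl(p, q) for a, q in zip(alphas, priors))
    return reward - penalty

def kl_optimum(f, priors, alphas):
    """Closed-form maximizer: p*(x) ∝ exp(f(x)/A) * prod_i prior_i(x)^(alpha_i/A)."""
    total_alpha = sum(alphas)
    logits = [
        (f[x] + sum(a * math.log(q[x]) for a, q in zip(alphas, priors))) / total_alpha
        for x in range(len(f))
    ]
    shift = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - shift) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Toy problem on four states (all values are made up for illustration).
f = [0.0, 1.0, 2.0, 0.5]
priors = [[0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4]]
alphas = [1.0, 2.0]

p_star = kl_optimum(f, priors, alphas)
print("p* =", [round(v, 4) for v in p_star])
print("G(p*) =", objective(p_star, f, priors, alphas))
print("G(prior_0) =", objective(priors[0], f, priors, alphas))
```

Because $\mathcal{G}$ is strictly concave in $p$ under the KL assumption, `p_star` scores higher than any other distribution on the same states, including each prior. Letting $f\equiv 0$ gives pure merging, and using a single prior gives KL-regularized reward tuning, which mirrors the limit cases mentioned above.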
Abstract
Recent progress in large-scale flow and diffusion models has raised two fundamental algorithmic challenges: (i) control-based reward adaptation of pre-trained flows, and (ii) integration of multiple models, i.e., flow merging. While current approaches address them separately, we introduce a unifying probability-space framework that subsumes both as limit cases and enables reward-guided flow merging, allowing principled, task-aware combination of multiple pre-trained flows (e.g., merging priors while maximizing drug-discovery utilities). Our formulation makes it possible to express a rich family of operators over generative model densities, including intersection (e.g., to enforce safety), union (e.g., to compose diverse models), interpolation (e.g., for discovery), their reward-guided counterparts, as well as complex logical expressions via generative circuits. We then introduce Reward-Guided Flow Merging (RFM), a mirror-descent scheme that reduces reward-guided flow merging to a sequence of standard fine-tuning problems, and provide first-of-their-kind theoretical guarantees for reward-guided and pure flow merging via RFM. Finally, we showcase the capabilities of the proposed method on illustrative settings that provide visually interpretable insights, and apply it to high-dimensional de-novo molecular design and low-energy conformer generation.
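The reduction to a sequence of fine-tuning problems can be illustrated with a finite-state analogue. The sketch below is not the RFM algorithm itself, which operates on flow models; it assumes KL divergences to the priors and an entropic mirror map, and all names and numbers are hypothetical. Each mirror-descent step solves $\max_{p'}\ \langle g_k, p'\rangle - \tfrac{1}{\eta}\mathrm{KL}(p'\|p_k)$, where $g_k$ is the first variation of the objective at the current iterate $p_k$. This has the same form as KL-regularized reward fine-tuning started from the current model, with $g_k$ playing the role of the reward.

```python
import math

def softmax(logits):
    """Normalize exp(logits) into a probability vector (stable version)."""
    shift = max(logits)
    weights = [math.exp(l - shift) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

def first_variation(p, f, priors, alphas):
    """Gradient of G(p) = <f, p> - sum_i alpha_i KL(p || q_i), up to an additive constant."""
    return [
        f[x] - sum(a * (math.log(p[x]) - math.log(q[x])) for a, q in zip(alphas, priors))
        for x in range(len(p))
    ]

def mirror_step(p, grad, eta):
    """One entropic mirror step: argmax_p' <grad, p'> - (1/eta) KL(p' || p)."""
    return softmax([math.log(p[x]) + eta * grad[x] for x in range(len(p))])

def mirror_descent(f, priors, alphas, eta=0.2, steps=200):
    """Iterate from the first prior; each step is one 'fine-tuning' subproblem."""
    p = list(priors[0])
    for _ in range(steps):
        p = mirror_step(p, first_variation(p, f, priors, alphas), eta)
    return p

# Toy problem on four states (all values are made up for illustration).
f = [0.0, 1.0, 2.0, 0.5]
priors = [[0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4]]
alphas = [1.0, 2.0]

p_md = mirror_descent(f, priors, alphas)

# Closed-form maximizer under the same KL assumption, for comparison.
total_alpha = sum(alphas)
p_closed = softmax([
    (f[x] + sum(a * math.log(q[x]) for a, q in zip(alphas, priors))) / total_alpha
    for x in range(len(f))
])
print("max abs gap:", max(abs(a - b) for a, b in zip(p_md, p_closed)))
```

In this toy setting the log-probabilities contract toward the fixed point by a factor $1-\eta\sum_i\alpha_i$ per step, so the iterates converge whenever $0<\eta\sum_i\alpha_i<2$. The guarantees stated in the abstract concern the flow-model setting and are not reproduced by this sketch.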
