Logical Guidance for the Exact Composition of Diffusion Models

Francesco Alesiani, Jonathan Warrell, Tanja Bien, Henrik Christiansen, Matheus Ferraz, Mathias Niepert

TL;DR

LoGDiff addresses the lack of principled support for complex Boolean guidance in diffusion models by deriving an exact Boolean calculus that composes atomic guidance signals under circuit-structured constraints. It replaces fixed weights with probability-dependent coefficients, enabling exact posterior-based guidance for conjunctions, disjunctions, and negations through recursive rules; the framework also provides completeness results for common distribution classes via compilability into probabilistic circuits. The approach is instantiated as a hybrid strategy that combines classifier-free scores with posterior estimates, and is demonstrated on both image generation tasks and protein-ligand design, including repulsive guidance to reduce class confusion. Overall, LoGDiff enables expressive, inference-time constraint satisfaction with diffusion models, with practical impact for controllable generation in vision and drug design.
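
As a sketch of where probability-dependent coefficients can come from (the notation here is assumed for illustration and may differ from the paper's), write $\pi_\varphi = p_t(\varphi\mid\bm{x})$ for the posterior of a formula and $g_\varphi = \nabla_{\bm{x}}\log \pi_\varphi$ for its guidance term. Applying the chain rule to $1-\pi_a$, $\pi_a\pi_b$, $\pi_a+\pi_b$, and $\pi_a+\pi_b-\pi_a\pi_b$ respectively gives:

$$
\begin{aligned}
g_{\lnot a} &= -\frac{\pi_a}{1-\pi_a}\, g_a, \\
g_{a\land b} &= g_a + g_b && (a,\ b \text{ conditionally independent given } \bm{x}), \\
g_{a\lor b} &= \frac{\pi_a\, g_a + \pi_b\, g_b}{\pi_a+\pi_b} && (a,\ b \text{ mutually exclusive}), \\
g_{a\lor b} &= \frac{\pi_a(1-\pi_b)\, g_a + \pi_b(1-\pi_a)\, g_b}{\pi_a+\pi_b-\pi_a\pi_b} && (a,\ b \text{ conditionally independent given } \bm{x}).
\end{aligned}
$$

In each disjunction and negation rule the weights on the atomic terms depend on the current posteriors, so they change with $\bm{x}$ and $t$ rather than being fixed constants.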

Abstract

We propose LoGDiff (Logical Guidance for the Exact Composition of Diffusion Models), a guidance framework for diffusion models that enables principled constrained generation with complex logical expressions at inference time. We study when exact score-based guidance for complex logical formulas can be obtained from guidance signals associated with atomic properties. First, we derive an exact Boolean calculus that provides a sufficient condition for exact logical guidance. Specifically, if a formula admits a circuit representation in which conjunctions combine conditionally independent subformulas and disjunctions combine subformulas that are either conditionally independent or mutually exclusive, exact logical guidance is achievable. In this case, the guidance signal can be computed exactly from atomic scores and posterior probabilities using an efficient recursive algorithm. Moreover, we show that, for commonly encountered classes of distributions, any desired Boolean formula is compilable into such a circuit representation. Second, by combining atomic guidance scores with posterior probability estimates, we introduce a hybrid guidance approach that bridges classifier guidance and classifier-free guidance, applicable to both compositional logical guidance and standard conditional generation. We demonstrate the effectiveness of our framework on multiple image and protein structure generation tasks.
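
The recursive computation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the node classes (`Atom`, `Not`, `And`, `Or`), the `evaluate` function, and the `atoms` dictionary are hypothetical names introduced here. Each atom is assumed to supply a posterior estimate $p_t(c\mid\bm{x})$ and the gradient of its log-posterior; in a hybrid setup the gradient could, for instance, be taken as the difference between a conditional and an unconditional score (which follows from Bayes' rule), while the posterior would come from a separate estimator. The combination rules follow from the chain rule under the independence or exclusivity assumptions stated in the abstract.

```python
from dataclasses import dataclass
import numpy as np

# Hypothetical formula nodes, introduced only for this illustration.
@dataclass(frozen=True)
class Atom:
    name: str

@dataclass(frozen=True)
class Not:
    child: object

@dataclass(frozen=True)
class And:  # children assumed conditionally independent given x
    left: object
    right: object

@dataclass(frozen=True)
class Or:   # children assumed mutually exclusive if exclusive=True,
    left: object          # otherwise conditionally independent given x
    right: object
    exclusive: bool = False

def evaluate(node, atoms):
    """Return (posterior, gradient of log-posterior) for a formula node.

    `atoms` maps an atom name to (pi, g): the posterior estimate and the
    gradient of its log with respect to x at the current x and t.
    """
    if isinstance(node, Atom):
        pi, g = atoms[node.name]
        return float(pi), np.asarray(g, dtype=float)
    if isinstance(node, Not):
        pi, g = evaluate(node.child, atoms)
        # grad log(1 - pi) = -(pi / (1 - pi)) * grad log(pi); undefined at pi == 1
        return 1.0 - pi, -(pi / (1.0 - pi)) * g
    pa, ga = evaluate(node.left, atoms)
    pb, gb = evaluate(node.right, atoms)
    if isinstance(node, And):
        # p(a and b | x) = pa * pb, so the log-gradients add
        return pa * pb, ga + gb
    if isinstance(node, Or):
        if node.exclusive:
            # p(a or b | x) = pa + pb
            p = pa + pb
            return p, (pa * ga + pb * gb) / p
        # p(a or b | x) = pa + pb - pa * pb
        p = pa + pb - pa * pb
        return p, (pa * (1.0 - pb) * ga + pb * (1.0 - pa) * gb) / p
    raise TypeError(f"unsupported node type: {type(node).__name__}")

# Toy inputs (made-up numbers, not results from the paper).
atoms = {
    "a": (0.5, np.array([1.0, 0.0])),
    "b": (0.25, np.array([0.0, 2.0])),
}
p_and_not, g_and_not = evaluate(And(Atom("a"), Not(Atom("b"))), atoms)
p_or, g_or = evaluate(Or(Atom("a"), Atom("b")), atoms)
```

The returned gradient would be added to the unconditional score during sampling, possibly scaled by a guidance weight. A practical version would also need to clip posteriors away from 0 and 1 to keep the coefficients finite.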

Paper Structure

This paper contains 38 sections, 5 theorems, 46 equations, 18 figures, 11 tables, 2 algorithms.

Key Result

Proposition 1

Let $\varphi$ be a propositional formula over atoms $\{c_i\}$. Suppose that $\varphi$ admits a circuit representation whose internal nodes are $\land$, $\lor$, and $\lnot$, and whose $\land$- and $\lor$-nodes satisfy, for every $t\in(0,T]$ and every $\bm{x}\in\mathcal{X}$: (i) each $\land$-node combines conditionally independent subformulas; (ii) each $\lor$-node combines subformulas that are either conditionally independent or mutually exclusive. Assume furthermore that $\widehat{\pi}(\psi)=p_t(\psi\mid \bm{x})$ for all subformulas $\psi$ of $\varphi$, for every $t\in(0

Figures (18)

  • Figure 1: Logical Compositional Guidance. Visualization of logical composition using logical scores $s_t(\varphi, \bm{x})$ for two specific queries $\varphi$. Our framework replaces constant mixing weights with probability-dependent coefficients derived explicitly from posterior probabilities, allowing for mathematically grounded compositions.
  • Figure 2: Failure cases of constant baseline. Constant baseline (top) and LoGDiff (bottom). The constant baseline struggles with disjunctions, mixing attributes (left), collapsing to an intersection (AND behavior) (middle), or failing for complex queries (right). These failures worsen with higher guidance scales ($w=2.5$).
  • Figure 3: Conformity-diversity trade-off on CMNIST. Conformity Score $\uparrow$ vs. Joint Shannon Entropy across varying guidance scales $w \in [1.0, 2.5]$. Vertical dotted lines indicate the theoretical optimal entropy for each task (note that for AND, the optimal entropy is low as the solution space is highly constrained). As guidance strength increases, the constant baseline (blue) suffers from low entropy, indicating mode collapse, whereas our method (orange) maintains high sample diversity while achieving high conformity scores.
  • Figure 4: Impact of adaptive repulsive guidance. (Left) Moderate guidance weights improve FID scores compared to no repulsive guidance ($w_\text{not} = 0.0$). (Right) Repulsive guidance removes artifacts while clearly defined samples remain unchanged.
  • Figure 5: Visualization of ligands in the GRM5-RRM1 dual-target binding site. (Top) Reference ligand in the aligned binding pocket. (Middle) Representative ligand generated under the guidance term ($A \land B$). (Bottom) Representative ligand generated under the selective constraint ($A \land \lnot B$), designed to engage GRM5 while avoiding RRM1 binding. Protein surfaces are shown for GRM5 (target A) and RRM1 (target B), with the ligand displayed as sticks.
  • ...and 13 more figures

Theorems & Definitions (8)

  • Proposition 1
  • Proposition 2
  • Proposition 3
  • proof
  • Proposition 4
  • proof
  • Proposition 5
  • proof