MacroGuide: Topological Guidance for Macrocycle Generation

Alicja Maksymiuk, Alexandre Duplessis, Michael Bronstein, Alexander Tong, Fernanda Duarte, İsmail İlkan Ceylan

TL;DR

Empirically, applying MacroGuide to pretrained diffusion models increases macrocycle generation rates from 1% to 99%, while matching or exceeding state-of-the-art performance on key quality metrics such as chemical validity, diversity, and PoseBusters checks.

Abstract

Macrocycles are ring-shaped molecules that offer a promising alternative to small-molecule drugs due to their enhanced selectivity and binding affinity against difficult targets. Despite their chemical value, they remain underexplored in generative modeling, likely owing to their scarcity in public datasets and the challenges of enforcing topological constraints in standard deep generative models. We introduce MacroGuide: Topological Guidance for Macrocycle Generation, a diffusion guidance mechanism that uses Persistent Homology to steer the sampling of pretrained molecular generative models toward the generation of macrocycles, in both unconditional and conditional (protein pocket) settings. At each denoising step, MacroGuide constructs a Vietoris-Rips complex from atomic positions and promotes ring formation by optimizing persistent homology features. Empirically, applying MacroGuide to pretrained diffusion models increases macrocycle generation rates from 1% to 99%, while matching or exceeding state-of-the-art performance on key quality metrics such as chemical validity, diversity, and PoseBusters checks.
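
As a rough illustration of the topological signal described above, the sketch below builds a Vietoris-Rips filtration on a toy ring of points and reads off its $H_1$ bars with a standard boundary-matrix reduction over $\mathbb{Z}/2$. It is not the authors' implementation: the function names `rips_h1_bars` and `regular_polygon`, the planar hexagon, and the unit bond length are all assumptions made for this example, and the guidance step that optimizes such features inside a diffusion sampler is not shown.

```python
import itertools
import math


def rips_h1_bars(points):
    """Return (birth, death) pairs of H_1 in the Vietoris-Rips filtration.

    Hypothetical helper for illustration; only simplices up to dimension 2
    are built, which is enough to pair every H_1 class that dies.
    """
    n = len(points)

    def dist(i, j):
        return math.dist(points[i], points[j])

    # Vertices enter at 0; an edge or triangle enters at its longest edge.
    simplices = [((i,), 0.0) for i in range(n)]
    for size in (2, 3):
        for s in itertools.combinations(range(n), size):
            value = max(dist(i, j) for i, j in itertools.combinations(s, 2))
            simplices.append((s, value))

    # Sort by filtration value, then dimension, so faces precede cofaces.
    simplices.sort(key=lambda sv: (sv[1], len(sv[0]), sv[0]))
    index = {s: k for k, (s, _) in enumerate(simplices)}

    pivot_owner = {}  # lowest boundary index -> reduced column that owns it
    bars = []
    for s, value in simplices:
        if len(s) == 1:
            continue  # vertices have empty boundary
        column = {index[f] for f in itertools.combinations(s, len(s) - 1)}
        # Standard reduction: cancel the lowest entry while it is taken.
        while column and max(column) in pivot_owner:
            column ^= pivot_owner[max(column)]
        if column:
            low = max(column)
            pivot_owner[low] = column
            face, birth = simplices[low]
            if len(face) == 2:  # an edge paired with a triangle: H_1 bar
                bars.append((birth, value))
    return bars


def regular_polygon(n, bond_length=1.0):
    """Planar regular n-gon whose side length equals bond_length."""
    radius = bond_length / (2.0 * math.sin(math.pi / n))
    return [
        (radius * math.cos(2 * math.pi * k / n),
         radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


# Toy "ring" of six atoms with unit bond length (assumed values).
ring = regular_polygon(6)
birth, death = max(rips_h1_bars(ring), key=lambda bar: bar[1] - bar[0])
print(f"dominant H_1 bar: birth={birth:.3f}, death={death:.3f}")
```

For this hexagon the dominant bar is born at about 1.0, the bond length, when the six bonds close the ring, and dies at about 1.732 ($\sqrt{3}$), when the second-neighbour chords fill the ring with triangles. A long-lived $H_1$ bar of this kind is the sort of feature a ring-promoting objective could reward.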

Paper Structure

This paper contains 69 sections, 2 theorems, 26 equations, 17 figures, 16 tables, 4 algorithms.

Key Result

Theorem 3.1

Consider a regular cyclic conformation of $n$ atoms ($n$ even) with bond length $\ell$ and bond angle $\theta$. The death time $d$ of the dominant $H_1$ component in the Vietoris-Rips filtration is given by: In the limit of large $n$,

Figures (17)

  • Figure 1: Method overview. MacroGuide drives the denoising trajectory towards macrocyclic structures using updates from a topological objective.
  • Figure 2: Examples of generated macrocycles. Top: Unconditional generation. Bottom: Protein conditioning. Bottom right: This molecule was specifically optimized to be bicyclic (two rings). Molecule fragments appear transparent when hidden by parts of the protein pocket.
  • Figure 3: Topological guidance for diffusion-based macrocycle generation.
  • Figure 4: Performance of MolDiff with increasing molecular size. Adding an $H_0$ guidance term improves performance for large molecule sizes. Results obtained from $200$ samples each.
  • Figure 5: Median maximum cycle size as a function of the target death time. The empirical results are compared to the theoretical formulas for $\ell=1.0$ and $\ell=1.5$, the minimum and maximum typical bond lengths, respectively. Results are computed for $200$ samples of 30 heavy atoms, with each target $d^\star$ imposed in relaxed form as the interval $[d^\star-0.05, d^\star+0.05]$ and targets sampled at 0.1 intervals.
  • ...and 12 more figures

Theorems & Definitions (4)

  • Theorem 3.1: Vietoris-Rips death time of a tetrahedral cycle
  • Lemma 3.1: Vietoris-Rips death time of a regular n-gon
  • Proof of Lemma 3.1
  • Proof of Theorem 3.1