Decomposed Direct Preference Optimization for Structure-Based Drug Design

Xiwei Cheng; Xiangxin Zhou; Yuwei Yang; Yu Bao; Quanquan Gu

TL;DR

This work tackles data scarcity and objective mismatch in structure-based drug design (SBDD) by introducing DecompDPO, a multi-granularity preference-alignment framework for diffusion models. It decomposes each molecule into substructures (arms and a scaffold), applies GlobalDPO to non-decomposable objectives (e.g., QED, SA) and LocalDPO to decomposable ones (e.g., Vina Minimize), and unifies both under a single DecompDPO loss. Physics-informed constraints enter through a penalized reward, $r^{*}(\mathcal{M}, \mathcal{P}) = r(\mathcal{M}, \mathcal{P}) - \lambda\, r_{\text{constraint}}(\mathcal{M}, \mathcal{P})$ with $r_{\text{constraint}} = E_{\text{bond}} + E_{\text{angle}}$, and a linear beta schedule, $\beta_{t} = \frac{t}{T} \beta_{T}$, balances learning across diffusion steps. The method is evaluated on CrossDocked2020 for both structure-based molecule generation and subpocket-targeted optimization, reaching up to 95.2% Med. High Affinity and a 36.2% success rate for generation, and 100% Med. High Affinity and a 52.1% success rate for optimization, while the conformational penalties keep generated conformations physically realistic. Code and models are provided to enable replication and adaptation in drug-design pipelines. The work advances practical SBDD by enabling flexible, multi-objective optimization that respects chemistry and physics, potentially accelerating lead discovery and optimization.
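
The two formulas in the summary above can be made concrete with a small numerical sketch. It is an illustration only, not the authors' implementation: the function names, the default value of `lam`, and the convention that a higher reward is better are assumptions made for this example.

```python
def constrained_reward(reward: float, e_bond: float, e_angle: float,
                       lam: float = 1.0) -> float:
    """Penalized reward r* = r - lambda * (E_bond + E_angle).

    Assumption: `reward` is oriented so that higher is better, and the
    energies are non-negative penalties. `lam` is a hypothetical weight.
    """
    r_constraint = e_bond + e_angle
    return reward - lam * r_constraint


def linear_beta(t: int, T: int, beta_T: float) -> float:
    """Linear beta schedule beta_t = (t / T) * beta_T for 0 <= t <= T."""
    if T <= 0 or not 0 <= t <= T:
        raise ValueError("expected T > 0 and 0 <= t <= T")
    return (t / T) * beta_T


# A molecule with raw reward 8.0 and strained geometry loses part of its reward.
penalized = constrained_reward(8.0, e_bond=1.0, e_angle=0.5, lam=2.0)

# Beta grows linearly with the diffusion step and reaches beta_T at t = T.
betas = [linear_beta(t, T=1000, beta_T=0.2) for t in (0, 500, 1000)]
```

Under this schedule the preference strength is zero at `t = 0` and grows to `beta_T` at `t = T`, so each diffusion step is weighted differently in the preference loss.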

Abstract

Diffusion models have achieved promising results for Structure-Based Drug Design (SBDD). Nevertheless, high-quality protein subpocket and ligand data are relatively scarce, which hinders the models' generation capabilities. Recently, Direct Preference Optimization (DPO) has emerged as a pivotal tool for aligning generative models with human preferences. In this paper, we propose DecompDPO, a structure-based optimization method that aligns diffusion models with pharmaceutical needs using multi-granularity preference pairs. DecompDPO introduces decomposition into the optimization objectives and obtains preference pairs at the molecule or decomposed substructure level based on each objective's decomposability. Additionally, DecompDPO introduces a physics-informed energy term to ensure reasonable molecular conformations in the optimization results. Notably, DecompDPO can be effectively used for two main purposes: (1) fine-tuning pretrained diffusion models for molecule generation across various protein families, and (2) molecular optimization given a specific protein subpocket after generation. Extensive experiments on the CrossDocked2020 benchmark show that DecompDPO significantly improves model performance, achieving up to 95.2% Med. High Affinity and a 36.2% success rate for molecule generation, and 100% Med. High Affinity and a 52.1% success rate for molecular optimization. Code is available at https://github.com/laviaf/DecompDPO.
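
The idea of choosing preference pairs at the molecule level or the substructure level can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure from the paper: the helper names, the dictionary layout, the substructure keys (`arm_1`, `arm_2`, `scaffold`), the example scores, and the assumption that substructures of two molecules can be matched by name are all hypothetical.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]  # (winner label, loser label)


def global_preference(score_a: float, score_b: float,
                      higher_is_better: bool = True) -> Pair:
    """Molecule-level preference for a non-decomposable objective (e.g. QED).

    Ties are resolved in favour of "b" purely for simplicity.
    """
    a_wins = score_a > score_b if higher_is_better else score_a < score_b
    return ("a", "b") if a_wins else ("b", "a")


def local_preferences(parts_a: Dict[str, float], parts_b: Dict[str, float],
                      higher_is_better: bool = False) -> Dict[str, Pair]:
    """One preference per matched substructure for a decomposable objective.

    Assumption: both molecules expose the same substructure keys.
    """
    return {
        name: global_preference(parts_a[name], parts_b[name], higher_is_better)
        for name in parts_a
    }


# Hypothetical per-substructure docking scores (lower is better).
vina_a = {"arm_1": -3.0, "arm_2": -1.5, "scaffold": -2.0}
vina_b = {"arm_1": -2.0, "arm_2": -2.5, "scaffold": -1.0}

# Molecule-level view: compare the summed scores.
whole = global_preference(sum(vina_a.values()), sum(vina_b.values()),
                          higher_is_better=False)

# Substructure-level view: each part gets its own winner.
parts = local_preferences(vina_a, vina_b)
```

In this toy case molecule `a` wins overall, yet `arm_2` of molecule `b` is preferred locally, which is the kind of finer-grained signal a molecule-level pair alone would hide.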

Paper Structure (46 sections, 13 equations, 17 figures, 11 tables)

Introduction
Related Work
Structure-based Drug Design
Structure-based Molecule Optimization
Method
Preliminaries
Direct Preference Optimization in Decomposed Space
Decomposable Optimization Objectives
GlobalDPO
LocalDPO
DecompDPO
Physically Constrained Optimization
Linear Beta Schedule
Experiments
Experimental Setup
...and 31 more sections

Figures (17)

Figure 1: Overview of DecompDPO. (a) Sample molecules and select molecule pairs for each target protein using a pre-trained diffusion model; (b) Construct physically constrained preferences for each optimization objective based on its decomposability; (c) Compute the DecompDPO loss and align the diffusion model with the multi-objective preference.
Figure 2: Illustration of decomposable objectives. A molecule is decomposed into two arms (purple and pink) and a scaffold (yellow), where the sum of the substructures' Vina Minimize Scores equals the molecule's (left). The Pearson correlation between each molecule's Vina Minimize Score and the sum of its substructures' scores in the training dataset (right).
Figure 3: Visualization of reference binding ligands and the molecules generated by DecompDiff* and DecompDPO on proteins 4D7O (top) and 1UMD (bottom).
Figure 4: Noise schedule of atom and bond types.
Figure 5: Comparison of pairwise distance distributions between all atoms in generated molecules and reference molecules from the test set. The Jensen-Shannon divergence (JSD) between the two distributions is reported.
...and 12 more figures
