Decomposed Direct Preference Optimization for Structure-Based Drug Design
Xiwei Cheng, Xiangxin Zhou, Yuwei Yang, Yu Bao, Quanquan Gu
TL;DR
This work tackles the data-scarcity and objective-mismatch challenges in structure-based drug design by introducing DecompDPO, a multi-granularity preference-alignment framework for diffusion models. It decomposes optimization objectives at the substructure level (arms and scaffold), applying GlobalDPO to non-decomposable objectives (e.g., QED, SA) and LocalDPO to decomposable ones (e.g., Vina Minimize), and unifies both under a single DecompDPO loss. It also incorporates physics-informed constraints via $r^{*}(\mathcal{M}, \mathcal{P}) = r(\mathcal{M}, \mathcal{P}) - \lambda \, r_{\text{constraint}}(\mathcal{M}, \mathcal{P})$ with $r_{\text{constraint}} = E_{\text{bond}} + E_{\text{angle}}$, and uses a linear beta schedule $\beta_{t} = \frac{t}{T} \beta_{T}$ to balance learning. The method is evaluated on CrossDocked2020 for both structure-based molecule generation and subpocket-targeted optimization, where it improves binding-affinity metrics and success rates while preserving physically realistic conformations, aided by decomposed preferences and conformational penalties. Code and models are provided to enable replication and adaptation in drug-design pipelines. The work advances practical SBDD by enabling flexible, multi-objective optimization that respects chemistry and physics, potentially accelerating lead discovery and optimization.
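The two formulas in the TL;DR can be written out as a minimal Python sketch. The function names, argument names, and the numeric values in the usage lines are illustrative assumptions, not taken from the released code; only the two formulas themselves come from the text above.

```python
def linear_beta(t: int, T: int, beta_T: float) -> float:
    """Linear schedule from the TL;DR: beta_t = (t / T) * beta_T."""
    if T <= 0:
        raise ValueError("T must be positive")
    if not 0 <= t <= T:
        raise ValueError("t must lie in [0, T]")
    return (t / T) * beta_T


def penalized_reward(r: float, e_bond: float, e_angle: float, lam: float) -> float:
    """Constrained reward: r* = r - lambda * (E_bond + E_angle).

    How E_bond and E_angle are computed is not specified here;
    they are passed in as plain numbers (assumption).
    """
    r_constraint = e_bond + e_angle
    return r - lam * r_constraint


# Hypothetical values, chosen only to exercise the functions.
beta_mid = linear_beta(t=500, T=1000, beta_T=0.2)
r_star = penalized_reward(r=1.0, e_bond=0.5, e_angle=0.25, lam=2.0)
```

With this schedule, $\beta_t$ grows linearly from 0 at $t = 0$ to $\beta_T$ at $t = T$, and a larger $\lambda$ penalizes strained bond lengths and angles more strongly relative to the original reward.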
Abstract
Diffusion models have achieved promising results for Structure-Based Drug Design (SBDD). Nevertheless, high-quality protein subpocket and ligand data are relatively scarce, which hinders the models' generation capabilities. Recently, Direct Preference Optimization (DPO) has emerged as a pivotal tool for aligning generative models with human preferences. In this paper, we propose DecompDPO, a structure-based optimization method that aligns diffusion models with pharmaceutical needs using multi-granularity preference pairs. DecompDPO introduces decomposition into the optimization objectives and obtains preference pairs at the molecule level or the decomposed-substructure level, depending on each objective's decomposability. Additionally, DecompDPO introduces a physics-informed energy term to ensure reasonable molecular conformations in the optimization results. Notably, DecompDPO can be used effectively for two main purposes: (1) fine-tuning pretrained diffusion models for molecule generation across various protein families, and (2) molecular optimization given a specific protein subpocket after generation. Extensive experiments on the CrossDocked2020 benchmark show that DecompDPO significantly improves model performance, achieving up to 95.2% Med. High Affinity and a 36.2% success rate for molecule generation, and 100% Med. High Affinity and a 52.1% success rate for molecular optimization. Code is available at https://github.com/laviaf/DecompDPO.
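The idea of choosing the preference-pair granularity by decomposability can be illustrated with a small Python sketch. Everything here is a hypothetical simplification: the `Candidate` class, the substructure keys, and the assumption that all scores are oriented so that higher is preferred are not taken from the paper or its code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Candidate:
    name: str
    score: float  # molecule-level score, higher is preferred (assumption)
    sub_scores: Dict[str, float] = field(default_factory=dict)  # per-substructure scores


def preference_pairs(a: Candidate, b: Candidate,
                     decomposable: bool) -> List[Tuple[str, str, str]]:
    """Return (level, winner, loser) tuples for one pair of candidates."""
    pairs: List[Tuple[str, str, str]] = []
    if not decomposable:
        # Non-decomposable objective: a single molecule-level preference.
        if a.score != b.score:
            w, l = (a, b) if a.score > b.score else (b, a)
            pairs.append(("molecule", w.name, l.name))
        return pairs
    # Decomposable objective: one preference per shared substructure.
    for part in sorted(a.sub_scores.keys() & b.sub_scores.keys()):
        sa, sb = a.sub_scores[part], b.sub_scores[part]
        if sa == sb:
            continue  # ties carry no preference signal
        w, l = (a, b) if sa > sb else (b, a)
        pairs.append((part, w.name, l.name))
    return pairs


mol_a = Candidate("A", 0.7, {"arm_1": 1.0, "scaffold": 0.2})
mol_b = Candidate("B", 0.5, {"arm_1": 0.4, "scaffold": 0.9})
global_pairs = preference_pairs(mol_a, mol_b, decomposable=False)
local_pairs = preference_pairs(mol_a, mol_b, decomposable=True)
```

In this toy case the molecule-level comparison prefers A overall, while the substructure-level comparison prefers A for `arm_1` and B for `scaffold`, which is the kind of finer-grained signal that substructure-level pairs can expose.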
