Molecule Generation for Target Protein Binding with Hierarchical Consistency Diffusion Model
Guanlue Li, Chenran Jiang, Ziqi Gao, Yu Liu, Chenyang Liu, Jiean Chen, Yong Huang, Jia Li
TL;DR
AMDiff addresses the challenge of de novo ligand design conditioned on target proteins by introducing a hierarchical diffusion framework that jointly models atom-level and motif-level ligand representations. It leverages classifier-free guidance and binding-site conditioning, augmented by topological features, to generate valid, diverse, and high-affinity molecules. Across CrossDocked benchmarks and the kinase targets ALK and CDK4, AMDiff demonstrates superior validity, diversity, novelty, and docking performance, while remaining robust to pocket size changes and protein mutations. By bridging atom-level detail with motif-level priors and enabling cross-view information exchange, AMDiff advances structure-based drug design and has the potential to speed up target-aware molecular generation in drug discovery.
Abstract
Effective generation of molecular structures, or new chemical entities, that bind to target proteins is crucial for lead identification and optimization in drug discovery. Despite advancements in atom- and motif-wise deep learning models for 3D molecular generation, current methods often struggle with validity and reliability. To address these issues, we develop the Atom-Motif Consistency Diffusion Model (AMDiff), which uses a joint-training paradigm for multi-view learning. This model features a hierarchical diffusion architecture that integrates both atom- and motif-level views of molecules, allowing for comprehensive exploration of complementary information. By leveraging classifier-free guidance and incorporating binding site features as conditional inputs, AMDiff ensures robust molecule generation across diverse targets. Compared to existing approaches, AMDiff exhibits superior validity and novelty in generating molecules tailored to fit various protein pockets. Case studies targeting protein kinases, including Anaplastic Lymphoma Kinase (ALK) and Cyclin-Dependent Kinase 4 (CDK4), demonstrate the model's capability in structure-based de novo drug design. Overall, AMDiff bridges the gap between atom-view and motif-view drug discovery and speeds up the process of target-aware molecular generation.
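
The abstract names classifier-free guidance without stating its form. As a rough illustration only, the sketch below shows the generic way such guidance is commonly applied at sampling time: a noise prediction made with the conditioning input (here, the binding site) is blended with one made without it. The function name `cfg_combine`, the parameter `guidance_scale`, and the flat-list representation of the predictions are assumptions made for this example; the parameterization AMDiff actually uses is not given in this section.

```python
from typing import List, Sequence


def cfg_combine(
    eps_cond: Sequence[float],
    eps_uncond: Sequence[float],
    guidance_scale: float,
) -> List[float]:
    """Blend conditional and unconditional noise predictions.

    Generic classifier-free guidance form (an assumption here, not
    necessarily the paper's exact variant):
        eps = eps_uncond + guidance_scale * (eps_cond - eps_uncond)
    A scale of 0 ignores the condition, 1 returns the conditional
    prediction, and values above 1 extrapolate toward the condition.
    """
    if len(eps_cond) != len(eps_uncond):
        raise ValueError("predictions must have the same length")
    return [u + guidance_scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]


# Hypothetical per-coordinate noise predictions for a single atom.
with_pocket = [1.0, 2.0, -1.0]
without_pocket = [0.5, 1.0, 0.0]
guided = cfg_combine(with_pocket, without_pocket, guidance_scale=2.0)
```

In this form a single network can serve both roles if the conditioning input is randomly dropped during training, which is the usual reason the technique needs no separate classifier.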
