MolChord: Structure-Sequence Alignment for Protein-Guided Drug Design

Wei Zhang; Zekun Guo; Yingce Xia; Peiran Jin; Shufang Xie; Tao Qin; Xiang-Yang Li

TL;DR

MolChord integrates two key techniques: it aligns protein and molecule structures with their textual descriptions and sequential representations, and it guides molecules toward desired properties by integrating preference data and refining the alignment process using Direct Preference Optimization (DPO).

Abstract

Structure-based drug design (SBDD), which maps target proteins to candidate molecular ligands, is a fundamental task in drug discovery. Effectively aligning protein structural representations with molecular representations, and ensuring alignment between generated drugs and their pharmacological properties, remain critical challenges. To address these challenges, we propose MolChord, which integrates two key techniques: (1) to align protein and molecule structures with their textual descriptions and sequential representations (e.g., FASTA for proteins and SMILES for molecules), we leverage NatureLM, an autoregressive model unifying text, small molecules, and proteins, as the molecule generator, alongside a diffusion-based structure encoder; and (2) to guide molecules toward desired properties, we curate a property-aware dataset by integrating preference data and refine the alignment process using Direct Preference Optimization (DPO). Experimental results on CrossDocked2020 demonstrate that our approach achieves state-of-the-art performance on key evaluation metrics, highlighting its potential as a practical tool for SBDD.
