ChatGPT-powered Conversational Drug Editing Using Retrieval and Domain Feedback

Shengchao Liu; Jiongxiao Wang; Yijin Yang; Chengpeng Wang; Ling Liu; Hongyu Guo; Chaowei Xiao

TL;DR

This work addresses the challenge of applying conversational LLMs to drug editing by introducing ChatDrug, a modular framework that combines domain-specific prompt design (PDDS), retrieval-and-domain-feedback (ReDF), and interactive conversation. The approach enables open-vocabulary, compositional drug editing across small molecules, peptides, and proteins, yielding top performance on most tasks and providing interpretable domain insights. Empirical results across 39 tasks show that ChatDrug achieves state-of-the-art or competitive performance, supported by qualitative case studies of substructure and motif edits. The work demonstrates the potential to augment drug discovery pipelines with iterative, knowledge-grounded human-in-the-loop guidance from LLMs.

Abstract

Recent advancements in conversational large language models (LLMs), such as ChatGPT, have demonstrated remarkable promise in various domains, including drug discovery. However, existing work mainly focuses on investigating the capabilities of conversational LLMs on chemical reactions and retrosynthesis, while drug editing, a critical task in the drug discovery pipeline, remains largely unexplored. To bridge this gap, we propose ChatDrug, a framework to facilitate the systematic investigation of drug editing using LLMs. ChatDrug jointly leverages a prompt module, a retrieval and domain feedback (ReDF) module, and a conversation module to streamline effective drug editing. We empirically show that ChatDrug reaches the best performance on 33 out of 39 drug editing tasks, encompassing small molecules, peptides, and proteins. We further demonstrate, through 10 case studies, that ChatDrug can successfully identify the key substructures (e.g., molecule functional groups, peptide motifs, and protein structures) for manipulation, generating diverse and valid suggestions for drug editing. Promisingly, we also show that ChatDrug can offer insightful explanations from a domain-specific perspective, enhancing interpretability and enabling informed decision-making. This research sheds light on the potential of ChatGPT and conversational LLMs for drug editing. It paves the way for a more efficient and collaborative drug discovery pipeline, contributing to the advancement of pharmaceutical research and development.
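The interplay of the three modules described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the prompt wording, the string-based similarity, and the choice to retrieve the database entry most similar to the rejected candidate are all assumptions made for illustration, and the "LLM" is a scripted stand-in rather than a real chat model.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Optional


def chatdrug_loop(
    x_in: str,
    task_prompt: str,
    llm: Callable[[List[str]], str],
    satisfies: Callable[[str, str], bool],
    similarity: Callable[[str, str], float],
    database: List[str],
    max_rounds: int = 3,
) -> Optional[str]:
    """Toy edit loop (hypothetical): prompt, check, retrieve, re-prompt."""
    # Prompt step: a single task-specific instruction (wording is assumed).
    messages = [f"{task_prompt} Input: {x_in}"]
    for _ in range(max_rounds):
        candidate = llm(messages)
        # Domain feedback: stop as soon as the evaluator accepts the edit.
        if satisfies(x_in, candidate):
            return candidate
        # Retrieval: among database entries the evaluator accepts, pick the
        # one most similar to the rejected candidate (an assumed criterion).
        hits = [x for x in database if satisfies(x_in, x)]
        if not hits:
            messages.append(f"{candidate} is not correct. Please try again.")
            continue
        x_r = max(hits, key=lambda x: similarity(candidate, x))
        # Conversation step: feed the retrieved example back to the model.
        messages.append(
            f"{candidate} is not correct. A correct example is {x_r}. "
            "Please try again."
        )
    return None


def scripted_llm(messages: List[str]) -> str:
    """Stand-in for a chat model: fails first, then copies the example."""
    marker = "A correct example is "
    last = messages[-1]
    if marker in last:
        return last.split(marker)[1].split(".")[0]
    return "CCN"


def more_oxygen(x_in: str, x_out: str) -> bool:
    """Toy evaluator: accept outputs with more 'O' characters than the input."""
    return x_out.count("O") > x_in.count("O")


def string_similarity(a: str, b: str) -> float:
    """Toy similarity on raw strings; a real system would use domain metrics."""
    return SequenceMatcher(None, a, b).ratio()


result = chatdrug_loop(
    x_in="CCO",
    task_prompt="Edit the input so that it contains more oxygen atoms.",
    llm=scripted_llm,
    satisfies=more_oxygen,
    similarity=string_similarity,
    database=["CCOO", "OCCOCO", "CCC"],
)
print(result)
```

In this toy run the first reply is rejected by the evaluator, the loop retrieves an accepted database entry, and the second reply passes. A real setup would replace the string heuristics with domain evaluators and similarity measures appropriate to small molecules, peptides, or proteins.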

Paper Structure (42 sections, 3 equations, 16 figures, 23 tables)


Introduction
Preliminaries
Method: ChatDrug Framework
PDDS Module
ReDF Module
Conversation Module
Experiment
Text-guided Molecule Property Editing
Text-guided Immunogenic Binding Peptide Editing
Text-guided Protein Secondary Structure Editing
Ablation Study on Comparison with Zero-shot and In-context Learning
Ablation Study on the Number of Conversation Rounds in ChatDrug
Ablation Study on the Thresholds in Feedback Condition Function
Ablation Study on the Similarity Between Input and Output Drugs
Why ChatDrug Works? Knowledge Extraction
...and 27 more sections

Figures (16)

Figure 1: The pipeline for ChatDrug with 3 modules. PDDS generates drug editing prompts. ReDF updates the prompts using retrieved information and domain feedback. Finally, ChatDrug adopts the conversational module for interactive refinement. Further, we demonstrate 3 drug types: small molecules, peptides, and proteins.
Figure 2: Visualization of two peptide editing tasks using position weight matrices (PWMs). The x-axis corresponds to the position index, while the y-axis corresponds to the distribution of each amino acid (shown as letters) at each position.
Figure 3: Visualization of two protein editing tasks. For the protein secondary structures, the $\alpha$-helix is marked in red, and $\beta$-sheet is marked in yellow. The edited regions before and after ChatDrug are marked in blue circles.
Figure 4: Similarity distribution between input molecules ${\bm{x}}_{\text{in}}$ and retrieval ${\bm{x}}_R$, intermediate ${\bm{x}}_1$, and output molecules ${\bm{x}}_{\text{out}}$. We pick three small-molecule tasks for visualization; more results are in \ref{sec:visual_analysis}.
Figure 5: Knowledge extraction of ChatDrug.
...and 11 more figures
