Regressor-free Molecule Generation to Support Drug Response Prediction

Kun Li; Xiuwen Gong; Shirui Pan; Jia Wu; Bo Du; Wenbin Hu

TL;DR

This work addresses drug response prediction (DRP) by introducing regressor-free guidance for diffusion-based molecule generation. It replaces classifier-based conditioning with a regression controller that maps target IC50 values to text through a common-sense numerical knowledge graph (CN-KG), which keeps the numerical representations ordered. A dual-branch diffusion model, DBControl, provides robust score estimation when task-specific data are limited, so sampling stays within a narrow space centered on the target. Experiments on real DRP data show better generation quality (lower Fréchet ChemNet Distance, FCD) and closer alignment with target IC50 values than strong baselines, suggesting practical gains for de novo drug design. The framework supports more efficient screening by producing molecules that are more likely to reach the desired DRP outcome, but it requires substantial computational resources and currently lacks wet-lab validation.

Abstract

Drug response prediction (DRP) is a crucial phase in drug discovery, and the most important metric for its evaluation is the IC50 score. DRP results depend heavily on the quality of the generated molecules. Existing molecule generation methods typically employ classifier-based guidance, which enables sampling within an IC50 classification range. However, these methods do not guarantee that the sampling space is effective, so they generate many ineffective molecules. Through experimental and theoretical study, we hypothesize that conditional generation based on the target IC50 score yields a more effective sampling space. We therefore introduce regressor-free guidance for molecule generation, which ensures sampling within a more effective space and supports DRP. Regressor-free guidance combines a diffusion model's score estimation with a regression controller model's gradient based on numerical labels. To map regression labels between drugs and cell lines effectively, we design a common-sense numerical knowledge graph that constrains the order of text representations. Experimental results on a real-world dataset for the DRP task demonstrate our method's effectiveness in drug discovery. The code is available at: https://anonymous.4open.science/r/RMCD-DBD1.
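
The guidance idea in the abstract can be illustrated with a minimal sketch. It assumes the usual classifier-free-style interpolation between an unconditional and a conditional noise prediction; the paper's exact formula is not given in this summary, so the function name `guided_noise`, the variable names, and the toy values below are hypothetical. The convention is chosen so that `w = 0.0` reduces to the unguided prediction, which matches the caption of Figure 4.

```python
import numpy as np


def guided_noise(eps_uncond, eps_cond, w):
    """Blend unconditional and conditional noise predictions.

    Assumed form (not taken from the paper):
        eps = eps_uncond + w * (eps_cond - eps_uncond)
    w = 0.0 gives the unguided prediction, w = 1.0 the purely
    conditional one, and w > 1.0 extrapolates toward the condition.
    """
    eps_uncond = np.asarray(eps_uncond, dtype=float)
    eps_cond = np.asarray(eps_cond, dtype=float)
    return eps_uncond + w * (eps_cond - eps_uncond)


# Toy noise predictions standing in for the two model outputs.
eps_u = np.array([0.2, -0.1, 0.4])   # prediction without a condition
eps_c = np.array([0.5, 0.1, 0.0])    # prediction given a target IC50 encoding

for w in (0.0, 1.0, 2.0):
    print(w, guided_noise(eps_u, eps_c, w))
```

In a real sampler the two inputs would come from the noise prediction model evaluated with and without the encoded target condition at each denoising step; here they are fixed arrays so that the effect of the guidance strength is easy to inspect.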

Paper Structure (28 sections, 6 theorems, 16 equations, 7 figures, 9 tables, 2 algorithms)

Introduction
Methods
Notations
Method overview
The regression controller model
The dual-branch controlled noise prediction model
Theoretical discussion
Experiments
Experimental setup
Overall experiment
Varying the regressor-free guidance strength experiment
Ablation studies
Conclusion
Model details
Framework of regressor-free guidance molecule generation
...and 13 more sections

Key Result

Proposition 1

For any $\bm{C}_{\mathrm{aim}} \in (0, 1)$, $\left\| S_{cls} \right\| \ge \left\| S_{reg} \right\|$ holds.

Figures (7)

Figure 1: Sampling space comparison for target conditions in classifier- vs. regressor-based guidance molecule generation.
Figure 2: (a) The training process of the regression controller model, which serves as a conditional encoder guiding diffusion. (b) The regressor-free guidance diffusion process, which uses the text encoder of the trained regression controller model to encode the target conditions. The DBControl model is a score-based noise prediction model trained on a mixture of the conditional GDSCv2 and unconditional QM9 datasets.
Figure 3: UMAP visualization of molecule generation results with our method compared to four mainstream methods using the target pair (NCI-H187, $\mathrm{IC}_{50}$=0.35).
Figure 4: Visualization of regressor-free guidance strength trends. The x-axis shows the guidance strength $w$, where $w = 0.0$ corresponds to the non-guided model, and the y-axis shows the corresponding metric values.
Figure 5: Schematic of the DBControl model.
...and 2 more figures

Theorems & Definitions (6)

Proposition 1: Main proposition
Proposition 2: Uniqueness of $\bm{C}_{\mathrm{aim} }$ Representation
Proposition 3: Equal interval representation of $\Theta$
