Regressor-free Molecule Generation to Support Drug Response Prediction
Kun Li, Xiuwen Gong, Shirui Pan, Jia Wu, Bo Du, Wenbin Hu
TL;DR
This work addresses drug response prediction (DRP) by introducing regressor-free guidance for diffusion-based molecule generation. It replaces classifier-based conditioning with a regression controller that maps target IC50 values to text via a common-sense numerical knowledge graph (CN-KG), which keeps the numerical representations ordered. A dual-branch diffusion model, DBControl, provides robust score estimation under limited task-specific data, enabling sampling within a narrow, target-centered space. Experiments on real DRP data show improved generation quality (lower FCD) and better alignment with target IC50 values than strong baselines, suggesting practical gains in de novo drug design. The framework supports more efficient screening by producing molecules with a higher likelihood of the desired DRP outcome, though it requires substantial computational resources and currently lacks wet-lab validation.
Abstract
Drug response prediction (DRP) is a crucial phase in drug discovery, and the most important metric for its evaluation is the IC50 score. DRP results depend heavily on the quality of the generated molecules. Existing molecule generation methods typically employ classifier-based guidance, which enables sampling within an IC50 classification range. However, these methods cannot guarantee that the sampling space is effective, and they generate many ineffective molecules. Through experimental and theoretical study, we hypothesize that conditional generation based on the target IC50 score yields a more effective sampling space. We therefore introduce regressor-free guidance for molecule generation, which ensures sampling within a more effective space and supports DRP. Regressor-free guidance combines a diffusion model's score estimation with a regression controller model's gradient based on numerical labels. To effectively map regression labels between drugs and cell lines, we design a common-sense numerical knowledge graph that constrains the order of text representations. Experimental results on a real-world dataset for the DRP task demonstrate our method's effectiveness in drug discovery. The code is available at: https://anonymous.4open.science/r/RMCD-DBD1.
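The idea of steering a diffusion model's score with a condition can be illustrated with a minimal sketch. The abstract does not give the exact combination rule, so the code below assumes the familiar classifier-free-style blend of a conditional and an unconditional score estimate. The function name `guided_score`, the parameter `guidance_scale`, and the toy values are hypothetical and are not taken from the released code.

```python
from typing import List, Sequence


def guided_score(
    cond_score: Sequence[float],
    uncond_score: Sequence[float],
    guidance_scale: float,
) -> List[float]:
    """Blend conditional and unconditional score estimates element-wise.

    Assumed form (classifier-free-style, not taken from the paper):
        s = s_uncond + guidance_scale * (s_cond - s_uncond)
    """
    if len(cond_score) != len(uncond_score):
        raise ValueError("score vectors must have the same length")
    return [
        u + guidance_scale * (c - u)
        for c, u in zip(cond_score, uncond_score)
    ]


# Toy 2-dimensional scores; real scores would come from a trained denoiser.
cond = [1.0, 2.0]    # score given the target IC50 condition (hypothetical values)
uncond = [0.0, 0.0]  # score with the condition dropped
print(guided_score(cond, uncond, 2.0))
```

Under this assumed form, a scale of 0 recovers the unconditional score, a scale of 1 recovers the conditional score, and larger values push samples further toward the condition, here a target IC50 value.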
