Gradient Propagation in Retrosynthetic Space: An Efficient Framework for Synthesis Plan Generation
Chengyang Tian, Yuhang Chang, Yangpeng Zhang, Yang Liu
TL;DR
This work addresses retrosynthetic planning under uncertainty in chemical space by reframing the problem as an AND-OR graph search and introducing a gradient-propagation framework that maximizes the Successful Synthesis Probability (SSP) across multiple routes. It combines differentiable SSP estimation with bottom-up $s$-value calculation and top-down gradient propagation to drive a greedy, influence-based node expansion, yielding efficient search and improved SSP. Key contributions include a bottom-up $s$-value update, a top-down gradient mechanism for node selection, and extensive ablations demonstrating efficiency gains over state-of-the-art baselines on large-scale benchmarks. The approach applies broadly to multi-route generation in synthesis planning and provides a foundation for integrating uncertainty modeling with efficient, gradient-guided search.
Abstract
Retrosynthesis, which aims to identify viable synthetic pathways for target molecules by decomposing them into simpler precursors, is often treated as a search problem. Its complexity, however, arises from multi-branched, tree-structured pathways rather than linear paths. Several algorithms have been applied successfully to this task, but they either overlook the uncertainties inherent in chemical space or face limitations in practical application scenarios. To address these challenges, this paper introduces a novel gradient-propagation-based algorithmic framework for retrosynthetic route exploration. The framework uses gradient propagation to obtain the contribution of each node to the target molecule's success probability, then greedily selects the node with the highest contribution for expansion, thereby searching the chemical space efficiently. Experiments demonstrate that our algorithm is broadly applicable across diverse molecular targets and is more computationally efficient than existing methods.
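To make the bottom-up and top-down passes concrete, the following is a minimal sketch on a toy AND-OR tree, not the paper's implementation. Everything in it is an assumption made for illustration: the class names `Mol` and `Rxn`, the `prior` and `p` values, the independence assumption behind the product formulas, the restriction to a tree (no shared nodes), and the use of the raw gradient as the "contribution" score.

```python
from dataclasses import dataclass, field
from math import prod


@dataclass
class Mol:
    """OR node (hypothetical): solved if ANY child reaction succeeds."""
    name: str
    prior: float = 0.5   # assumed success estimate for an unexpanded leaf; 1.0 = purchasable
    reactions: list = field(default_factory=list)
    s: float = 0.0       # success value filled in by the bottom-up pass
    grad: float = 0.0    # d(root s) / d(this s), filled in by the top-down pass


@dataclass
class Rxn:
    """AND node (hypothetical): succeeds only if ALL precursors succeed."""
    p: float             # assumed feasibility of the reaction step itself
    precursors: list = field(default_factory=list)
    s: float = 0.0
    grad: float = 0.0


def bottom_up(mol):
    """Compute s for every node, from the leaves up to the root."""
    if not mol.reactions:
        mol.s = mol.prior
        return mol.s
    for r in mol.reactions:
        r.s = r.p * prod(bottom_up(m) for m in r.precursors)
    # Assumes the child reactions succeed independently of each other.
    mol.s = 1.0 - prod(1.0 - r.s for r in mol.reactions)
    return mol.s


def top_down(mol, grad=1.0, leaves=None):
    """Push d(root s)/d(node s) down by the chain rule and collect the leaves."""
    leaves = [] if leaves is None else leaves
    mol.grad = grad
    if not mol.reactions:
        leaves.append(mol)
        return leaves
    for r in mol.reactions:
        # d mol.s / d r.s is the product of (1 - s) over the sibling reactions.
        r.grad = grad * prod(1.0 - o.s for o in mol.reactions if o is not r)
        for m in r.precursors:
            # d r.s / d m.s is p times the product of s over the sibling precursors.
            g = r.grad * r.p * prod(o.s for o in r.precursors if o is not m)
            top_down(m, g, leaves)
    return leaves


def select_leaf(root):
    """Greedy choice: the non-purchasable leaf with the largest gradient."""
    bottom_up(root)
    candidates = [m for m in top_down(root) if m.prior < 1.0]
    return max(candidates, key=lambda m: m.grad)


# Toy example: target T has two candidate reactions.
a, b, c = Mol("A", prior=1.0), Mol("B", prior=0.4), Mol("C", prior=0.6)
target = Mol("T", reactions=[Rxn(0.9, [a, b]), Rxn(0.5, [c])])
best = select_leaf(target)
print(best.name, round(best.grad, 3), round(target.s, 3))
```

In this toy tree the unexpanded leaf B receives a larger gradient than C, so it would be expanded next under the assumed scoring rule. A graph with shared intermediates would need gradients accumulated over all parents rather than overwritten, which this sketch does not handle.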
