Evaluating Molecule Synthesizability via Retrosynthetic Planning and Reaction Prediction
Songtao Liu, Dandan Zhang, Zhengkai Tu, Hanjun Dai, Peng Liu
TL;DR
The paper addresses the problem of evaluating synthesizability in drug design by introducing the round-trip score, a data-driven metric that combines retrosynthetic planning with forward reaction prediction. The metric is computed in three stages: retrosynthetic route generation, reaction-based feasibility testing, and route–molecule similarity assessment. Together, these stages determine whether a generated molecule can be practically synthesized. In large-scale experiments, the round-trip score outperforms traditional search-based metrics at identifying feasible routes, and it provides a benchmark for the synthesizability of molecules produced by structure-based drug design models. The approach underscores the importance of weighing both molecular quality and practical synthesizability, pointing toward more realistic synthesizability constraints in medicinal chemistry pipelines.
Abstract
A significant challenge in wet lab experiments with current drug design generative models is the trade-off between pharmacological properties and synthesizability. Molecules predicted to have highly desirable properties are often difficult to synthesize, while those that are easily synthesizable tend to exhibit less favorable properties. As a result, evaluating the synthesizability of molecules in general drug design scenarios remains an open problem in drug discovery. The commonly used synthetic accessibility (SA) score aims to evaluate the ease of synthesizing generated molecules, but it cannot guarantee that synthetic routes can actually be found. Inspired by recent advances in top-down synthetic route generation and forward reaction prediction, we propose a new, data-driven metric to evaluate molecule synthesizability. The metric exploits the duality between retrosynthetic planners and reaction predictors, both of which are trained on extensive reaction datasets. To demonstrate the efficacy of our metric, we conduct a comprehensive evaluation of round-trip scores across a range of representative molecule generative models.
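
The round-trip idea (plan a route backward from the target, replay it forward with a reaction predictor, then compare the reproduced molecule with the original) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration rather than the authors' implementation: `plan_route` and `predict_product` are hypothetical callables standing in for a trained retrosynthetic planner and a trained forward reaction predictor, the similarity is a Tanimoto-style set overlap computed on a toy character-bigram "fingerprint" of SMILES strings, and returning 0.0 when no route is found or a step fails is a convention chosen for this sketch.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# One reaction step: (reactant SMILES, planned product SMILES).
Step = Tuple[List[str], str]
# A route lists steps in forward order, ending at the target molecule.
Route = List[Step]


def bigrams(smiles: str) -> Set[str]:
    """Toy fingerprint (assumption): the set of character bigrams of a SMILES string."""
    return {smiles[i:i + 2] for i in range(len(smiles) - 1)} or {smiles}


def tanimoto(a: str, b: str) -> float:
    """Tanimoto (Jaccard) similarity between two toy fingerprints."""
    fa, fb = bigrams(a), bigrams(b)
    return len(fa & fb) / len(fa | fb)


def round_trip_score(
    target: str,
    plan_route: Callable[[str], Optional[Route]],
    predict_product: Callable[[List[str]], Optional[str]],
) -> float:
    # Stage 1: ask the (hypothetical) retrosynthetic planner for a route.
    route = plan_route(target)
    if not route:
        return 0.0  # sketch convention: no route means a score of zero

    # Stage 2: replay the route forward with the (hypothetical) reaction predictor.
    produced: Dict[str, str] = {}  # planned intermediate -> molecule actually predicted
    final: Optional[str] = None
    for reactants, planned_product in route:
        # Use what the forward model actually produced for earlier intermediates.
        actual = [produced.get(r, r) for r in reactants]
        predicted = predict_product(actual)
        if predicted is None:
            return 0.0  # sketch convention: a failed step means a score of zero
        produced[planned_product] = predicted
        final = predicted

    # Stage 3: compare the reproduced molecule with the original target.
    return tanimoto(final, target)


# Toy usage: a one-step route to ethyl acetate from acetic acid and ethanol.
target = "CC(=O)OCC"
toy_route: Route = [(["CC(=O)O", "OCC"], target)]
score = round_trip_score(target, lambda m: toy_route, lambda rs: target)
```

In a realistic setting the bigram fingerprint would be replaced by a proper molecular fingerprint from a cheminformatics toolkit, and the two callables by trained models; the control flow of the three stages would stay the same.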
