Evaluating Molecule Synthesizability via Retrosynthetic Planning and Reaction Prediction

Songtao Liu, Dandan Zhang, Zhengkai Tu, Hanjun Dai, Peng Liu

TL;DR

The paper addresses the problem of evaluating synthesizability in drug design by introducing the round-trip score, a data-driven metric that combines retrosynthetic planning with forward reaction prediction. The score is computed in three stages (retrosynthetic route generation, reaction-based feasibility testing with a forward predictor, and route–molecule similarity assessment) to determine whether generated molecules can be practically synthesized. In large-scale experiments, the authors report that the round-trip score outperforms traditional search-based metrics at identifying feasible routes, and they use it to benchmark the synthesizability of molecules produced by structure-based drug design models. The approach underscores the importance of considering both molecular quality and practical synthesizability in medicinal chemistry pipelines.
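
The three stages can be sketched as a small Python pipeline. This is a minimal illustration under stated assumptions, not the authors' implementation: `plan_route`, `predict_product`, and `similarity` are hypothetical callables standing in for a trained planner, a trained forward predictor, and a similarity measure. Returning 0.0 when no route is found, and feeding predicted intermediates into later steps, are also assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence


@dataclass
class Step:
    """One planned reaction: reactant SMILES and the product the planner expects."""
    reactants: List[str]
    product: str


Route = List[Step]  # ordered from starting materials to the target


def round_trip_score(
    target: str,
    plan_route: Callable[[str], Optional[Route]],     # hypothetical retrosynthetic planner
    predict_product: Callable[[Sequence[str]], str],  # hypothetical forward predictor
    similarity: Callable[[str, str], float],          # hypothetical similarity in [0, 1]
) -> float:
    # Stage 1: retrosynthetic planning.
    route = plan_route(target)
    if not route:
        # Assumption: a molecule with no route gets the lowest score.
        return 0.0

    # Stage 2: forward reproduction. Replay the route from the starting
    # materials, feeding predicted intermediates into later steps.
    reproduced: Dict[str, str] = {}
    for step in route:
        reactants = [reproduced.get(r, r) for r in step.reactants]
        reproduced[step.product] = predict_product(reactants)

    # Stage 3: similarity between the reproduced end product and the target.
    return similarity(reproduced[route[-1].product], target)


# Toy stand-ins so the sketch runs end to end (not real models).
ACETANILIDE = "CC(=O)Nc1ccccc1"
TOY_ROUTE = [
    Step(["O=[N+]([O-])c1ccccc1"], "Nc1ccccc1"),
    Step(["Nc1ccccc1", "CC(=O)Cl"], ACETANILIDE),
]
TOY_FORWARD = {
    ("O=[N+]([O-])c1ccccc1",): "Nc1ccccc1",
    ("CC(=O)Cl", "Nc1ccccc1"): ACETANILIDE,
}


def toy_planner(smiles: str) -> Optional[Route]:
    return TOY_ROUTE if smiles == ACETANILIDE else None


def toy_predictor(reactants: Sequence[str]) -> str:
    # Unknown reactant sets yield an empty string, i.e. "no product".
    return TOY_FORWARD.get(tuple(sorted(reactants)), "")


def exact_match(a: str, b: str) -> float:
    return 1.0 if a == b else 0.0


score = round_trip_score(ACETANILIDE, toy_planner, toy_predictor, exact_match)
```

Passing the three components as callables keeps the sketch independent of any particular planner or predictor. Unlike a pure search success check, a route only earns a high score here if the forward model can reproduce the target from the proposed starting materials.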

Abstract

A significant challenge in wet lab experiments with current drug design generative models is the trade-off between pharmacological properties and synthesizability. Molecules predicted to have highly desirable properties are often difficult to synthesize, while those that are easily synthesizable tend to exhibit less favorable properties. As a result, evaluating the synthesizability of molecules in general drug design scenarios remains a significant challenge in the field of drug discovery. The commonly used synthetic accessibility (SA) score aims to evaluate the ease of synthesizing generated molecules, but it falls short of guaranteeing that synthetic routes can actually be found. Inspired by recent advances in top-down synthetic route generation and forward reaction prediction, we propose a new, data-driven metric to evaluate molecule synthesizability. This novel metric leverages the synergistic duality between retrosynthetic planners and reaction predictors, both of which are trained on extensive reaction datasets. To demonstrate the efficacy of our metric, we conduct a comprehensive evaluation of round-trip scores across a range of representative molecule generative models.

Paper Structure

This paper contains 30 sections, 1 equation, 5 figures, 5 tables.

Figures (5)

  • Figure 1: For a given molecule, multiple synthetic routes can be identified within the reaction database, illustrating the diverse routes available for its synthesis.
  • Figure 2: Comparison of evaluation metrics for retrosynthetic planning. The search success rate deems both routes successful, while the matching-based metric correctly identifies the top route as incorrect and the bottom route as correct, demonstrating its superior reliability.
  • Figure 3: Illustration of the round-trip score calculation process. It consists of three stages: Retrosynthetic Planning, Forward Reproduction, and Similarity Computation.
  • Figure 4: The feasibility of this reaction is confirmed by the reactions documented in the CAS database.
  • Figure 5: The reaction shown above is predicted by the planner, while the one below is retrieved from the CAS database. The product of the reaction below differs from the product of the reaction above by only one additional methyl group.
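
To give a feel for the Similarity Computation stage in Figure 3 and the near-miss case in Figure 5, the sketch below computes a Tanimoto similarity between two fingerprint bit sets. Tanimoto is assumed here because it is a common choice for molecular fingerprints; this page does not name the measure used. The bit indices are hypothetical, not computed from real molecules, and a real extra methyl group would typically change more than one bit.

```python
from typing import Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint on-bits."""
    union = a | b
    if not union:
        # Convention chosen here: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical on-bit indices, NOT computed from real molecules.
planned_product = {1, 4, 7, 9, 12, 15, 20, 23}
reference_product = planned_product | {31}  # one extra substructure feature

sim = tanimoto(planned_product, reference_product)  # 8 shared bits / 9 total bits
```

With these toy sets the similarity is 8/9, high but below 1, which is the kind of graded signal a similarity-based score provides where a binary success flag would not.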