
Retro-BLEU: Quantifying Chemical Plausibility of Retrosynthesis Routes through Reaction Template Sequence Analysis

Junren Li, Lei Fang, Jian-Guang Lou

TL;DR

Retrosynthesis route plausibility is hard to quantify with traditional metrics. This work introduces Retro-BLEU, a statistics-based metric that adapts the BLEU idea to reaction-template sequences, combining a length penalty with bigram overlap against known routes: $Score_{\rm Retro-BLEU}(r) = \exp\left(\frac{L}{\max(L,\mathrm{len}(r))}\right) + \exp\left(f_{2}(r)\right)$ with $n=2$ and $L \approx 3$. Empirically, Retro-BLEU differentiates patent-derived routes from model-generated ones and outperforms baselines in top-k ranking, with template-bigram analysis providing deeper insights into plausible versus unproductive steps, though performance is limited by observed-template coverage and reaction-space sparsity. The method offers a practical screening tool to accelerate CASP-based route selection and motivates ongoing template-library updates and incorporation of additional feasibility factors such as yields and costs.
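The score above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: it treats a route as a linear list of template identifiers (real routes are trees), takes $\mathrm{len}(r)$ to be the number of reaction steps, and assumes $f_{2}(r)$ is the fraction of the route's template bigrams that also occur in a set of reference routes. The function names and the template labels `T1`, `T2`, ... are hypothetical.

```python
import math


def template_bigrams(route):
    """Return consecutive template pairs of a route given as a list of template IDs."""
    return list(zip(route, route[1:]))


def build_reference_bigrams(reference_routes):
    """Collect every template bigram seen in the known (reference) routes."""
    seen = set()
    for route in reference_routes:
        seen.update(template_bigrams(route))
    return seen


def bigram_overlap(route, reference_bigrams):
    """Assumed form of f_2(r): share of the route's bigrams found in the reference set."""
    bigrams = template_bigrams(route)
    if not bigrams:
        # A single-step route has no bigrams; treated here as zero overlap.
        return 0.0
    hits = sum(1 for b in bigrams if b in reference_bigrams)
    return hits / len(bigrams)


def retro_bleu(route, reference_bigrams, L=3):
    """Score(r) = exp(L / max(L, len(r))) + exp(f_2(r)), as stated in the TL;DR."""
    length_term = math.exp(L / max(L, len(route)))
    overlap_term = math.exp(bigram_overlap(route, reference_bigrams))
    return length_term + overlap_term


# Hypothetical reference routes and candidates, for illustration only.
reference = build_reference_bigrams([["T1", "T2", "T3"], ["T2", "T4"]])
short_known = ["T1", "T2", "T3"]
long_unseen = ["T9", "T8", "T7", "T6", "T5", "T4"]

score_known = retro_bleu(short_known, reference)
score_unseen = retro_bleu(long_unseen, reference)
```

Under these assumptions, the short route whose bigrams all appear in the reference set scores higher than the longer route built from unseen bigrams, which is the ranking behaviour the metric is meant to capture.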

Abstract

Computer-assisted methods have emerged as valuable tools for retrosynthesis analysis. However, quantifying the plausibility of generated retrosynthesis routes remains a challenging task. We introduce Retro-BLEU, a statistical metric adapted from the well-established BLEU score in machine translation, to evaluate the plausibility of retrosynthesis routes based on reaction template sequence analysis. We demonstrate the effectiveness of Retro-BLEU by applying it to a diverse set of retrosynthesis routes generated by state-of-the-art algorithms and comparing its performance with that of other evaluation metrics. The results show that Retro-BLEU is capable of differentiating between plausible and implausible routes. Furthermore, we provide insights into the strengths and weaknesses of Retro-BLEU, paving the way for future developments and improvements in this field.

Paper Structure

This paper contains 10 sections, 2 equations, 4 figures, and 1 table.

Figures (4)

  • Figure 1: A comparative view of evaluation in machine translation and retrosynthesis planning using bigram overlap: (a) in machine translation, the BLEU-2 score, which can be considered as bigram overlap in this case, can be used to select high-quality translations; (b) in retrosynthesis planning, template-bigram overlap can be used to select chemically plausible routes.
  • Figure 2: Top-k accuracies for Retro-BLEU and other metrics. The top and bottom of the areas marked with diagonal lines represent the best-case and worst-case scenarios, respectively. (a) MCTS algorithm applied to set-n5, (b) Retro$^*$ algorithm applied to set-n5, (c) MCTS algorithm applied to set-n5 searchable routes, (d) Retro$^*$ algorithm applied to set-n5 searchable routes.
  • Figure 3: Most frequent positive (highlighted in green) and negative (highlighted in red) template bigrams.
  • Figure 4: Examples of using Retro-BLEU to select feasible retrosynthesis routes. Top: the recorded route in patent WO2012/58671 and the shortest generated route. Bottom: the recorded route in patent US4596872 and the generated route ranked highest by Retro-BLEU. The search algorithm employed in these examples is MCTS.