A high-accuracy multi-model mixing retrosynthetic method

Shang Xiang, Lin Yao, Zhen Wang, Qifan Yu, Wentan Liu, Wentao Guo, Guolin Ke

TL;DR

A product prediction model is introduced to improve the accuracy of single-step retrosynthesis models. Because this filtering reduces the number of single-step reactions, multiple single-step models are integrated to maintain the overall reaction count and increase reaction diversity.

Abstract

The field of computer-aided synthesis planning (CASP) has advanced rapidly in recent years, achieving significant progress across various algorithmic benchmarks. However, chemists often encounter numerous infeasible reactions when using CASP in practice. This article examines common errors associated with CASP and introduces a product prediction model aimed at enhancing the accuracy of single-step models. Because the product prediction model reduces the number of single-step reactions, the method integrates multiple single-step models to maintain the overall reaction count and increase reaction diversity. Based on manual analysis and large-scale testing, the product prediction model, combined with the multi-model ensemble approach, is shown to offer higher feasibility and greater diversity.
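The combination described in the abstract can be illustrated with a minimal sketch. It assumes that a candidate is kept only when a forward (product prediction) model maps its reactants back to the target product; the abstract does not state the exact filtering criterion. The function name `filtered_ensemble`, the toy models, and the placeholder strings are all hypothetical stand-ins, not the authors' implementation.

```python
from typing import Callable

# Hypothetical interfaces (assumptions, not the paper's API):
# a single-step model proposes reactant sets for a product,
# a forward model predicts the product of a reactant set.
SingleStepModel = Callable[[str], list[str]]
ForwardModel = Callable[[str], str]


def filtered_ensemble(
    product: str,
    single_step_models: list[SingleStepModel],
    forward_model: ForwardModel,
) -> list[str]:
    """Pool candidates from several single-step models and keep only those
    whose forward-predicted product equals the target product."""
    kept: list[str] = []
    seen: set[str] = set()
    for model in single_step_models:
        for reactants in model(product):
            if reactants in seen:
                continue  # skip duplicates proposed by more than one model
            seen.add(reactants)
            # Assumed criterion: exact match of predicted and target product.
            if forward_model(reactants) == product:
                kept.append(reactants)
    return kept


# Toy stand-ins using placeholder strings instead of real SMILES.
def template_model(product: str) -> list[str]:
    return ["A.B", "C.D"]


def sequence_model(product: str) -> list[str]:
    return ["C.D", "E.F"]


FORWARD_TABLE = {"A.B": "P", "C.D": "X", "E.F": "P"}


def toy_forward_model(reactants: str) -> str:
    return FORWARD_TABLE.get(reactants, "")


result = filtered_ensemble("P", [template_model, sequence_model], toy_forward_model)
print(result)
```

In this toy run the candidate `C.D` is rejected because the forward model predicts a different product, while the second model contributes `E.F`, which restores the candidate count. With real molecules, the string comparison would need canonicalized structures rather than raw strings.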

Paper Structure

This paper contains 24 sections, 3 equations, 2 figures, 4 tables, and 1 algorithm.

Figures (2)

  • Figure 1: Examples of infeasible reactions. (a) A Buchwald-Hartwig coupling in which functional groups that hydrolyze under alkaline conditions (such as carboxylic acid esters) interfere with the reaction. (b) A Suzuki coupling under conditions in which the benzyl chloride is not stable and will also react. (c) An addition to a cyano group, which is not feasible without protecting the ketone carbonyl group; simply adding a protecting group makes it feasible. (d) A two-step sequence of Boc deprotection and acid-amine addition, which cannot be achieved in a single-step reaction.
  • Figure 2: Two algorithm-generated synthetic routes for nirmatrelvir. (a) Generated by a similarity template model without product prediction model filtering. The route is short but contains two reactions that are difficult to achieve. In reaction 1, the product should be an acyl chloride, not an $\alpha$-chloro ketone; the exposed amino group may also affect the reaction. In reaction 2, the functional groups, oxidation states, and atom counts of the reactants and products do not correspond. If the two reagents were mixed, it is highly likely that the carboxylic acid would replace the Cl. (b) Generated by a similarity template model with product prediction model filtering. The route is longer, but the reactions are comparatively easier to achieve.