RLSynC: Offline-Online Reinforcement Learning for Synthon Completion

Frazier N. Baker, Ziqi Chen, Daniel Adu-Ampratwum, Xia Ning

TL;DR

RLSynC addresses synthon completion in semi-template-based retrosynthesis by framing it as a two-agent reinforcement learning problem with offline training and online data augmentation. Each synthon is assigned an agent that iteratively completes it into a reactant, guided by a reward from a standalone forward synthesis predictor; the agents are trained with a SARSA-like objective on offline episodes plus augmented online experiences. On USPTO-50K, RLSynC shows consistent gains in correctness and diversity over state-of-the-art baselines, with up to a 14.9% improvement in MAP@10 and evidence of novel leaving-group discovery. The framework supports exploration of new reaction patterns, which gives synthesis planning multiple viable options and provides a foundation for future extension to more reactants and graph-based representations.
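To make the "SARSA-like objective" concrete, the following is a minimal sketch of a textbook tabular SARSA update, not the paper's actual loss, which this summary does not spell out. The function name `sarsa_update`, the step size `alpha`, the discount `gamma`, the toy state and action labels, and the assumption that reward arrives only at the end of an episode are all illustrative choices.

```python
from collections import defaultdict


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.5, gamma=0.9, terminal=False):
    """One textbook SARSA step: move Q(s, a) toward r + gamma * Q(s', a')."""
    # At a terminal transition there is no next action to bootstrap from.
    bootstrap = 0.0 if terminal else gamma * q[(next_state, next_action)]
    target = reward + bootstrap
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical recorded episode: (s, a, r, s', a', done).
# The labels are placeholders, not real chemistry.
episode = [
    ("synthon", "add_atom", 0.0, "partial", "add_atom", False),
    ("partial", "add_atom", 1.0, "reactant", None, True),
]

q = defaultdict(float)
for _ in range(2):  # replay the same stored episode twice
    for s, a, r, s2, a2, done in episode:
        sarsa_update(q, s, a, r, s2, a2, terminal=done)
```

Because the update bootstraps from the action actually recorded in the episode rather than from a maximum over actions, the same rule can in principle be applied to stored offline episodes and to newly collected online ones, which is the combination the summary describes.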

Abstract

Retrosynthesis is the process of determining the set of reactant molecules that can react to form a desired product. Semi-template-based retrosynthesis methods, which imitate the reverse logic of synthesis reactions, first predict the reaction centers in the products, and then complete the resulting synthons back into reactants. We develop a new offline-online reinforcement learning method RLSynC for synthon completion in semi-template-based methods. RLSynC assigns one agent to each synthon, all of which complete the synthons by conducting actions step by step in a synchronized fashion. RLSynC learns the policy from both offline training episodes and online interactions, which allows RLSynC to explore new reaction spaces. RLSynC uses a standalone forward synthesis model to evaluate the likelihood of the predicted reactants in synthesizing a product, and thus guides the action search. Our results demonstrate that RLSynC can outperform state-of-the-art synthon completion methods with improvements as high as 14.9%, highlighting its potential in synthesis planning.
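The synchronized, one-agent-per-synthon procedure described in the abstract can be sketched roughly as follows. This is a toy illustration under stated assumptions: molecules are plain strings, the action vocabulary `ACTIONS`, the step limit `max_steps`, the scripted policy, and the lookup-based `forward_reward` are all hypothetical stand-ins, and scoring only the final reactants is an assumption rather than something the abstract states.

```python
ACTIONS = ["Cl", "O", "STOP"]  # hypothetical action vocabulary


def forward_reward(reactants, product):
    """Stand-in for a forward synthesis model: 1.0 if it 'predicts' the product."""
    known = {("R1-Cl", "R2-O"): "R1-R2"}  # toy lookup, not real chemistry
    return 1.0 if known.get(tuple(reactants)) == product else 0.0


def scripted_policy(agent_idx, state):
    """Toy deterministic policy: add one fixed group, then stop."""
    wanted = ["Cl", "O"][agent_idx]
    return "STOP" if state.endswith(wanted) else wanted


def run_episode(synthons, product, policy, max_steps=3):
    states = list(synthons)
    done = [False] * len(states)
    trajectory = []
    for _ in range(max_steps):
        # Every unfinished agent picks its action in the same step.
        actions = [None if done[i] else policy(i, states[i])
                   for i in range(len(states))]
        for i, act in enumerate(actions):
            if act is None:
                continue
            if act == "STOP":
                done[i] = True
            else:
                states[i] = f"{states[i]}-{act}"
        trajectory.append((tuple(actions), tuple(states)))
        if all(done):
            break
    # Assumed here: the completed reactants are scored once, at the end.
    return states, forward_reward(states, product), trajectory


reactants, reward, trajectory = run_episode(["R1", "R2"], "R1-R2", scripted_policy)
```

In a learned setting, `scripted_policy` would be replaced by a trained policy and `forward_reward` by the standalone forward synthesis model; the loop structure is the only part this sketch is meant to convey.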

Paper Structure (39 sections, 13 equations, 5 figures, 4 tables, 2 algorithms)

This paper contains 39 sections, 13 equations, 5 figures, 4 tables, 2 algorithms.

Figures (5)

  • Figure 1: Overview scheme of RLSynC
  • Figure 2: Retrosynthesis process
  • Figure 3: RLSynC performance across data augmentation iterations; a, MAP@$N$; b, Diversity@$N$.
  • Figure 4: Improvement from Top-5 prediction search; a, MAP@$N$; b, NDCG@$N$; c, Diversity@$N$.
  • Figure 5: Predicted reactions by RLSynC for case study; a, product; b, the ground-truth reactants in USPTO-50K; c-j, top predicted reactants.