Preference Optimization for Molecule Synthesis with Conditional Residual Energy-based Models

Songtao Liu; Hanjun Dai; Yue Zhao; Peng Liu

TL;DR

This work addresses the challenge of generating feasible, criterion-aligned chemical synthesis routes by combining a conditional residual energy-based model (CREBM) with existing retrosynthesis strategies. It models route generation as the product $P_\theta(\mathcal{T}|m_{tar},c) \propto P_{Retro}(\mathcal{T}|m_{tar}) \exp(-E_\theta(\mathcal{T}|m_{tar},c))$, so that a learnable energy function $E_\theta$ steers generation toward routes $\mathcal{T}$ that satisfy criteria $c$ such as feasibility and cost. The authors adopt a preference-based training regime inspired by reward modeling in LLMs, using a Bradley–Terry loss over route comparisons ranked by a heuristic feasibility reward $\varphi$, and implement $E_\theta$ with a Transformer-based model. In experiments on RetroBench, CREBM consistently improves top-1 accuracy across diverse base strategies, with larger gains for deeper routes, which supports the framework’s plug-and-play design and its potential for controllable synthesis planning. The work offers a practical way to integrate long-range criteria into molecule synthesis workflows without retraining the base retrosynthesis models.
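As a rough illustration of the preference objective mentioned above, the following minimal sketch computes a Bradley–Terry loss for one pair of routes from their scalar energies. It assumes that lower energy means a more preferred route (consistent with the $\exp(-E_\theta)$ factor) and that the preferred route in each pair is the one ranked higher by $\varphi$; the function name and the scalar interface are hypothetical and are not taken from the authors' code.

```python
import math


def bradley_terry_loss(energy_preferred: float, energy_rejected: float) -> float:
    """Negative log-likelihood that the preferred route wins the comparison.

    Assumption: the implicit reward of a route is -E, so
    P(preferred beats rejected) = sigmoid(E_rejected - E_preferred).
    """
    margin = energy_rejected - energy_preferred
    # -log(sigmoid(margin)) equals softplus(-margin); this form avoids overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Hypothetical energies for two routes to the same target molecule.
loss_correct_order = bradley_terry_loss(energy_preferred=0.2, energy_rejected=1.5)
loss_wrong_order = bradley_terry_loss(energy_preferred=1.5, energy_rejected=0.2)
```

The loss is small when the preferred route already has the lower energy and grows when the ordering is violated, which is what pushes the energy function to agree with the ranking.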

Abstract

Molecule synthesis through machine learning is one of the fundamental problems in drug discovery. Current data-driven strategies employ one-step retrosynthesis models and search algorithms to predict synthetic routes in a top-down manner. Despite their effective performance, these strategies are limited in synthetic route generation because they greedily select the next molecule set without any lookahead. Furthermore, existing strategies cannot control the generation of synthetic routes based on criteria such as material costs, yields, and step count. In this work, we propose a general and principled framework based on conditional residual energy-based models (EBMs) that focuses on the quality of the entire synthetic route with respect to specific criteria. By incorporating an additional energy-based function into our probabilistic model, our proposed algorithm can enhance the quality of the most probable synthetic routes (those with higher probabilities) generated by various strategies in a plug-and-play fashion. Extensive experiments demonstrate that our framework consistently boosts performance across various strategies and outperforms the previous state-of-the-art top-1 accuracy by a margin of 2.5%. Code is available at https://github.com/SongtaoLiu0823/CREBM.
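One way to picture the plug-and-play use described in the abstract is as a reranking step: a base strategy proposes candidate routes with log-probabilities, and the energy function adjusts their scores. The sketch below is illustrative only. It assumes candidates arrive as `(route, log_p_retro)` pairs and that the combined score is $\log P_{Retro} - E_\theta$; `rerank_routes` and the toy energy table are hypothetical names, not the authors' API.

```python
from typing import Callable, Hashable, List, Tuple


def rerank_routes(
    candidates: List[Tuple[Hashable, float]],
    energy_fn: Callable[[Hashable], float],
) -> List[Hashable]:
    """Order candidate routes by log P_retro(route) - E(route), best first.

    Assumption: the base strategy is left unchanged and only its
    candidate list is rescored with the energy function.
    """
    scored = [(log_p - energy_fn(route), route) for route, log_p in candidates]
    # Higher combined score means higher unnormalized probability.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [route for _, route in scored]


# Toy example: the base model slightly prefers route_a, but the
# (hypothetical) energy function penalizes it more heavily.
toy_energy = {"route_a": 1.0, "route_b": 0.2}
ranking = rerank_routes(
    [("route_a", -1.0), ("route_b", -1.2)],
    energy_fn=lambda route: toy_energy[route],
)
```

Because the normalizing constant is shared by all routes for the same target and criteria, it can be ignored when only the ordering matters.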

Paper Structure (41 sections, 20 equations, 4 figures, 6 tables, 1 algorithm)

Introduction
Preliminary
Synthetic Route
Reaction & One-step Retrosynthesis
Retrosynthetic Planning
Energy-based Models
Conditional Residual Energy-based Models for Molecule Synthetic Route Generation
Retrosynthetic Planning is a Conditional Generation Task
Your Retrosynthesis Model is Secretly a Locally Normalized Model in Retrosynthetic Planning
Template-based Model.
Semi-template-based Model.
Template-free Model.
Conditional Residual Energy-based Models
Training
Implementation of $\varphi$
...and 26 more sections

Figures (4)

Figure 1: Illustration of a synthetic route. The target molecule we aim to synthesize is the one located on the extreme left, while the molecules positioned at the leaf nodes are the starting materials. The remaining molecules in the diagram are intermediates.
Figure 2: For a given target molecule, we find two synthetic routes that can synthesize it in the dataset.
Figure 3: A visual example of a target molecule's synthetic routes used to train the energy function, with these routes ranked according to Eq. \ref{eq:sim}.
Figure 4: Hydrogen bromide (HBr) can undergo 1,2-addition and 1,4-addition with 1,3-butadiene under different reaction conditions.
