Electron flow matching for generative reaction mechanism prediction obeying conservation laws
Joonyoung F. Joung, Mun Hong Fong, Nicholas Casetti, Jordan P. Liles, Ne S. Dassanayake, Connor W. Coley
TL;DR
This work introduces FlowER, a conservation-aware generative framework that treats a chemical reaction as a redistribution of electrons recorded in a bond-electron ($BE$) matrix. FlowER applies flow matching to learn a time-dependent vector field over $BE$ matrices. Because electrons are only moved, never created or destroyed, every predicted change satisfies $\sum_{i,j} \Delta BE_{ij} = 0$, which is how the model enforces exact mass conservation. Each prediction is an interpretable mechanistic step that aligns with textbook chemistry. Empirically, FlowER outperforms sequence-based baselines in structural validity and pathway coverage, generalizes to unseen reaction classes with data-efficient fine-tuning, and provides a natural interface for assessing thermodynamic or kinetic feasibility through downstream quantum calculations. The approach links predictive accuracy to mechanistic understanding and offers a tool for synthesis planning and reaction discovery; broader mechanistic datasets would extend its coverage.
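To make the conservation condition concrete, the following minimal sketch writes out $BE$ matrices for a toy step, the protonation of hydroxide to water. It uses the standard bond-electron convention (diagonal entries hold non-bonding valence electrons, off-diagonal entries hold bond orders). The example reaction, the atom ordering, and all function names are assumptions made for illustration and are not taken from the FlowER code.

```python
# Toy bond-electron (BE) matrices for OH- + H+ -> H2O.
# Atom order (an arbitrary choice for this illustration): O, H1, H2.
# Diagonal: non-bonding valence electrons on each atom.
# Off-diagonal: bond order between two atoms (the matrix is symmetric).

# Reactants: hydroxide (O-H1 bond, three lone pairs on O) and a bare proton H2.
BE_REACTANTS = [
    [6, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]

# Product: water (O-H1 and O-H2 bonds, two lone pairs on O).
BE_PRODUCTS = [
    [4, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]


def total_electrons(be):
    """Sum of all entries: lone electrons plus 2 per unit of bond order,
    because each bond appears twice in the symmetric matrix."""
    return sum(sum(row) for row in be)


def delta(be_after, be_before):
    """Element-wise difference, i.e. the electron redistribution Delta BE."""
    return [
        [a - b for a, b in zip(row_after, row_before)]
        for row_after, row_before in zip(be_after, be_before)
    ]


DELTA_BE = delta(BE_PRODUCTS, BE_REACTANTS)

if __name__ == "__main__":
    for row in DELTA_BE:
        print(row)
    print("electrons before:", total_electrons(BE_REACTANTS))
    print("electrons after: ", total_electrons(BE_PRODUCTS))
    print("sum of Delta BE: ", total_electrons(DELTA_BE))
```

In this toy case one lone pair on oxygen (a change of -2 on the diagonal) becomes the new O-H2 bond (a change of +1 in each of the two symmetric off-diagonal positions), so the entries of $\Delta BE$ cancel and the electron count is unchanged.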
Abstract
Central to our understanding of chemical reactivity is the principle of mass conservation, which is fundamental for ensuring physical consistency, balancing equations, and guiding reaction design. However, data-driven computational models for tasks such as reaction product prediction rarely abide by this most basic constraint. In this work, we recast the problem of reaction prediction as a problem of electron redistribution using the modern deep generative framework of flow matching. Our model, FlowER, overcomes limitations inherent in previous approaches by enforcing exact mass conservation, thereby resolving hallucinatory failure modes, recovering mechanistic reaction sequences for unseen substrate scaffolds, and generalizing effectively to out-of-domain reaction classes with extremely data-efficient fine-tuning. FlowER additionally enables estimation of thermodynamic or kinetic feasibility and manifests a degree of chemical intuition in reaction prediction tasks. This inherently interpretable framework represents a significant step in bridging the gap between predictive accuracy and mechanistic understanding in data-driven reaction outcome prediction.
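The connection between flow matching and conservation can be illustrated with a small sketch. It assumes the common straight-line path $x_t = (1 - t)\,x_0 + t\,x_1$ with regression target $x_1 - x_0$, taking a reactant bond-electron matrix as $x_0$ and a product matrix as $x_1$. The path choice, the toy matrices (hydroxide plus a proton giving water), and the helper names are all assumptions; the abstract does not specify FlowER's actual path, source distribution, or network, and the oracle velocity below merely stands in for a trained model.

```python
# Straight-line flow-matching path between two toy bond-electron matrices.
# Atom order (illustrative): O, H1, H2.
X0 = [[6, 1, 0], [1, 0, 0], [0, 0, 0]]  # reactants: OH- and H+
X1 = [[4, 1, 1], [1, 0, 0], [1, 0, 0]]  # product: H2O


def total(matrix):
    """Sum of all entries (the electron count for an integer BE matrix)."""
    return sum(sum(row) for row in matrix)


def interpolate(x0, x1, t):
    """Point on the straight-line path x_t = (1 - t) * x0 + t * x1."""
    return [
        [(1 - t) * a + t * b for a, b in zip(r0, r1)]
        for r0, r1 in zip(x0, x1)
    ]


def target_velocity(x0, x1):
    """Regression target for the vector field on this path: x1 - x0."""
    return [[b - a for a, b in zip(r0, r1)] for r0, r1 in zip(x0, x1)]


def euler_integrate(x0, velocity_fn, steps=10):
    """Integrate dx/dt = velocity_fn(x, t) from t = 0 to t = 1 with Euler steps."""
    x = [list(map(float, row)) for row in x0]
    dt = 1.0 / steps
    for k in range(steps):
        v = velocity_fn(x, k * dt)
        x = [[xi + dt * vi for xi, vi in zip(rx, rv)] for rx, rv in zip(x, v)]
    return x


if __name__ == "__main__":
    # The target velocity sums to zero, so the total is constant along the path.
    for t in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(t, total(interpolate(X0, X1, t)))

    # An oracle velocity stands in for a trained network in this sketch.
    def oracle(x, t):
        return target_velocity(X0, X1)

    print(euler_integrate(X0, oracle))
```

Because the target velocity has entries that sum to zero, every intermediate point on this path carries the same total as the reactants, which is one way to see why a vector field over electron redistributions can respect conservation throughout the trajectory rather than only at its end points.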
