Modular Multi-Task Learning for Chemical Reaction Prediction
Jiayun Pang, Ahmed M. Zaitoun, Xacobe Couso Cambeiro, Ivan Vulić
TL;DR
This work addresses how to specialize large chemistry-oriented LLMs to small, domain-specific reaction datasets without losing broad chemical knowledge. It advocates Low-Rank Adaptation (LoRA), a parameter-efficient, modular fine-tuning approach that freezes the base weights $W \in \mathbb{R}^{k \times d}$ and learns a compact low-rank update $\Delta W = BA$, giving $W' = W + BA$ with $A \in \mathbb{R}^{r \times d}$, $B \in \mathbb{R}^{k \times r}$ and rank $r \ll \min(d, k)$. Evaluated on USPTO_1K_TPL and a challenging C-H borylation dataset, LoRA achieves accuracy comparable to full fine-tuning on forward reaction prediction, retrosynthesis, and reagent prediction, while better preserving multi-task performance and mitigating catastrophic forgetting. The results also reveal that LoRA and full fine-tuning can yield different reactivity representations and solvent-generation capabilities, and that LoRA offers greater flexibility for modular deployment as LLMs scale. Overall, LoRA emerges as a practical, scalable strategy for task-specific specialization in chemistry without compromising broader chemical understanding.
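To make the update rule concrete, the following is a minimal NumPy sketch of a single LoRA-adapted linear map, using the shapes stated in the TL;DR. The sizes `d`, `k`, `r`, the random initialisation, and the helper names `lora_forward` and `merge` are illustrative assumptions, not the authors' implementation. Zero-initialising $B$ follows a common LoRA convention, and the usual scaling factor on $BA$ is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: input dim d, output dim k, adapter rank r << min(d, k).
d, k, r = 64, 32, 4

W = rng.normal(size=(k, d))              # frozen base weight, shape (k, d)
A = rng.normal(scale=0.01, size=(r, d))  # trainable adapter factor, shape (r, d)
B = np.zeros((k, r))                     # trainable adapter factor, shape (k, r);
                                         # zero init (assumed) makes W' equal W at the start


def lora_forward(x, W, A, B):
    """Apply W' = W + BA to x without materialising the (k, d) update."""
    return W @ x + B @ (A @ x)


def merge(W, A, B):
    """Fold the adapter into the base weight for deployment: W' = W + BA."""
    return W + B @ A


x = rng.normal(size=(d,))
y = lora_forward(x, W, A, B)

# Parameter counts: full fine-tuning updates every entry of W,
# LoRA only trains the entries of A and B.
full_params = W.size
lora_params = A.size + B.size
print(f"full fine-tuning: {full_params} parameters, LoRA: {lora_params} parameters")
```

Because the base weight `W` is never modified, one copy of it can in principle be paired with several task-specific `(A, B)` pairs, which is the sense in which the approach is modular.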
Abstract
Adapting large language models (LLMs) trained on broad organic chemistry to smaller, domain-specific reaction datasets is a key challenge in chemical and pharmaceutical R&D. Effective specialisation requires learning new reaction knowledge while preserving general chemical understanding across related tasks. Here, we evaluate Low-Rank Adaptation (LoRA) as a parameter-efficient alternative to full fine-tuning for organic reaction prediction on limited, complex datasets. Using USPTO reaction classes and challenging C-H functionalisation reactions, we benchmark forward reaction prediction, retrosynthesis and reagent prediction. LoRA achieves accuracy comparable to full fine-tuning while effectively mitigating catastrophic forgetting and better preserving multi-task performance. Both fine-tuning approaches generalise beyond their training distributions, producing plausible alternative solvent predictions. Notably, fine-tuning on C-H functionalisation reactions reveals that LoRA and full fine-tuning encode subtly different reactivity patterns, suggesting more effective reaction-specific adaptation with LoRA. As LLMs continue to scale, our results highlight the practicality of modular, parameter-efficient fine-tuning strategies for flexible deployment in chemistry applications.
