Saturn: Sample-efficient Generative Molecular Design using Memory Manipulation

Jeff Guo, Philippe Schwaller

TL;DR

Saturn addresses sample-efficient de novo molecular design, with the goal of making direct optimization of expensive, high-fidelity oracles feasible through memory-assisted reinforcement learning. It combines the Augmented Memory algorithm with a Mamba backbone (variants with RNN and Transformer backbones are also considered), producing a "hop-and-locally-explore" generative process, and uses SMILES enumeration together with an oracle cache to reduce oracle calls. Empirically, Saturn outperforms 22 models on multi-parameter optimization (MPO) docking tasks under fixed budgets and approaches GEAM’s performance on similar benchmarks, while also transferring to physics-based docking objectives. The work further analyzes the mechanism of Augmented Memory, its effect on exploration, and the trade-off between sample efficiency and diversity, and suggests pathways for applying Saturn to even higher-fidelity oracles and to curriculum learning in drug discovery.
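
The oracle cache mentioned above can be illustrated with a minimal sketch. It assumes the cache is a dictionary keyed by canonical SMILES, so that different spellings of the same molecule consume the oracle budget only once. The class `OracleCache`, the helper `toy_canonicalize`, and the scoring function `toy_oracle` are hypothetical names introduced for illustration and are not taken from the Saturn code base; in practice a cheminformatics toolkit would supply the canonicalization.

```python
from typing import Callable, Dict


class OracleCache:
    """Memoize oracle rewards keyed by canonical SMILES (illustrative sketch)."""

    def __init__(self, oracle: Callable[[str], float],
                 canonicalize: Callable[[str], str]) -> None:
        self.oracle = oracle
        self.canonicalize = canonicalize
        self.rewards: Dict[str, float] = {}
        self.oracle_calls = 0  # number of real (expensive) oracle evaluations

    def score(self, smiles: str) -> float:
        key = self.canonicalize(smiles)
        if key not in self.rewards:
            # Cache miss: spend one unit of oracle budget.
            self.rewards[key] = self.oracle(key)
            self.oracle_calls += 1
        return self.rewards[key]


def toy_canonicalize(smiles: str) -> str:
    # NOT real canonicalization: only maps one alternative spelling of
    # ethanol onto a single key so the example runs without dependencies.
    return {"OCC": "CCO"}.get(smiles, smiles)


def toy_oracle(smiles: str) -> float:
    # Placeholder reward; a real oracle would be e.g. a docking score.
    return float(len(smiles))


cache = OracleCache(toy_oracle, toy_canonicalize)
scores = [cache.score(s) for s in ["CCO", "OCC", "CCO", "c1ccccc1"]]
print(scores, cache.oracle_calls)
```

Four SMILES strings are scored here, but only two distinct molecules reach the oracle; the repeated and enumerated forms are served from the cache.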

Abstract

Generative molecular design for drug discovery has very recently achieved a wave of experimental validation, with language-based backbones being the most common architectures employed. The most important factor for downstream success is whether an in silico oracle is well correlated with the desired end-point. To this end, current methods use cheaper proxy oracles with higher throughput before evaluating the most promising subset with high-fidelity oracles. The ability to directly optimize high-fidelity oracles would greatly enhance generative design and be expected to improve hit rates. However, current models are not efficient enough to consider such a prospect, exemplifying the sample efficiency problem. In this work, we introduce Saturn, which leverages the Augmented Memory algorithm and demonstrates the first application of the Mamba architecture for generative molecular design. We elucidate how experience replay with data augmentation improves sample efficiency and how Mamba synergistically exploits this mechanism. Saturn outperforms 22 models on multi-parameter optimization tasks relevant to drug discovery and may possess sufficient sample efficiency to consider the prospect of directly optimizing high-fidelity oracles.

Paper Structure

This paper contains 36 sections, 10 equations, 7 figures, 32 tables.

Figures (7)

  • Figure 1: Saturn generative workflow. All generated SMILES and their rewards are stored in the Oracle Cache after canonicalization. A genetic algorithm can be optionally applied using the replay buffer as the parent population. Augmented Memory is used to update the agent numerous times.
  • Figure 2: a. Average maximum token probability across agent states. Augmentation pushes the agent action distribution towards a delta distribution. b. Augmented Memory (10 augmentation rounds) increases the likelihood of generating the SMILES in the buffer. c. Top: On average, augmented forms of the buffer SMILES become more likely. Bottom: Similar loss magnitudes impose larger changes on improbable sequences, so the agent is driven towards generating these specific sequences. When the Augmented Likelihood equals the agent likelihood, the loss approaches 0 (circles). d. Test experiment with a 3,000 oracle budget, chunked into groups of 300 SMILES. UMAP embedding of the agent's chemical space traversal (arrows indicate the centroid of each chunk). Mamba exhibits a directional traversal while RNN (baseline Augmented Memory) continues to sample globally. e. Mamba exhibits a "hop-and-locally-explore" behavior in which the intra-chunk Tanimoto similarity (top values) is higher than for RNN. The bottom value is the inter-chunk similarity.
  • Figure C3: Mamba (batch size 16, augmentation rounds 10) after running for 500 oracle calls of the illustrative example, isolating the effect of Augmented Memory. a. Augmented Memory increases the likelihood of generating the SMILES in the Buffer. b. Augmented forms of the Buffer SMILES become more likely, but are still regularized by the prior.
  • Figure C4: Mamba and RNN (both batch size 16, augmentation rounds 10) and baseline Augmented Memory (batch size 64, augmentation rounds 2). a. Test experiment with a 3,000 oracle budget, chunked into groups of 300 SMILES. UMAP embedding of the agent's chemical space traversal (arrows indicate the centroid of each chunk). b. Mamba exhibits a "hop-and-locally-explore" behavior in which the intra-chunk Tanimoto similarity (top values) is higher than for RNN. The bottom value is the inter-chunk similarity.
  • Figure C5: Mamba (batch size 16, augmentation rounds 10) and baseline Augmented Memory (batch size 64, augmentation rounds 2), which is labelled as RNN. a. Test experiment with a 3,000 oracle budget, chunked into groups of 100 SMILES. Mamba exhibits a "hop-and-locally-explore" behavior in which the intra-chunk Tanimoto similarity (top values) is higher than for RNN. The bottom value is the inter-chunk similarity. b. Qualitative examples of unique molecules generated at adjacent epochs. Many substructures are shared and the model generates in the local neighborhood. Yellow highlights mark substructures that are shared exactly, while green marks substructures that are shared in part.
  • ...and 2 more figures
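
The intra-chunk and inter-chunk Tanimoto similarities referred to in the captions above can be sketched as follows. This is a minimal illustration under stated assumptions: fingerprints are represented as sets of on-bit indices, intra-chunk similarity is taken as the mean over all pairs within a chunk, and inter-chunk similarity as the mean over all cross pairs between two chunks. The fingerprint type and the exact averaging used in the paper are not stated in this section, and the function names below are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def intra_chunk_similarity(chunk: Sequence[Set[int]]) -> float:
    """Mean pairwise similarity among molecules within one chunk."""
    pairs = list(combinations(chunk, 2))
    if not pairs:
        return 0.0
    return sum(tanimoto(a, b) for a, b in pairs) / len(pairs)


def inter_chunk_similarity(chunk_a: Sequence[Set[int]],
                           chunk_b: Sequence[Set[int]]) -> float:
    """Mean similarity over all cross pairs between two chunks."""
    pairs = [(a, b) for a in chunk_a for b in chunk_b]
    if not pairs:
        return 0.0
    return sum(tanimoto(a, b) for a, b in pairs) / len(pairs)


def chunked(items: List[Set[int]], size: int) -> List[List[Set[int]]]:
    """Split the generation history into consecutive chunks of `size`."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# Toy generation history: two tight clusters that share no bits.
history = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {7, 8, 10}]
chunks = chunked(history, 2)
intra = [intra_chunk_similarity(c) for c in chunks]
inter = inter_chunk_similarity(chunks[0], chunks[1])
print(intra, inter)
```

In this toy history each chunk is internally similar while adjacent chunks share nothing, which is the qualitative pattern the captions describe as "hop-and-locally-explore": high intra-chunk similarity combined with lower inter-chunk similarity.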