Genetic-guided GFlowNets for Sample Efficient Molecular Optimization

Hyeonah Kim, Minsu Kim, Sanghyeok Choi, Jinkyoo Park

TL;DR

This work tackles sample-efficient de novo molecular optimization, where reward evaluations are expensive. It introduces Genetic GFN, which distills the domain knowledge encoded in genetic algorithms into a deep generative policy via off-policy GFlowNet training, enhanced by unsupervised pretraining and a KL-regularized trajectory balance (TB) objective. The method combines SMILES-based policy generation with graph-based genetic search to produce diverse, high-reward molecules, and it updates the policy using trajectory balance with rank-based replay. Empirically, Genetic GFN achieves state-of-the-art performance on the Practical Molecular Optimization benchmark and significantly reduces the number of reward calls in SARS-CoV-2 inhibitor design, an improvement in sample efficiency that matters for drug discovery and materials design.

Abstract

Discovering new molecules with desired properties is crucial in domains like drug discovery and material design. Recent deep learning-based generative methods have shown promise but suffer from poor sample efficiency, because evaluating the reward function is computationally expensive. This paper proposes a novel algorithm for sample-efficient molecular optimization that distills a powerful genetic algorithm into a deep generative policy using GFlowNet training, an off-policy method for amortized inference. This approach enables the deep generative policy to learn from the domain knowledge that has been explicitly integrated into the genetic algorithm. Our method achieves state-of-the-art performance on the official molecular optimization benchmark, significantly outperforming previous methods. It also demonstrates effectiveness in designing inhibitors against SARS-CoV-2 with substantially fewer reward calls.
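
The two training ingredients named in the summary above, trajectory balance and rank-based replay, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the `k` smoothing constant, the `1 / (k * n + rank)` weighting, and the single-sample form of the KL penalty are all assumptions made for illustration. The sketch also assumes that a SMILES string is generated token by token along a single path, so the backward-policy term of trajectory balance vanishes.

```python
import math
import random


def tb_loss(log_z, token_log_probs, log_reward,
            prior_token_log_probs=None, kl_weight=0.0):
    """Squared trajectory balance residual for one autoregressive sample.

    Assumption: one generation path per sequence, so log P_B = 0.
    The optional penalty is a hypothetical single-sample estimate of the
    KL divergence between the policy and a pretrained prior.
    """
    log_pf = sum(token_log_probs)
    loss = (log_z + log_pf - log_reward) ** 2
    if prior_token_log_probs is not None:
        loss += kl_weight * (log_pf - sum(prior_token_log_probs))
    return loss


def rank_based_weights(scores, k=0.01):
    """Normalized sampling weights that favor higher-scoring entries.

    Assumed form: weight proportional to 1 / (k * n + rank), where rank 0
    is the best score. Requires k > 0 and a non-empty score list.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    weights = [0.0] * n
    for rank, idx in enumerate(order):
        weights[idx] = 1.0 / (k * n + rank)
    total = sum(weights)
    return [w / total for w in weights]


def sample_replay(buffer, scores, batch_size, k=0.01, rng=random):
    """Draw a replay batch with rank-based probabilities (with replacement)."""
    return rng.choices(buffer, weights=rank_based_weights(scores, k),
                       k=batch_size)
```

Under these assumptions, `sample_replay(["CCO", "c1ccccc1", "CC(=O)O"], [0.2, 0.9, 0.5], batch_size=2)` draws mostly from the higher-scoring strings, and `tb_loss` is zero exactly when `log_z` plus the summed token log-probabilities equals `log_reward`.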

Paper Structure

This paper contains 56 sections, 6 equations, 15 figures, 20 tables, and 2 algorithms.

Figures (15)

  • Figure 1: Overview of Genetic GFN. Our generative policy is trained to sample molecules proportional to rewards, and the genetic search refines them to higher-reward samples.
  • Figure 2: Optimization curves of the average Top-10 score over the number of score function calls. All optimization curves for the 23 oracles are provided in the appendix.
  • Figure 3: Average of Top-10 score and diversity. Note that the fragment-based GFlowNet achieves 10.957 with a diversity of 0.816.
  • Figure 4: The final candidates for the PLPr_7JIR target with 100 steps.
  • Figure 5: The final candidates for the RdRp_6YYT target with 100 steps.
  • ...and 10 more figures