BetterBodies: Reinforcement Learning guided Diffusion for Antibody Sequence Design

Yannick Vogt; Mehdi Naouar; Maria Kalweit; Christoph Cornelius Miething; Justus Duyster; Joschka Boedecker; Gabriel Kalweit

TL;DR

BetterBodies, a novel method that combines Variational Autoencoders (VAEs) with RL-guided latent diffusion, generates novel sets of antibody CDRH3 sequences from different data distributions and demonstrates improved affinity of these sequences to the SARS-CoV spike receptor-binding domain.

Abstract

Antibodies offer great potential for the treatment of various diseases. However, the discovery of therapeutic antibodies through traditional wet lab methods is expensive and time-consuming. The use of generative models in designing antibodies therefore holds great promise, as it can reduce the time and resources required. Recently, diffusion models have gained considerable traction for their ability to synthesize diverse and high-quality samples. In their basic form, however, they lack mechanisms to optimize for specific properties, such as binding affinity to an antigen. In contrast, offline Reinforcement Learning (RL) methods have demonstrated strong performance in navigating large search spaces, including scenarios where frequent real-world interaction, such as interaction with a wet lab, is impractical. Our novel method, BetterBodies, which combines Variational Autoencoders (VAEs) with RL-guided latent diffusion, is able to generate novel sets of antibody CDRH3 sequences from different data distributions. Using the Absolut! simulator, we demonstrate the improved affinity of our novel sequences to the SARS-CoV spike receptor-binding domain. Furthermore, we reflect biophysical properties in the VAE latent space using a contrastive loss and add a novel Q-function-based filtering step to enhance the affinity of generated sequences. In conclusion, methods such as ours could have great implications for real-world biological sequence design, where the generation of novel high-affinity binders is a cost-intensive endeavor.

Paper Structure (23 sections, 6 equations, 4 figures, 2 tables)

Introduction
Background
Antibody Sequence Design
Variational Autoencoders
Diffusion Models
Reinforcement Learning
Related Work
BetterBodies
Continuous Amino Acid Representations and Encoding Biophysical Properties
Guiding Diffusion Policies using Reinforcement Learning
Filtering generated Sequences
Experiment Setup
Evaluation Metrics
Training Datasets
Results
...and 8 more sections

Figures (4)

Figure 1: Overview of our method on a fictional sequence of length 4. (1) A given dataset comprising sequence-affinity pairs is transformed into subsequences ($s$) and actions ($a$) that extend those sequences with additional amino acids, together with rewards representing the affinity of the full sequences. (2) Our method utilizes a VAE to encode amino acids into a two-dimensional latent space. The diffusion policy $\pi$ is trained to generate a latent vector $a_\pi$ given an incomplete amino acid sequence $s$. We balance the policy between generating latent vectors with high likelihood given the training dataset $D$ and latent vectors that maximize a learned Q-function, which predicts sequence affinity to a given antigen. (3) By repeating the generative process, amino acids are iteratively concatenated to generate a sequence. In each timestep $t$ the policy $\pi$ generates a latent vector $a_t$ given $s_t$. Subsequently, the VAE decodes the latent vector into an amino acid, which is then concatenated to $s_t$ to generate $s_{t+1}$.
Figure 2: The effect of various $\eta$ settings on the basic diffusion loss $L_{BC}$ (top left), the free energy evaluated during training (top right), the free energy distribution of generated unique novel sequences (bottom left), and the Diversity and Novelty of generated sequences (bottom right). Distributions of generated sequences are plotted as a running average over three bins.
Figure 3: Free energy distributions of unique training dataset sequences and generated sequences. The random (left), natural (middle), and expert (right) datasets are visualized as histograms. Sequences generated using BetterBodies with $\eta=24$ and its F(iltering) and C(ontrastive) versions are plotted as a running average over three bins. Data is visualized as the mean over five seeds.
Figure 4: Latent space encoding amino acids, without regularization (left) and with contrastive loss regularization (right). Amino acid groups are indicated by the coloring and the space occupied by their samples. The underlying heatmap displays the average Q-value over 1000 sequence-action pairs.
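
Step (1) of Figure 1, the conversion of sequence-affinity pairs into subsequences, actions, and rewards, can be illustrated with a minimal sketch. Several details here are assumptions rather than the paper's implementation: the function name `sequence_to_transitions`, the empty string as the initial state, the reward being assigned only on the final step (zero elsewhere), the `done` flag, and the example sequence and affinity value are all hypothetical.

```python
from typing import List, Tuple

# (subsequence s, action a, reward, done); this layout is an assumption.
Transition = Tuple[str, str, float, bool]


def sequence_to_transitions(sequence: str, affinity: float) -> List[Transition]:
    """Split one sequence-affinity pair into per-residue transitions.

    Each state s is a prefix of the sequence and each action a is the
    amino acid that extends it. Assumption: the affinity of the full
    sequence is given as reward only on the last step.
    """
    transitions: List[Transition] = []
    for t, amino_acid in enumerate(sequence):
        s = sequence[:t]                    # incomplete sequence so far
        done = t == len(sequence) - 1       # last residue completes the sequence
        reward = affinity if done else 0.0  # sparse terminal reward (assumed)
        transitions.append((s, amino_acid, reward, done))
    return transitions


# Fictional sequence of length 4 with a made-up affinity value.
transitions = sequence_to_transitions("CARD", 1.0)
for s, a, r, done in transitions:
    print(repr(s), a, r, done)
```

A real pipeline would additionally map each action to its latent vector before training the policy; that step is omitted here because it depends on the trained VAE.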
