Conformational Rank Conditioned Committees for Machine Learning-Assisted Directed Evolution

Mia Adler, Carrie Liang, Brian Peng, Oleg Presnyakov, Justin M. Baker, Jannelle Lauffer, Himani Sharma, Barry Merriman

TL;DR

This work tackles antibody design under conformational uncertainty by introducing rank-conditioned committees (RCC) that assign a dedicated model ensemble to each predicted conformational rank. By decoupling epistemic uncertainty (within-rank) from conformational uncertainty (between ranks), RCC-MLDE provides a principled acquisition function that balances exploration and exploitation while down-weighting pose-driven uncertainty. The approach is validated on SARS-CoV-2 antibody docking, where RCC-MLDE improves mean docking scores and yields more robust candidate sets than baseline MLDE and bioinformatics strategies. The framework leverages ImmuneBuilder for conformations, AbMAP embeddings for sequence representation, and HADDOCK3 for docking, presenting a scalable path toward rapid, uncertainty-aware therapeutic antibody discovery. Future work includes more detailed binding simulations and experimental validation to fully establish real-world utility.

Abstract

Machine learning-assisted directed evolution (MLDE) is a powerful tool for efficiently navigating antibody fitness landscapes. Many structure-aware MLDE pipelines rely on a single conformation or a single committee across all conformations, limiting their ability to separate conformational uncertainty from epistemic uncertainty. Here, we introduce a rank-conditioned committee (RCC) framework that leverages ranked conformations to assign a deep neural network committee per rank. This design enables a principled separation between epistemic uncertainty and conformational uncertainty. We validate our RCC-MLDE approach on SARS-CoV-2 antibody docking, demonstrating significant improvements over baseline strategies. Our results offer a scalable route for therapeutic antibody discovery while directly addressing the challenge of modeling conformational uncertainty.
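The separation described above can be illustrated with a small sketch. It is not the paper's implementation: it assumes that each conformational rank has its own committee, that within-rank variance is read as epistemic uncertainty and between-rank variance as conformational uncertainty, and that the acquisition is a UCB-style score. The names `decompose_uncertainty` and `acquisition`, the weights `beta` and `lam`, and the convention that higher scores are better are all hypothetical choices made for illustration.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(preds):
    """Split predictive variance for one candidate sequence.

    preds[r][m] is the score predicted by member m of the committee
    assigned to conformational rank r (hypothetical data layout).
    """
    rank_means = [fmean(committee) for committee in preds]
    # Within-rank (epistemic): average disagreement among members of a committee.
    epistemic = fmean([pvariance(committee) for committee in preds])
    # Between-rank (conformational): disagreement among the committee means.
    conformational = pvariance(rank_means)
    return fmean(rank_means), epistemic, conformational


def acquisition(preds, beta=1.0, lam=0.5):
    """Illustrative UCB-style score (assumed form, higher is better).

    beta rewards epistemic uncertainty (exploration); lam penalizes
    conformational uncertainty (pose-driven spread).
    """
    mean, epistemic, conformational = decompose_uncertainty(preds)
    return mean + beta * epistemic ** 0.5 - lam * conformational ** 0.5


# Two ranks, two committee members each (made-up numbers).
example = [[1.0, 3.0], [5.0, 7.0]]
mean, epistemic, conformational = decompose_uncertainty(example)
score = acquisition(example)
```

When every committee has the same number of members, the two terms sum to the total variance over all predictions (the law of total variance), which is why they can be weighted separately. If lower docking scores indicate better binding, negate the predictions before scoring.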

Paper Structure

This paper contains 37 sections, 15 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: The left panel depicts the residue-wise confidence scores in AlphaFold2 and ImmuneBuilder. The middle panel shows five aligned antibody conformations predicted by AlphaFold2 and ImmuneBuilder, with the colored CDR-H3 regions circled in red. The right panel presents wet-lab results of the best-performing antibodies in three consecutive rounds of DE, normalized to the parent.
  • Figure 2: Comparison of results for (a) bioinformatics-based DE, (b) MLDE with a DNN ensemble, (c) RCC-MLDE with XGBoost, and (d) RCC-MLDE with a DNN ensemble. Each histogram shows the evaluation of the initial dataset (grey) alongside 200 generated antibodies, divided into two learning batches of 100 variants. The mean and variance of each batch are reported.
  • Figure 3: Yellow Figure: General pipeline of the ML-assisted Directed Evolution (MLDE) framework, starting with sequence embeddings and leading to population updates via Acquisition Maximization. Green Figure: Acquisition Maximization algorithm showing sequence mutations, biological feasibility tests, and sorting by the Acquisition Function to update the population. Red Figure: Predictive modeling workflow, including PCA-based dimensionality reduction and ensemble modeling using DNN for sequence evaluation. Purple Figure: Novel Conformation Rank Committee for acquisition calculation, using ensemble models to predict docking scores and calculate statistical metrics for sequence poses.
  • Figure 4: Regression plots for mutant scores against main parent scores in both pipelines. Point mutants (light blue) and cross mutants (purple) are shown against main parent scores. Dashed lines: mean of initial population (dark blue) and identity line (grey). Regression results are summarized in Tables \ref{tab:pipeline-a} and \ref{tab:pipeline-b}. In both pipelines, slopes for point mutants exceed those for cross mutants, indicating more moderate impact of point mutations.
  • Figure 5: Sidechain genealogy and the resulting parent chain.