Why risk matters for protein binder design

Tudor-Stefan Cotet, Igor Krawczuk

TL;DR

This work addresses risk in Bayesian optimization for protein binder design by simulating campaigns across $72$ model combinations on $11$ binding landscapes. It evaluates metrics such as the final fitness and the conditional value at risk $\mathrm{CVaR}_{0.1}$, as well as campaign costs under a fixed budget. The approach reveals Pareto-optimal models along the risk-performance axis and shows that landscape properties such as epistasis are closely tied to both average and worst-case performance: final fitness correlates with non-magnitude epistasis at $r \approx -0.52$, CVaR correlates with epistasis at $r \approx -0.55$, and cost correlates with magnitude epistasis at $r \approx 0.45$. Bootstrap analyses indicate that risk-aware ranking does not consistently reduce campaign costs, owing to inherent stochasticity and the limited number of seeds, which points to the need for more seeds or cross-validation. Overall, the study underscores the practical importance of risk-aware benchmarking in protein engineering and identifies landscape complexity, particularly epistasis, as a key determinant of optimization outcomes.
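To make the risk metric concrete, here is a minimal sketch of a lower-tail CVaR at level 0.1. It assumes that higher fitness is better and that the tail is taken over the final fitness of repeated seeds of one model on one landscape; the excerpt does not give the paper's exact definition, so the function name `cvar_lower_tail`, the tail-size rule, and the sample values are illustrative assumptions.

```python
import math

def cvar_lower_tail(values, alpha=0.1):
    """Mean of the worst ceil(alpha * n) outcomes (lower value = worse).

    Assumption: this empirical tail average stands in for CVaR_alpha;
    the paper may use a different estimator.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(values)
    # Small epsilon guards against float noise such as 0.1 * 30 = 3.0000000000000004.
    k = max(1, math.ceil(alpha * len(ordered) - 1e-9))
    tail = ordered[:k]
    return sum(tail) / k

# Hypothetical final-fitness values from 10 seeds (not data from the paper).
final_fitness = [0.91, 0.88, 0.35, 0.90, 0.87, 0.93, 0.89, 0.42, 0.92, 0.86]
mean_fitness = sum(final_fitness) / len(final_fitness)
risk = cvar_lower_tail(final_fitness, alpha=0.1)  # averages the single worst seed
```

In this toy sample the mean looks healthy while the tail average is dominated by one failed seed, which is the kind of gap between average and worst-case performance that a risk-aware ranking is meant to expose.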

Abstract

Bayesian optimization (BO) has recently become more prevalent in protein engineering applications and hence has become a fruitful target of benchmarks. However, current BO comparisons often overlook real-world considerations like risk and cost constraints. In this work, we compare 72 model combinations of encodings, surrogate models, and acquisition functions on 11 protein binder fitness landscapes, specifically from this perspective. Drawing from the portfolio optimization literature, we adopt metrics to quantify the cold-start performance relative to a random baseline, to assess the risk of an optimization campaign, and to calculate the overall budget required to reach a fitness threshold. Our results suggest that Pareto-optimal models exist on the risk-performance axis, that the preferred model shifts depending on the landscape explored, and that landscape properties such as epistasis correlate robustly with both the average and the worst-case model performance. They also highlight that rigorous model selection requires substantial computational and statistical effort.
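As an illustration of what Pareto-optimality on the risk-performance axis means, the sketch below keeps every model that no other model beats on both average performance and worst-case performance. It assumes both quantities are to be maximized; the function name `pareto_front`, the model names, and the numbers are hypothetical and are not taken from the paper.

```python
def pareto_front(models):
    """Return names of models not dominated on (mean, cvar), both maximized.

    A model is dominated if another model is at least as good on both
    axes and strictly better on at least one.
    """
    front = []
    for name, (mean_a, cvar_a) in models.items():
        dominated = any(
            (mean_b >= mean_a and cvar_b >= cvar_a)
            and (mean_b > mean_a or cvar_b > cvar_a)
            for other, (mean_b, cvar_b) in models.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front

# Hypothetical (mean final fitness, CVaR) pairs; placeholder names and values.
models = {
    "model_a": (0.90, 0.40),  # best on average, weak worst case
    "model_b": (0.85, 0.70),  # slightly lower average, strong worst case
    "model_c": (0.80, 0.60),  # dominated by model_b
    "model_d": (0.88, 0.40),  # dominated by model_a
}
print(pareto_front(models))  # ['model_a', 'model_b']
```

Choosing between the two surviving models is then a question of risk appetite rather than of accuracy alone, which is why a ranking by average performance can hide a relevant alternative.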

Paper Structure

This paper contains 28 sections, 5 equations, 19 figures, 7 tables.

Figures (19)

  • Figure 1: Our additions for protein optimization benchmarking: we consider risks, costs, and performances relative to a random baseline.
  • Figure 2: Overview of the standard protein BO loop we are benchmarking.
  • Figure 3: Metrics for the top 10 models ranked by average final fitness for GB1: final fitness reached, $\Delta G\,\mathrm{AUC}$, cost to reach the 99th percentile of fitness, and number of acquired sequences above the 99th-percentile threshold.
  • Figure 4: Pareto frontier on the performance-risk axes for the final fitness reached and for the $\Delta G\,\mathrm{AUC}$ metric.
  • Figure 5: Kendall $\tau$ correlation coefficients between the average model metrics and landscape properties. Significance markers: *** denotes a p-value < 0.001, ** a p-value < 0.01, and * a p-value < 0.05.
  • ...and 14 more figures