Tree Search-Based Evolutionary Bandits for Protein Sequence Optimization

Jiahao Qiu; Hui Yuan; Jinghong Zhang; Wentao Chen; Huazheng Wang; Mengdi Wang

TL;DR

The paper tackles efficient exploration of expansive protein sequence spaces by weaving tree search with bandit-guided learning under a Gaussian Process prior. It formulates the design problem in an embedding space, introduces a meta-algorithm that grows a mutational tree and uses bandit strategies to navigate exploration, and provides a Bayesian regret analysis showing favorable scaling with the information gain. The full algorithm supports various tree-search heuristics and bandit models, and empirical tests on AAV, TEM, and AAYL49 landscapes with simulated oracles demonstrate superior sample efficiency and the ability to reach near-optimal designs with a small mutation count. Together, these contributions offer a theoretically grounded and practically effective approach to directed protein evolution with controlled exploration.

Abstract

While modern biotechnologies allow synthesizing new proteins and function measurements at scale, efficiently exploring a protein sequence space and engineering it remains a daunting task due to the vast sequence space of any given protein. Protein engineering is typically conducted through an iterative process of adding mutations to the wild-type or lead sequences, recombination of mutations, and running new rounds of screening. To enhance the efficiency of such a process, we propose a tree search-based bandit learning method, which expands a tree starting from the initial sequence with the guidance of a bandit machine learning model. Under simplified assumptions and a Gaussian Process prior, we provide theoretical analysis and a Bayesian regret bound, demonstrating that the combination of local search and bandit learning method can efficiently discover a near-optimal design. The full algorithm is compatible with a suite of randomized tree search heuristics, machine learning models, pre-trained embeddings, and bandit techniques. We test various instances of the algorithm across benchmark protein datasets using simulated screens. Experiment results demonstrate that the algorithm is both sample-efficient and able to find top designs using reasonably small mutation counts.
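The loop described in the abstract (expand a tree from the initial sequence, score candidates with a bandit model, screen the top batch, repeat) can be sketched as follows. This is a minimal illustration under loud assumptions, not the authors' implementation: the four-letter alphabet, the `WILD_TYPE` and `TARGET` sequences, the toy `oracle`, and the count-based `AdditiveUCB` surrogate (a crude stand-in for a GP posterior) are all hypothetical.

```python
import math
import random

ALPHABET = "ACDE"      # toy alphabet (assumption; real proteins use 20 residues)
WILD_TYPE = "AAAAAA"   # hypothetical root sequence
TARGET = "CADAEA"      # hidden optimum, known only to the toy oracle


def oracle(seq, rng, noise=0.1):
    """Simulated noisy screen: fraction of positions matching TARGET plus noise."""
    matches = sum(a == b for a, b in zip(seq, TARGET))
    return matches / len(TARGET) + rng.gauss(0.0, noise)


def children(seq):
    """All sequences at edit (Hamming) distance 1 from seq."""
    return [seq[:i] + new + seq[i + 1:]
            for i, old in enumerate(seq)
            for new in ALPHABET if new != old]


class AdditiveUCB:
    """Running mean per (position, residue) feature plus a count-based bonus.

    Assumption: this replaces the GP posterior mean and width with a much
    cruder estimate, purely to keep the sketch dependency-free.
    """

    def __init__(self, beta=1.0):
        self.beta = beta
        self.total = {}
        self.count = {}

    def update(self, seq, y):
        for key in enumerate(seq):
            self.total[key] = self.total.get(key, 0.0) + y
            self.count[key] = self.count.get(key, 0) + 1

    def score(self, seq):
        mean = bonus = 0.0
        for key in enumerate(seq):
            n = self.count.get(key, 0)
            mean += self.total.get(key, 0.0) / n if n else 0.0
            bonus += 1.0 / math.sqrt(n + 1)  # rarely seen features get a larger bonus
        return (mean + self.beta * bonus) / len(seq)


def tree_search(rounds=15, batch=8, keep=4, seed=0):
    rng = random.Random(seed)
    model = AdditiveUCB()
    measured = {}

    def screen(seq):
        y = oracle(seq, rng)
        measured[seq] = y
        model.update(seq, y)

    screen(WILD_TYPE)
    parents = [WILD_TYPE]
    for _ in range(rounds):
        # Expand: unscreened children of the current parents (sorted for determinism).
        cands = sorted({c for p in parents for c in children(p) if c not in measured})
        if not cands:
            break
        # Select and screen the top-`batch` candidates by optimistic score.
        for seq in sorted(cands, key=model.score, reverse=True)[:batch]:
            screen(seq)
        # Keep the best observed sequences as parents for the next layer.
        parents = sorted(measured, key=measured.get, reverse=True)[:keep]
    return max(measured, key=measured.get), measured


best, measured = tree_search()
print(best, round(measured[best], 3), len(measured))
```

Because every new candidate is one substitution away from an already screened parent, the number of mutations relative to the root grows by at most one per round, which is the kind of controlled exploration the abstract refers to.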

Paper Structure (63 sections, 5 theorems, 43 equations, 13 figures, 4 tables, 3 algorithms)

Introduction
Practical considerations in protein screens and a tree search view
Our approach
Results
Related Work
Protein engineering.
Search algorithms for protein sequence design.
Bandit learning.
Method
Problem Formulation
Meta Algorithm
Analysis of simple tree search
Bayesian regret
Simple tree search
Regret Theory under GP Fitness
...and 48 more sections

Key Result

Theorem 3.6

Under Assumptions asmp:gp_f, asmp:noisy_fdb, asmp:ft_cvx and Condition cond:local_argmax, Alg. alg:meta updates $f_t$ for $O \left( \gamma_T \right)$ times and its Bayesian regret is bounded by where $\beta_T = O\left( \mathbb{E} \left[ \|f^{\star}\|_{k} \right]+\sqrt{d \ln T} + \sqrt{\gamma_{T-1}}\right)$. Here $\gamma_T$ is the information gain, $r$ is inherited from Condition cond:local_argmax and $N$ is t
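
For context, the information gain $\gamma_T$ that appears in the theorem is usually defined in the GP bandit literature as below. This is the standard textbook form, stated under the assumption of Gaussian observation noise with variance $\sigma^2$ over a design space $\mathcal{X}$; the paper's own Definition G.2 may use different notation. Here $K_A$ denotes the kernel matrix of the points in $A$, and $y_A$, $f_A$ are the noisy observations and function values at those points.

$$
\gamma_T \;=\; \max_{A \subset \mathcal{X},\ |A| = T} I\left(y_A; f_A\right) \;=\; \max_{A \subset \mathcal{X},\ |A| = T} \frac{1}{2} \log \det\left(I + \sigma^{-2} K_A\right)
$$

For a linear kernel on a $d$-dimensional embedding, this quantity is known to grow as $O(d \log T)$, which is why regret bounds expressed through $\gamma_T$ can remain sublinear in $T$.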

Figures (13)

Figure 1: A tree visualization of the AAV screen dataset (Bryant2021), generated by starting from the wild-type and building the tree via downsampling children with an editing distance of 1 from the parent. The wet-lab screen initiates with a wild-type design sequence (root node), and in each round new sequences are generated by adding randomization and keeping those with high fitness scores as parents. It was believed that nodes with high fitness are more likely to generate high-fitness children.
Figure 2: Visualization of the tree search process of Algorithm \ref{alg:meta} using the AAV oracle. The search starts from the wild-type sequence (the root node). In each round, we choose 100 sequences according to UCB and TS scores from candidates generated from the last round by single mutation, which leads to a node in the next layer, and by recombination of sequences (shown by blue edges), which leads to a jump to a layer with more mutations. A path to the optimal sequence is shown by the bold line.
Figure 3: Diagram of the meta algorithm (Alg. \ref{alg:meta}).
Figure 4: A demonstration of mutation and recombination.
Figure 5: Learning curves of algorithms with comparison to baselines, tested over three datasets.
...and 8 more figures
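
The two operations named in the captions of Figures 2 and 4, mutation and recombination, can be sketched as below. This is an illustrative guess rather than the paper's exact operators: single-site substitution and uniform crossover are assumptions, and the helper names `mutate`, `recombine`, and `hamming` are hypothetical.

```python
import random


def mutate(seq, alphabet, rng):
    """Single-site substitution: the result is at Hamming distance exactly 1."""
    i = rng.randrange(len(seq))
    choices = [a for a in alphabet if a != seq[i]]
    return seq[:i] + rng.choice(choices) + seq[i + 1:]


def recombine(a, b, rng):
    """Uniform crossover: each position is copied from one of two equal-length parents."""
    if len(a) != len(b):
        raise ValueError("parents must have equal length")
    return "".join(rng.choice(pair) for pair in zip(a, b))


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


rng = random.Random(0)
wild_type = "AAAA"
single = mutate(wild_type, "ACDE", rng)
child = recombine("CAAA", "AADA", rng)  # two single mutants at different sites
print(single, hamming(single, wild_type))
print(child, hamming(child, wild_type))
```

Recombining two single mutants at different sites can produce a double mutant in one step, which matches the caption's description of recombination as a jump to a layer with more mutations.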

Theorems & Definitions (11)

Definition 3.1
Theorem 3.6
Definition G.1: Posterior $\mathcal{G P}(m_t, k_t)$
Definition G.2: Maximum Information Gain
Proposition H.1
proof
Lemma H.2
Lemma H.3
proof
Lemma I.1
...and 1 more
