Model-based reinforcement learning for protein backbone design
Frederic Renard, Cyprien Courtot, Alfredo Reichlin, Oliver Bent
TL;DR
The paper tackles inverse design of protein backbones that meet predefined icosahedral shapes and structural-score thresholds. It applies AlphaZero, a model-based reinforcement learning approach with Monte Carlo Tree Search, to sequentially assemble backbones using helices and loops, comparing a sigmoid reward with a novel threshold-based reward and introducing side-objectives to regularize learning. The key contributions are: (i) demonstrating superior performance of AlphaZero over the prior MCTS baseline, (ii) showing that the threshold reward yields better learning than the sigmoid formulation, and (iii) demonstrating that adding side-objectives further enhances multi-objective design quality. This work paves the way for scalable, traceable multi-objective protein backbone design and suggests avenues for extending to other shapes, sequence design, and structure validation with predictive tools like AlphaFold.
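To make the contrast between the two reward formulations concrete, here is a minimal sketch of a sigmoid reward next to a threshold-based reward for a single structural score. This summary does not give the paper's formulas, so everything in the sketch is an illustrative assumption: the function names, the `steepness` parameter, the convention that lower scores are better, and the 0/1 reward levels.

```python
import math


def sigmoid_reward(score: float, target: float, steepness: float = 1.0) -> float:
    """Smooth reward in (0, 1); equals 0.5 when score == target.

    Assumption: lower scores are better, so the reward rises as the
    score drops below the target.
    """
    # Clamp the exponent so math.exp cannot overflow on extreme scores.
    z = max(-60.0, min(60.0, steepness * (score - target)))
    return 1.0 / (1.0 + math.exp(z))


def threshold_reward(score: float, target: float) -> float:
    """Binary reward: 1.0 once the score meets the target, else 0.0.

    Assumption: "meets" means score <= target (lower is better).
    """
    return 1.0 if score <= target else 0.0


if __name__ == "__main__":
    for s in (0.5, 2.0, 3.5):
        print(s, round(sigmoid_reward(s, target=2.0), 3), threshold_reward(s, target=2.0))
```

Under these assumptions, the sigmoid keeps paying out more as the score improves past the target, whereas the threshold version gives no extra credit once the requirement is satisfied.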
Abstract
Designing protein nanomaterials of predefined shape and characteristics has the potential to dramatically impact the medical industry. Machine learning (ML) has proven successful in protein design, reducing the need for expensive rounds of wet-lab experiments. However, challenges persist in efficiently exploring protein fitness landscapes to identify optimal protein designs. In response, we propose the use of AlphaZero to generate protein backbones that meet shape and structural scoring requirements. We extend an existing Monte Carlo tree search (MCTS) framework by incorporating a novel threshold-based reward and secondary objectives to improve design precision. This innovation considerably outperforms existing approaches, leading to protein backbones that better respect structural scores. The application of AlphaZero is novel in the context of protein backbone design and demonstrates promising performance. AlphaZero consistently surpasses baseline MCTS by more than 100% in top-down protein design tasks. Additionally, our application of AlphaZero with secondary objectives uncovers further promising outcomes, indicating the potential of model-based reinforcement learning (RL) in navigating the intricate and nuanced aspects of protein design.
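For readers unfamiliar with how AlphaZero differs from plain MCTS, the sketch below shows the PUCT selection rule from the original AlphaZero work, in which a learned policy prior steers which child of a search node is explored next. It is a generic illustration, not the authors' implementation: the function name, the list-based node statistics, and the value of `c_puct` are assumptions, and in this setting each action index would stand for a candidate helix or loop fragment.

```python
import math


def puct_select(priors, visit_counts, value_sums, c_puct=1.5):
    """Return the index of the child maximizing Q + U (PUCT).

    priors[a]       : policy-network prior P(s, a)
    visit_counts[a] : visit count N(s, a)
    value_sums[a]   : sum of backed-up values, so Q = value_sums[a] / N(s, a)
    c_puct          : exploration constant (illustrative value)
    """
    total_visits = sum(visit_counts)
    best_action, best_score = 0, -math.inf
    for a, prior in enumerate(priors):
        n = visit_counts[a]
        q = value_sums[a] / n if n > 0 else 0.0  # unvisited children default to Q = 0
        u = c_puct * prior * math.sqrt(total_visits) / (1 + n)
        if q + u > best_score:
            best_action, best_score = a, q + u
    return best_action


if __name__ == "__main__":
    # The unvisited second child gets a large exploration bonus and is chosen.
    print(puct_select([0.5, 0.5], [10, 0], [2.0, 0.0]))
```

The exploration term shrinks as a child accumulates visits, so the search gradually shifts from following the prior to following the estimated values.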
