One protein is all you need

Anton Bushuiev, Roman Bushuiev, Olga Pimenova, Nikola Zadorozhny, Raman Samusevich, Elisabet Manaskova, Rachel Seongeun Kim, Hannes Stärk, Jiri Sedlar, Martin Steinegger, Tomáš Pluskal, Josef Sivic

TL;DR

ProteinTTT introduces per-protein customization by test-time self-supervised adaptation of the backbone in a Y-shaped protein language-model setup, targeting a single sequence $x$ to minimize perplexity while leaving the downstream head unchanged. By optimizing $f$ via masked language modeling on $x$, and selecting the best $\theta_x$ through a confidence signal (e.g., $c = \text{pLDDT}$), ProteinTTT yields improved downstream predictions across structure, fitness, and function tasks, using SGD with LoRA to scale to large models. Empirically, ProteinTTT delivers consistent gains across models and datasets, achieving new state-of-the-art results in protein fitness prediction (ProteinGym) and enhancing difficult cases in antibody–antigen loop modeling and the Big Fantastic Virus Database. The approach is practical, data-efficient (no extra data required), and extensible (MSA customization, various heads), offering a versatile tool for researchers focusing on single proteins or specific protein families.
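
The checkpoint-selection idea in this summary (take several self-supervised steps on one sequence, then keep the parameters with the highest confidence) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper's code: `customize`, `train_step`, and `confidence` are placeholder names, the toy "parameters" are a single float, and treating the unadapted parameters (step 0) as a candidate is an assumption. In a real setting `train_step` would be one masked-language-modeling update on $x$ and `confidence` would be a signal such as pLDDT.

```python
from typing import Callable, Tuple, TypeVar

Params = TypeVar("Params")


def customize(
    theta_0: Params,
    train_step: Callable[[Params], Params],
    confidence: Callable[[Params], float],
    num_steps: int,
) -> Tuple[Params, int, float]:
    """Apply num_steps updates and return the most confident parameters seen.

    Returns (best parameters, step index at which they occurred, confidence).
    Step 0 denotes the unadapted parameters theta_0 (an assumption of this sketch).
    """
    best_theta, best_step, best_conf = theta_0, 0, confidence(theta_0)
    theta = theta_0
    for step in range(1, num_steps + 1):
        theta = train_step(theta)   # hypothetical: one self-supervised update on x
        conf = confidence(theta)    # hypothetical: e.g. mean pLDDT of the prediction
        if conf > best_conf:
            best_theta, best_step, best_conf = theta, step, conf
    return best_theta, best_step, best_conf


# Toy stand-ins so the sketch runs without any model:
# the "parameters" are one float, and confidence peaks at theta == 7.
def toy_train_step(theta: float) -> float:
    return theta + 1.0


def toy_confidence(theta: float) -> float:
    return -(theta - 7.0) ** 2


theta_x, chosen_step, chosen_conf = customize(
    0.0, toy_train_step, toy_confidence, num_steps=10
)
print(chosen_step, theta_x)
```

With real model weights, the candidate parameters would need to be copied rather than referenced, since in-place optimizer updates would otherwise overwrite the stored best checkpoint.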

Abstract

Generalization beyond training data remains a central challenge in machine learning for biology. A common way to enhance generalization is self-supervised pre-training on large datasets. However, aiming to perform well on all possible proteins can limit a model's capacity to excel on any specific one, whereas experimentalists typically need accurate predictions for individual proteins they study, often not covered in training data. To address this limitation, we propose a method that enables self-supervised customization of protein language models to one target protein at a time, on the fly, and without assuming any additional data. We show that our Protein Test-Time Training (ProteinTTT) method consistently enhances generalization across different models, their sizes, and datasets. ProteinTTT improves structure prediction for challenging targets, achieves new state-of-the-art results on protein fitness prediction, and enhances function prediction on two tasks. Through two challenging case studies, we also show that customization via ProteinTTT achieves more accurate antibody-antigen loop modeling and enhances 19% of structures in the Big Fantastic Virus Database, delivering improved predictions where general-purpose AlphaFold2 and ESMFold struggle.

Paper Structure

This paper contains 87 sections, 5 equations, 19 figures, 7 tables.

Figures (19)

  • Figure 1: Example of protein structure prediction after single-protein model customization via ProteinTTT. ESMFold poorly predicts the structure of the CASP14 target T1074 (white) because the underlying language model ESM2 poorly fits the sequence, as indicated by the high perplexity (left and Fig. 2E in lin2023evolutionary). Self-supervised test-time customization of ESM2 to the single sequence of T1074 reduces the perplexity, resulting in improved structure prediction (right).
  • Figure 2: Overview of protein language model (PLM) customization with ProteinTTT. (a) Given a protein sequence of interest $x$ and a pretrained PLM $f(\cdot; \theta_0)$, ProteinTTT yields a customized version of the PLM $f(\cdot; \theta_x)$ for that sequence. Customization is achieved by fine-tuning (fire icon) the pretrained parameters $\theta_0$ via masked language modeling solely on the input sequence for $T$ steps, selecting the optimal parameters $\theta_x$ using a confidence function $c$. This procedure adapts the model specifically to the input sequence, improving its internal representation as measured by model perplexity. (b) Once customized, the PLM can be used with pretrained task-specific heads, such as structure, fitness, or function prediction modules, $h_1$, $h_2$, and $h_3$, respectively, without modifying their parameters (snowflake icon). For example, the ESM2 PLM can be customized and then used with the pretrained ESMFold structure prediction head without modifying its 1.4-billion task-specific parameters, resulting in improved structure prediction for the given sequence (e.g., Figure 1).
  • Figure 3: Customization with ProteinTTT improves protein structure prediction by reducing protein sequence perplexity. ESMFold fails to predict the structure of chain B from PDB entry 7EBL in the CAMEO validation set, as shown at customization step 0, where the perplexity is high and the TM-score is low. By applying customization with ProteinTTT for the single target sequence, the model iteratively improves the structure prediction quality, as demonstrated by the increasing TM-score, associated with reduced perplexity. At customization step 7, the predicted structure achieves the highest TM-score, as well as the highest predicted confidence metric pLDDT, enabling the selection of this step as the final prediction by the customized ESMFold + ProteinTTT.
  • Figure 4: ProteinTTT improves modeling of antibody–antigen loops. (a) Average LDDT on the antibody complementarity-determining regions (CDRs, 175 structures) and antigens (814 structures) from the SAbDab dataset with ESMFold pLDDT < 70. Error bars indicate 95% confidence intervals estimated from 1000 bootstrap samples. (b) Example of improved structure prediction for CDRs in the 8K2W entry. The CDR regions H1, H2, and H3, i.e., the parts of the antibody that bind to the antigen, are highlighted with spheres, while black lines show the alignment error between the ground-truth CDR structure (white) and the predictions (colored).
  • Figure 5: ProteinTTT expands the Big Fantastic Virus Database (BFVD). (a) ProteinTTT (light green) substantially improves the performance of ESMFold (yellow) on viral proteins, yielding better structures (pink) for 19% of BFVD entries compared to the original predictions by AlphaFold2 (green). (b) Improvements in pLDDT for ESMFold after ProteinTTT correspond to improvements in LDDT, as benchmarked against BFVD AlphaFold2 structures with pLDDT $> 90$. (c) ProteinTTT provides the largest pLDDT improvements (y-axis) for the most out-of-distribution proteins, i.e., those with the smallest MSAs (left on the x-axis) from the Logan database. (d) Structural comparison for BFVD entry UPI000641889E against the PDB structure 2N2J (100% sequence identity) shows that ESMFold + ProteinTTT yields a prediction closest to the ground truth (gray), as also measured by LDDT. (e–g) Additional examples of high-quality viral structures (as measured by pLDDT) predicted with ESMFold + ProteinTTT but not with ESMFold or AlphaFold2. Higher pLDDT values are better.
  • ...and 14 more figures
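
The Figure 4(a) caption reports 95% confidence intervals estimated from 1000 bootstrap samples. One common way to obtain such intervals is the percentile bootstrap over per-structure scores, sketched below. This is an illustration only: the caption does not say which bootstrap variant was used, the function name `bootstrap_ci` is hypothetical, and the `scores` values are made-up placeholders rather than data from the paper.

```python
import random
from statistics import mean
from typing import List, Tuple


def bootstrap_ci(
    values: List[float],
    num_samples: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile-bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(num_samples))
    # Indices of the alpha/2 and 1 - alpha/2 percentiles, clamped to valid range.
    lo_idx = max(0, round(alpha / 2 * num_samples))
    hi_idx = min(num_samples - 1, round((1 - alpha / 2) * num_samples) - 1)
    return means[lo_idx], means[hi_idx]


# Placeholder per-structure scores (NOT results from the paper).
scores = [0.42, 0.55, 0.61, 0.48, 0.70, 0.66, 0.39, 0.58, 0.52, 0.63]
lower, upper = bootstrap_ci(scores)
print(f"mean={mean(scores):.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```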