Leveraging Multi-modal Representations to Predict Protein Melting Temperatures

Daiheng Zhang, Yan Zeng, Xinyu Hong, Jinbo Xu

TL;DR

The paper addresses the prediction of changes in protein melting temperature, $\Delta T_m$, using multimodal protein representations that integrate sequence and structure information. It introduces ESM3-DTm, a multimodal backbone built on ESM3 with structure inputs, and reports state-of-the-art performance on the s571 dataset (PCC $=0.50$, MAE $=5.21$, RMSE $=7.68$). Through extensive ablations, the study shows that combining sequence and structure with appropriate regression heads and fine-tuning strategies yields better predictions than sequence-only baselines and other backbones. The work highlights the value of 1) multimodal backbones (ESM3, OpenFold), 2) structure-aware embeddings, and 3) end-to-end fine-tuning for accurate $\Delta T_m$ prediction, with implications for protein stability engineering and design.

Abstract

Accurately predicting protein melting temperature changes ($\Delta T_m$) is fundamental for assessing protein stability and guiding protein engineering. Leveraging multi-modal protein representations has shown great promise in capturing the complex relationships among protein sequences, structures, and functions. In this study, we develop models based on powerful protein language models, including ESM-2, ESM-3 and AlphaFold, using various feature extraction methods to enhance prediction accuracy. By utilizing the ESM-3 model, we achieve a new state-of-the-art performance on the s571 test dataset, obtaining a Pearson correlation coefficient (PCC) of 0.50. Furthermore, we conduct a fair evaluation to compare the performance of different protein language models in the $\Delta T_m$ prediction task. Our results demonstrate that integrating multi-modal protein representations could advance the prediction of protein melting temperatures.
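
For readers unfamiliar with the evaluation metrics quoted for s571 (PCC, MAE, RMSE), the following is a minimal sketch of how they are conventionally computed from measured and predicted $\Delta T_m$ values. It uses the standard textbook definitions and is not taken from the paper's code; the function name `regression_metrics` and the toy numbers in the usage note are illustrative assumptions.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Return Pearson correlation (PCC), MAE and RMSE for paired values.

    Hypothetical helper for illustration; standard definitions only.
    """
    n = len(y_true)
    if n != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if n < 2:
        raise ValueError("at least two pairs are needed for a correlation")

    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n

    # Pearson correlation: covariance divided by the product of the spreads.
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("PCC is undefined when either input is constant")
    pcc = cov / math.sqrt(var_t * var_p)

    # Mean absolute error and root-mean-square error of the residuals.
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

    return {"pcc": pcc, "mae": mae, "rmse": rmse}
```

Calling `regression_metrics([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])` on these made-up values illustrates why the metrics are reported together: the predictions are perfectly correlated with the targets (PCC of 1) yet still carry a nonzero MAE and RMSE, because correlation ignores scale and offset.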

Paper Structure

This paper contains 12 sections, 1 figure, 5 tables, 1 algorithm.

Figures (1)

  • Figure 1: Model Architecture. ESM3-DTm efficiently predicts $\Delta T_m$. We also present ESM2-DTm, Saprot-DTm, and Openfold-DTm here. "I4A" denotes a mutation from I (isoleucine) to A (alanine) at position 4.
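
The mutation notation in the Figure 1 caption can be illustrated with a short sketch. This is not the paper's preprocessing code: the names `Mutation`, `parse_mutation`, and `apply_mutation` are hypothetical, the example sequence is made up, and positions are assumed to be 1-based, as is conventional for this notation.

```python
import re
from typing import NamedTuple

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_MUTATION_RE = re.compile(rf"^([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])$")


class Mutation(NamedTuple):
    wild_type: str  # residue before mutation, e.g. "I"
    position: int   # 1-based position in the sequence (assumed convention)
    mutant: str     # residue after mutation, e.g. "A"


def parse_mutation(code: str) -> Mutation:
    """Parse a single-point mutation string such as "I4A"."""
    match = _MUTATION_RE.match(code)
    if match is None:
        raise ValueError(f"not a single-point mutation code: {code!r}")
    wild_type, position, mutant = match.groups()
    return Mutation(wild_type, int(position), mutant)


def apply_mutation(sequence: str, mutation: Mutation) -> str:
    """Return the mutant sequence, checking the wild-type residue first."""
    index = mutation.position - 1  # convert 1-based position to 0-based index
    if not 0 <= index < len(sequence):
        raise ValueError("mutation position lies outside the sequence")
    if sequence[index] != mutation.wild_type:
        raise ValueError(
            f"expected {mutation.wild_type} at position {mutation.position}, "
            f"found {sequence[index]}"
        )
    return sequence[:index] + mutation.mutant + sequence[index + 1:]
```

For the made-up sequence `"MKVIL"`, `apply_mutation("MKVIL", parse_mutation("I4A"))` replaces the isoleucine at position 4 with alanine. Checking the wild-type residue before substituting is a cheap guard against off-by-one errors between sequence and structure numbering.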