Transfer Learning for T-Cell Response Prediction
Josua Stadelmaier, Brandon Malone, Ralf Eggeling
TL;DR
This work addresses the prediction of T-cell responses to peptides in a multi-domain setting, where data come from diverse peptide sources and MHC alleles and models are therefore prone to shortcut learning. It introduces a domain-aware evaluation framework and explores transfer-learning strategies for a transformer-based predictor, including adversarial domain adaptation (ADA-T) and per-source fine-tuning (FINE-T). ADA-T reduces domain-specific shortcuts but does not consistently improve accuracy, while FINE-T yields robust gains, particularly for MHC I, and is competitive with state-of-the-art approaches on human peptides. Overall, the study highlights the importance of accounting for data heterogeneity in immunogenicity prediction and points to FINE-T as a practical approach for personalized cancer vaccine design, while calling for standardized benchmarks to enable fair comparisons.
Abstract
We study the prediction of T-cell responses to given peptides, which could, among other applications, be a crucial step towards the development of personalized cancer vaccines. The task is challenging because the training data are limited and heterogeneous and have a multi-domain structure. Such data entail the danger of shortcut learning, where models learn general characteristics of peptide sources, such as the source organism, rather than the specific peptide characteristics associated with T-cell response. Using a transformer model for T-cell response prediction, we show that the danger of inflated predictive performance is not merely theoretical but occurs in practice. Consequently, we propose a domain-aware evaluation scheme. We then study different transfer learning techniques for dealing with the multi-domain structure and shortcut learning. We demonstrate that a per-source fine-tuning approach is effective across a wide range of peptide sources and further show that our final model is competitive with existing state-of-the-art approaches for predicting T-cell responses for human peptides.
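The effect of shortcut learning on evaluation can be illustrated with a toy example. The sketch below is not the evaluation scheme proposed in this work; it is a minimal illustration under assumed conditions. The domain names, labels, and scores are invented, and the helpers `auc` and `per_domain_auc` are hypothetical. A "model" that scores peptides purely by their source appears to perform well when all domains are pooled, yet is no better than chance within each domain.

```python
from collections import defaultdict


def auc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def per_domain_auc(domains, labels, scores):
    """Compute the AUC separately within each domain (e.g. each peptide source)."""
    groups = defaultdict(lambda: ([], []))
    for d, y, s in zip(domains, labels, scores):
        groups[d][0].append(y)
        groups[d][1].append(s)
    return {d: auc(ys, ss) for d, (ys, ss) in groups.items()}


# Invented toy data: the "virus" domain has a high positive rate,
# the "human" domain a low one.
domains = ["virus"] * 4 + ["human"] * 4
labels = [1, 1, 1, 0, 1, 0, 0, 0]

# A shortcut "model": the score depends only on the domain, not on the peptide.
scores = [0.9 if d == "virus" else 0.1 for d in domains]

pooled = auc(labels, scores)                          # 0.75, looks informative
by_domain = per_domain_auc(domains, labels, scores)   # 0.5 in each domain, i.e. chance

print(f"pooled AUC: {pooled:.2f}")
for name, value in by_domain.items():
    print(f"{name} AUC: {value:.2f}")
```

In this toy setting the pooled AUC of 0.75 comes entirely from the difference in positive rates between the two sources, which is why reporting performance within each domain gives a more faithful picture.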
