TCR-GPT: Integrating Autoregressive Model and Reinforcement Learning for T-Cell Receptor Repertoires Generation

Yicheng Lin; Dandan Zhang; Yun Liu

TL;DR

This work addresses the challenge of modeling and generating TCR repertoires for targeted immune applications. It introduces TCR-GPT, a decoder-only transformer that learns the autoregressive distribution $p(\mathbf{x}|\bm{\theta})$ over CDR3-$\beta$ sequences and uses PPO-based reinforcement learning with PanPep to bias generation toward peptide recognition. The approach achieves strong distribution-inference accuracy ($r=0.953$) and meaningful repertoire differentiation via Jensen-Shannon divergence, while enabling downstream classification and peptide-targeted generation. Practically, this framework offers a scalable path toward designing peptide-specific TCR repertoires for therapies and vaccines, though it currently focuses on the CDR3 region of the beta chain and will benefit from extending to full-length paired TCR sequences.
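For readers unfamiliar with the notation, an autoregressive model writes the sequence probability as a product of per-residue conditionals. The form below is the standard chain-rule factorization, not a copy of the paper's own equations; the sequence length $L$ and the position index $i$ are symbols introduced here for illustration.

$$
p(\mathbf{x}|\bm{\theta}) = \prod_{i=1}^{L} p(x_i \mid x_1, \dots, x_{i-1}; \bm{\theta}),
\qquad
\log p(\mathbf{x}|\bm{\theta}) = \sum_{i=1}^{L} \log p(x_i \mid x_1, \dots, x_{i-1}; \bm{\theta})
$$

Under this factorization, a decoder-only transformer can score a CDR3-$\beta$ sequence by summing per-position log-probabilities, and can generate one by sampling residues left to right.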

Abstract

T-cell receptors (TCRs) play a crucial role in the immune system by recognizing and binding to specific antigens presented by infected or cancerous cells. Understanding the sequence patterns of TCRs is essential for developing targeted immune therapies and designing effective vaccines. Language models, such as autoregressive transformers, offer a powerful solution to this problem by learning the probability distributions of TCR repertoires, enabling the generation of new TCR sequences that inherit the underlying patterns of the repertoire. We introduce TCR-GPT, a probabilistic model built on a decoder-only transformer architecture, designed to uncover and replicate sequence patterns in TCR repertoires. TCR-GPT infers sequence probability distributions with an accuracy of 0.953, measured by the Pearson correlation coefficient. Furthermore, by leveraging reinforcement learning (RL), we adapted the distribution of TCR sequences to generate TCRs capable of recognizing specific peptides, offering significant potential for advancing targeted immune therapies and vaccine development. After RL fine-tuning, pretrained TCR-GPT models produced TCR repertoires likely to bind specific peptides, illustrating RL's efficiency in adapting the model to the probability distributions of biologically relevant TCR sequences.
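The accuracy figure in the abstract is a Pearson correlation between actual and inferred sequence probabilities. As a minimal sketch of that metric, the snippet below computes Pearson's r for two equal-length lists. The function name `pearson_r` and the toy probability values are assumptions made for illustration; they are not the paper's code or data, and this summary does not say whether the authors correlate raw or log-transformed probabilities.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values standing in for P_data and P_infer.
p_data = [0.40, 0.30, 0.20, 0.10]
p_infer = [0.38, 0.33, 0.18, 0.11]
r = pearson_r(p_data, p_infer)
print(f"Pearson r = {r:.3f}")
```

A value near 1 means the inferred probabilities rank and scale sequences almost linearly with the observed ones; it does not by itself show that the two distributions are identical.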

Paper Structure (14 sections, 13 equations, 6 figures)

This paper contains 14 sections, 13 equations, 6 figures.

Introduction
Related work
Methods
Autoregressive generative transformer for TCR sequences
Jensen-Shannon Divergence
Using TCR-GPT to generate TCR sequences
Using the features from TCR-GPT for classification tasks
Fine-tuning TCR-GPT with RL to generate peptide-specific TCR repertoires
Experiments
TCR-GPT infers the probability distribution of TCR repertoires with high accuracy
TCR-GPT captures specific features of TCR repertoires efficiently
Classification of cancer-associated TCRs and SARS-CoV-2 epitope-specific TCRs using features from TCR-GPT
Generating peptide-specific TCRs using TCR-GPT fine-tuned with RL
Conclusion

Figures (6)

Figure 1: (A) The overall architecture of TCR-GPT. (B) Main workflow of peptide-specific RL for TCR-GPT.
Figure 2: Performance comparison of TCR-GPT, soNNia and TCRpeg algorithms. A-C. The scatter plot of actual ($P_{data}$) versus inferred probability ($P_{infer}$) for soNNia (A), TCRpeg (B) and TCR-GPT (C) using test dataset from universal TCR repertoire. The corresponding Pearson correlation coefficients are displayed for each plot.
Figure 3: The heatmap of Jensen-Shannon divergence ($D_{js}$) between pairwise sub-repertoire probability distribution inferred by TCR-GPT.
Figure 4: UMAP Visualization and Classification Performance of TCR-GPT. (A, B) UMAP plots of features learned by TCR-GPT trained on caTCRs (A) and SARS-TCRs (B), along with motif logos of selected clustered TCR sequences. (C, D) Area under curve (AUC) for classifiers predicting caTCRs (C) and SARS-TCRs (D) trained with TCR-GPT.
Figure 5: Binding percentage of generated TCRs with specific peptide sequences increases with the number of PPO gradient steps.
...and 1 more figure
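Figure 3 reports the Jensen-Shannon divergence ($D_{js}$) between pairs of sub-repertoire distributions. As a minimal sketch of the quantity itself, the snippet below evaluates it for small discrete distributions and builds a pairwise matrix of the kind a heatmap would display. The function names, the base-2 logarithm, and the toy repertoires are assumptions made for illustration; the paper's own estimator over TCR sequence space may differ (for example, it may rely on sampled sequences).

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions (base 2, range [0, 1])."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same support size")
    # Mixture distribution; m_i > 0 wherever p_i > 0 or q_i > 0, so KL is finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical toy "sub-repertoires": probabilities over the same four sequences.
repertoires = {
    "A": [0.40, 0.30, 0.20, 0.10],
    "B": [0.35, 0.35, 0.20, 0.10],
    "C": [0.05, 0.05, 0.10, 0.80],
}

# Pairwise divergence matrix, analogous to the input of a heatmap.
names = list(repertoires)
matrix = [[js_divergence(repertoires[a], repertoires[b]) for b in names] for a in names]
for name, row in zip(names, matrix):
    print(name, [round(v, 3) for v in row])
```

The measure is symmetric and equals zero only for identical distributions, which is why the diagonal of such a matrix is zero and similar repertoires yield small off-diagonal values.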
