Improving Paratope and Epitope Prediction by Multi-Modal Contrastive Learning and Interaction Informativeness Estimation

Zhiwei Wang, Yongkang Wang, Wen Zhang

TL;DR

This work tackles the problem of predicting paratopes and epitopes by exploiting multi-modal data from antibody and antigen sequences and structures. It introduces MIPE, a framework that combines intra- and inter-modal contrastive learning with an interaction informativeness estimation module to learn representations that reflect binding patterns, while aligning modalities to reduce noise. The model optimizes a combined objective $\mathcal{L} = \alpha \mathcal{L}_{CL} + \beta \mathcal{L}_{IIE} + \gamma (\mathcal{L}_{ab} + \mathcal{L}_{ag})$, and experiments on SAbDab demonstrate state-of-the-art performance across single paratope, single epitope, and joint prediction tasks, with AlphaFold2 structures offering robust substitutes when real structures are unavailable. The results highlight the value of multi-modal representation learning and interaction-aware guidance for accelerating antibody design and binding residue discovery.
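The combined objective above is a weighted sum of four loss terms. As a minimal sketch of that arithmetic only, the function below assumes each term has already been computed as a scalar; the function name, argument names, and default weights are hypothetical and are not taken from the paper.

```python
def combined_loss(l_cl, l_iie, l_ab, l_ag, alpha=1.0, beta=1.0, gamma=1.0):
    """Return alpha * L_CL + beta * L_IIE + gamma * (L_ab + L_ag).

    l_cl  : contrastive learning loss (scalar)
    l_iie : interaction informativeness estimation loss (scalar)
    l_ab  : antibody-side (paratope) prediction loss (scalar)
    l_ag  : antigen-side (epitope) prediction loss (scalar)
    The default weights of 1.0 are placeholders, not values from the paper.
    """
    return alpha * l_cl + beta * l_iie + gamma * (l_ab + l_ag)


# Example with made-up loss values and weights.
total = combined_loss(0.5, 0.25, 0.25, 0.5, alpha=2.0, beta=4.0, gamma=2.0)
print(total)
```

The two prediction losses share a single weight $\gamma$, so the paratope and epitope terms are always balanced equally against each other in this formulation.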

Abstract

Accurately predicting antibody-antigen binding residues, i.e., paratopes and epitopes, is crucial in antibody design. However, existing methods solely focus on uni-modal data (either sequence or structure), disregarding the complementary information present in multi-modal data, and most methods predict paratopes and epitopes separately, overlooking their specific spatial interactions. In this paper, we propose a novel Multi-modal contrastive learning and Interaction informativeness estimation-based method for Paratope and Epitope prediction, named MIPE, by using both sequence and structure data of antibodies and antigens. MIPE implements a multi-modal contrastive learning strategy, which maximizes representations of binding and non-binding residues within each modality and meanwhile aligns uni-modal representations towards effective modal representations. To exploit the spatial interaction information, MIPE also incorporates an interaction informativeness estimation that computes the estimated interaction matrices between antibodies and antigens, thereby approximating them to the actual ones. Extensive experiments demonstrate the superiority of our method compared to baselines. Additionally, the ablation studies and visualizations demonstrate the superiority of MIPE owing to the better representations acquired through multi-modal contrastive learning and the interaction patterns comprehended by the interaction informativeness estimation.
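To make the idea of approximating an estimated interaction matrix to the actual one more concrete, the sketch below scores every antibody-antigen residue pair and compares the scores with a binary contact matrix. The abstract does not specify how MIPE computes or compares these matrices, so the dot-product scoring, the sigmoid, and the binary cross-entropy loss are all illustrative assumptions, and the function names are hypothetical.

```python
import math


def estimate_interaction_matrix(ab_emb, ag_emb):
    """Score each (antibody residue, antigen residue) pair.

    ab_emb, ag_emb: lists of equal-length embedding vectors, one per residue.
    Assumption: score = sigmoid(dot product); the paper may use another form.
    """
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    return [
        [sigmoid(sum(a * b for a, b in zip(u, v))) for v in ag_emb]
        for u in ab_emb
    ]


def interaction_loss(estimated, actual, eps=1e-9):
    """Mean binary cross-entropy between estimated and actual matrices.

    actual holds 1 for interacting residue pairs and 0 otherwise.
    Assumption: BCE is used here only for illustration.
    """
    total, count = 0.0, 0
    for est_row, act_row in zip(estimated, actual):
        for p, y in zip(est_row, act_row):
            total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
            count += 1
    return total / count


# Toy example: 2 antibody residues, 3 antigen residues, 2-dimensional embeddings.
ab = [[0.0, 0.0], [1.0, 1.0]]
ag = [[1.0, 2.0], [-1.0, 0.5], [3.0, 3.0]]
est = estimate_interaction_matrix(ab, ag)
contacts = [[0, 0, 0], [1, 0, 1]]
print(interaction_loss(est, contacts))
```

The estimated matrix has one row per antibody residue and one column per antigen residue, which is what allows paratope and epitope predictions to be tied together through their pairwise interactions.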

Paper Structure (19 sections, 11 equations, 5 figures, 1 table)

Figures (5)

  • Figure 1: The interaction map between HexaBody-CD38 Fab and CD38 (PDB: 8BYU).
  • Figure 2: The overview of our proposed MIPE. The yellow part is the multi-modal encoders, the green part is the multi-modal contrastive learning, the red part is the interaction informativeness estimation, and the gray part is the prediction.
  • Figure 3: Results of MIPE and its variants in joint paratope-epitope prediction.
  • Figure 4: The t-SNE visualization for the embeddings after sequence encoder (a), structure encoder (b), multi-modal CL (c), and interaction informativeness estimation (d).
  • Figure 5: The binding visualization for antibodies and antigens, with antibody residues highlighted in blue and antigen residues in red. Reference binding residues are represented by lines, while predicted binding residues are indicated by sticks. Examples of predicted interactions are drawn as yellow dotted lines.