Machine learning approaches for interpretable antibody property prediction using structural data

Kevin Michalewicz; Mauricio Barahona; Barbara Bravi

TL;DR

The chapter addresses antibody design by incorporating structural information into machine learning (ML) models, arguing that structure-aware representations enable both accurate prediction and mechanistic insight. It presents two graph-based frameworks, ANTIPASTI and INFUSSE, that fuse structure-derived signals with sequence embeddings to predict global properties such as binding affinity and local properties such as residue B-factors, while supporting interpretability through model-agnostic and model-dependent analyses. ANTIPASTI reveals long-range, affinity-determining correlations across regions (e.g., CDR-H3 with FR-L2), whereas INFUSSE shows that combining ProtBERT embeddings with geometry-based graphs improves per-residue predictions, especially in loops and helices. The work advocates interpretable, structure-informed antibody design and outlines future directions toward multi-property optimization and uncertainty quantification in in silico design workflows.

Abstract

Understanding the relationship between antibody sequence, structure and function is essential for the design of antibody-based therapeutics and research tools. Recently, machine learning (ML) models, mostly based on applying large language models to sequence information, have been developed to predict antibody properties. Yet incorporating structural information remains an open direction, not only to enhance prediction but also to offer insight into the underlying molecular mechanisms. This chapter provides an overview of these approaches and describes two ML frameworks that integrate structural data (via graph representations) with neural networks to predict antibody properties: ANTIPASTI predicts binding affinity (a global property), whereas INFUSSE predicts residue flexibility (a local property). We survey the principles underpinning these models, the ways in which they encode structural knowledge, and the strategies that can be used to extract biologically relevant statistical signals that help discover and disentangle the molecular determinants of the properties of interest.
