Conditionally Site-Independent Neural Evolution of Antibody Sequences
Stephen Zhewen Lu, Aakarsh Vermani, Kohei Sanno, Jiarui Lu, Frederick A Matsen, Milind Jagota, Yun S. Song
TL;DR
CoSiNE (Conditionally Site-Independent Neural Evolution) is a neural continuous-time Markov chain (CTMC) that learns site-specific rate matrices conditioned on the full antibody sequence, capturing epistatic effects during affinity maturation. The authors prove that it is a first-order approximation to the true sequential point mutation process, with an error bound that is quadratic in branch length. The paper introduces Gillespie-based sampling together with Guided Gillespie, a Taylor-series-guided variant for targeted design. Empirically, CoSiNE achieves state-of-the-art zero-shot variant effect prediction, accurately models intra- and inter-chain epistasis, and enables guided affinity maturation from naive antibodies, including local CDR optimization under mutation budgets. The work bridges phylogenetic sequence evolution and deep learning, offering a principled framework for antibody design with potential impact on vaccine design and therapeutic development. Acknowledged limitations include ignoring indels and the need for broader generalization.
Abstract
Common deep learning approaches for antibody engineering focus on modeling the marginal distribution of sequences. By treating sequences as independent samples, however, these methods overlook affinity maturation as a rich and largely untapped source of information about the evolutionary process by which antibodies explore the underlying fitness landscape. In contrast, classical phylogenetic models explicitly represent evolutionary dynamics but lack the expressivity to capture complex epistatic interactions. We bridge this gap with CoSiNE, a continuous-time Markov chain parameterized by a deep neural network. Mathematically, we prove that CoSiNE provides a first-order approximation to the intractable sequential point mutation process, capturing epistatic effects with an error bound that is quadratic in branch length. Empirically, CoSiNE outperforms state-of-the-art language models in zero-shot variant effect prediction by explicitly disentangling selection from context-dependent somatic hypermutation. Finally, we introduce Guided Gillespie, a classifier-guided sampling scheme that steers CoSiNE at inference time, enabling efficient optimization of antibody binding affinity toward specific antigens.
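
To make the sampling idea concrete, the following is a minimal sketch of a standard Gillespie simulation of point mutations along a single branch, assuming that all substitution rates are recomputed from the full current sequence after every mutation. It is not the authors' implementation: `toy_rates`, its constants, and the neighbor-based context rule are hypothetical stand-ins for the learned neural rate model, and `gillespie` is an illustrative name.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_rates(seq):
    """Hypothetical stand-in for a learned rate model.

    Returns {(site, new_aa): rate} for every possible point mutation.
    Rates depend on sequence context (here, only the left neighbor),
    which is an assumption made purely for illustration.
    """
    rates = {}
    for i, current in enumerate(seq):
        # Illustrative context effect: sites following a glycine mutate faster.
        boost = 2.0 if i > 0 and seq[i - 1] == "G" else 1.0
        for aa in AMINO_ACIDS:
            if aa != current:
                rates[(i, aa)] = 0.01 * boost
    return rates


def gillespie(seq, branch_length, rate_fn=toy_rates, rng=None):
    """Simulate point mutations on one branch of length `branch_length`.

    Returns the evolved sequence and the number of mutation events.
    """
    rng = rng or random.Random()
    seq = list(seq)
    t = 0.0
    n_mutations = 0
    while True:
        # Recompute all rates from the full current sequence.
        rates = rate_fn("".join(seq))
        total = sum(rates.values())
        if total <= 0:
            break
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(total)
        if t > branch_length:
            break
        # Choose which (site, amino acid) event fires, proportional to its rate.
        events = list(rates)
        weights = [rates[e] for e in events]
        site, aa = rng.choices(events, weights=weights, k=1)[0]
        seq[site] = aa
        n_mutations += 1
    return "".join(seq), n_mutations


if __name__ == "__main__":
    child, n = gillespie("GACDEFGHIK", branch_length=3.0, rng=random.Random(0))
    print(child, n)
```

Because the rates are refreshed after each event, a mutation at one site can change the rates at other sites, which is how a simulation of this kind can express epistasis. A guided variant would reweight the event rates before the selection step; that step is omitted here because the source does not spell out its form.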
