Bayesian inference of antibody evolutionary dynamics using multitype branching processes

Athanasios G. Bakis, Ashni A. Vora, Tatsuya Araki, Tongqiu Jia, Jared G. Galloway, Chris Jennings-Shaffer, Gabriel D. Victora, Yun S. Song, William S. DeWitt, Frederick A. Matsen, Volodymyr M. Minin

TL;DR

A Bayesian framework is used to infer the functional relationship between B-cell fitness and antigen binding affinity, and a simulation study demonstrates that a sigmoidal relationship between fitness and binding affinity can be recovered from realizations of the branching process.

Abstract

When our immune system encounters foreign antigens (i.e., from pathogens), the B cells that produce our antibodies undergo a cyclic process of proliferation, mutation, and selection, improving their ability to bind to the specific antigen. Immunologists have recently developed powerful experimental techniques to investigate this process in mouse models. In one such experiment, mice are engineered with a monoclonal B-cell precursor and immunized with a model antigen. B cells are sampled from sacrificed mice after the immune response has progressed, and the mutated genetic loci encoding antibodies are sequenced. This experiment allows parallel replay of antibody evolution, but produces data at only one time point; we are unable to observe the evolutionary trajectories that lead to optimized antibody affinity in each mouse. To address this, we model antibody evolution as a multitype branching process and integrate over unobserved histories conditioned on phylogenetic signal in sequence data, leveraging parallel experimental replays for parameter inference. We infer the functional relationship between B-cell fitness and antigen binding affinity in a Bayesian framework, equipped with an efficient likelihood calculation algorithm and Markov chain Monte Carlo posterior approximation. In a simulation study, we demonstrate that a sigmoidal relationship between fitness and binding affinity can be recovered from realizations of the branching process. We then perform inference for experimental data from 52 replayed B-cell lineages sampled 15 days after immunization, yielding a total of 3,758 sampled B cells. The recovered sigmoidal curve indicates that the fitness of high-affinity B cells is over six times larger than that of low-affinity B cells, with a sharp transition from low to high fitness values as affinity increases.

Paper Structure

This paper contains 50 sections, 8 equations, 30 figures, 1 table.

Figures (30)

  • Figure 1: The data-generating experiment. Top row: An antigen and B cells are transferred to host mice, initializing affinity maturation. While each germinal center in each mouse experiences a full evolutionary process (represented by the tree), we are unable to observe anything before the present (sampling) time. The sequences at the tips of the tree are all that is collected as data (shaded box). Bottom row: Given the observed B cell sequences, we estimate a phylogenetic tree annotated with mutation times. We are then in the unique position to predict the binding affinity of the antibodies produced by any B cell at any point along the tree, using stochastic mapping (SM) and deep mutational scanning (DMS) data. These inferred and affinity-annotated trees form the input to our model.
  • Figure 2: One sampled tree from each of two different germinal centers, as output by BEAST. Branches are colored by discretized affinity value, and dots represent nucleotide-level B cell mutations (including those that do not change the affinity of the antibody).
  • Figure 3: The data-generating process for the branching process. The parameters $\lambda(\cdot), \mu, \Gamma_{\cdot}$ correspond to birth, death, and type change rates, respectively. Left: Waiting times and events are sampled to create a tree. Categorical distribution probabilities are obtained by normalizing the stated weights, which are written as such in the diagram for brevity. For example, the probability that a segment in type $x_1$ ends in a birth event is $\lambda_\phi(x_1) / \left[ \lambda_\phi(x_1) + \mu + \Gamma_{x_1, x_2} + \Gamma_{x_1, x_3} \right]$. Colors represent types; here, black is type $x_1$, red is type $x_2$, and blue is type $x_3$. Lineages that are sampled at collection time are marked with dots on the tips of the tree. The highlighted branch segment is explained by the mathematical captions adjacent to it. Right: Information about lineages that died or were not sampled is lost, yielding the tree on observed lineages.
  • Figure 4: Density plots of the binding affinities seen in the 10X data and from one tree sample for each germinal center in our experimental dataset. The boundaries of the bins used to create the type space are marked in black. Type values (bin medians) are represented by dashed lines. Any affinities in our experimental dataset that lie outside the extreme boundaries are assigned to the extreme bins.
  • Figure 5: A simulation study demonstrating the importance of conditioning the tree model on the survival and observation of at least one lineage at collection time. Shown here are sampling distributions of posterior medians obtained from repeated runs of MCMC on different sets of trees. As larger tree sets are used for inference, the bias from using the unconditioned model becomes increasingly apparent, particularly for the death rate.
  • ...and 25 more figures
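
The event sampling described in the Figure 3 caption can be illustrated with a minimal Python sketch. It normalizes the competing birth, death, and type change rates into categorical probabilities, matching the caption's example $\lambda_\phi(x_1) / \left[ \lambda_\phi(x_1) + \mu + \Gamma_{x_1, x_2} + \Gamma_{x_1, x_3} \right]$. The function names and the numeric rates are hypothetical, and drawing the waiting time from an exponential distribution with the total rate is an assumption (the standard choice for continuous-time branching processes), not something the caption states.

```python
import random


def event_probabilities(birth_rate, death_rate, type_change_rates):
    """Normalize competing rates into categorical event probabilities.

    type_change_rates maps a target type label to the rate of changing
    into that type (the Gamma entries for the current type).
    """
    weights = {"birth": birth_rate, "death": death_rate}
    for target, rate in type_change_rates.items():
        weights[f"change_to_{target}"] = rate
    total = sum(weights.values())
    return total, {event: w / total for event, w in weights.items()}


def sample_segment(birth_rate, death_rate, type_change_rates, rng):
    """Sample one branch segment: a waiting time and the event that ends it."""
    total, probs = event_probabilities(birth_rate, death_rate, type_change_rates)
    # Assumption: the waiting time is exponential with the total event rate.
    waiting_time = rng.expovariate(total)
    events = list(probs)
    event = rng.choices(events, weights=[probs[e] for e in events], k=1)[0]
    return waiting_time, event


# Illustrative (made-up) rates for a lineage currently in type x1.
rng = random.Random(0)
total_rate, probs = event_probabilities(1.0, 0.5, {"x2": 0.25, "x3": 0.25})
waiting_time, event = sample_segment(1.0, 0.5, {"x2": 0.25, "x3": 0.25}, rng)
```

With these illustrative rates the total rate is 2.0, so the probability that the segment ends in a birth event is 1.0 / 2.0 = 0.5. Repeating `sample_segment` for each live lineage until collection time would grow a full tree; that loop is omitted here.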