Quenched coalescent for diploid population models with selfing and overlapping generations

Louis Wai-Tong Fan, Maximillian Newman, John Wakeley

TL;DR

The paper develops a quenched coalescent framework for diploid populations with selfing and overlapping generations by conditioning genealogies on a single population pedigree. It introduces a general diploid exchangeable model, constructs a latent pedigree, and proves that, under high selfing, the conditioned ancestral process converges to coalescing random walks on a random Q-λ graph embedded in an exchangeable fragmentation-coalescence process with measure Ξ. The main result shows the quenched law converges to a random object L_Π^n describing coalescing walks on a pedigree-driven graph, enabling analysis of the site-frequency spectrum conditional on pedigree and highlighting robustness of binary-merger limits. The framework yields applications to robustness across Wright-Fisher and Moran-type demographies, convergence of tree statistics and SFS under nearly complete selfing, and extensions to mixed demographies, while also providing a general theory for weak convergence of random measures on Skorokhod spaces.

Abstract

We introduce a general diploid population model with self-fertilization and possibly overlapping generations, and study the genealogy of a sample of $n$ genes as the population size $N$ tends to infinity. Unlike the traditional approach in coalescent theory, which considers the unconditional (annealed) law of the gene genealogies averaged over the population pedigree, we study the conditional (quenched) law of gene genealogies given the pedigree. We focus on the case of high selfing probability and show that this conditional law converges to a random probability measure, given by the random law of a system of coalescing random walks on an exchangeable fragmentation-coalescence process of \cite{berestycki04}. This system contains the system of coalescing random walks on the ancestral recombination graph as a special case, and it sheds new light on the site-frequency spectrum (SFS) of genetic data by specifying how the SFS depends on the pedigree. The convergence result is proved by means of a general characterization of weak convergence for random measures on the Skorokhod space with paths taking values in a locally compact Polish space.
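To make the annealed/quenched distinction concrete, here is a minimal Python sketch. It assumes a plain diploid Wright-Fisher population with non-overlapping generations and a fixed selfing probability, which is far simpler than the general model of the paper. All function names and parameter values are hypothetical. A pedigree is drawn once and held fixed, and only the Mendelian choices of the two sampled gene copies are resampled.

```python
import random


def simulate_pedigree(N, generations, selfing_prob, rng):
    """Return pedigree[g][i] = (parent_a, parent_b), one layer per time-step into the past.

    Assumption: Wright-Fisher reproduction, each individual selfs with probability selfing_prob.
    """
    if N < 2:
        raise ValueError("N must be at least 2 so that outcrossing is possible")
    pedigree = []
    for _ in range(generations):
        layer = []
        for _ in range(N):
            a = rng.randrange(N)
            if rng.random() < selfing_prob:
                b = a  # selfing: both gene copies come from the same parent
            else:
                b = rng.randrange(N - 1)
                if b >= a:
                    b += 1  # outcrossing: the second parent differs from the first
            layer.append((a, b))
        pedigree.append(layer)
    return pedigree


def coalescence_time(pedigree, start, rng):
    """Trace two gene copies back through a FIXED pedigree.

    A lineage is (individual, copy) with copy in {0, 1}; copy 0 was inherited
    from parent_a and copy 1 from parent_b. Returns the first time-step at which
    both lineages sit on the same gene copy, or None if that never happens.
    """
    lineages = list(start)
    for g, layer in enumerate(pedigree, start=1):
        # Move to the parent, then pick one of the parent's two copies (Mendel).
        lineages = [(layer[ind][copy], rng.randrange(2)) for ind, copy in lineages]
        if lineages[0] == lineages[1]:
            return g
    return None


def quenched_coalescence_prob(pedigree, start, reps, rng):
    """Monte Carlo estimate of P(coalescence within the pedigree depth | pedigree)."""
    hits = sum(coalescence_time(pedigree, start, rng) is not None for _ in range(reps))
    return hits / reps


if __name__ == "__main__":
    rng = random.Random(1)
    start = ((0, 0), (1, 0))  # one gene copy from each of two distinct individuals
    for trial in range(3):
        ped = simulate_pedigree(N=50, generations=100, selfing_prob=0.95, rng=rng)
        print(trial, quenched_coalescence_prob(ped, start, reps=2000, rng=rng))
```

Each printed value is an estimate of a conditional probability given one pedigree, so the values can differ from pedigree to pedigree. Averaging them over many independent pedigrees would estimate the corresponding annealed probability.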

Paper Structure

This paper contains 26 sections, 22 theorems, 175 equations, 7 figures.

Key Result

Theorem 3.11

Suppose that Assumptions A:c_N, A: timescale and A:Q_Nn hold. Then there is a unique finite measure $\Xi$ on $\Delta$, made precise in Remark R: xi_q_connection, such that $\bar{\chi}^{N,n}$, as a $\mathcal{D}\left(\mathbb{R}_+, \mathcal{E}_n\right)$-valued random variable, converges in distribution

Figures (7)

  • Figure 1: An illustration of our population model between time-steps $k+1$ and $k$ in the past, with size $N=6$. There are $K_N = 3$ offspring, namely individuals 3, 4 and 6 (from left to right) in time-step $k$. Bold edges represent reproductive relationships, and two of the offspring are produced by selfing. Individuals $1, 2, 5$, whose edges are marked in orange, simply persist between consecutive time-steps and are not offspring. One can read $(V_{i,j})$ from the pedigree: $V_1 = 0$ because the first parent (the first individual in time-step $k+1$) has no offspring, and $V_2 = 2$ because the second parent has two offspring (individuals 3 and 4 in time-step $k$). Also, $V_{2,2} = V_{4,4} = 1$.
  • Figure 2: (Full single-locus population process on the right and the corresponding pedigree on the left). A realization of our population process with $N=5$ individuals from time-steps $k=0$ to $5$ in the past (right panel). The corresponding pedigree is shown on the left. Here we consider the Moran model in NFW25, where $K_N=1$ deterministically. The black lines correspond to reproductive relationships while the yellow lines correspond to an individual persisting from one time-step to the next. The yellow lines give rise to overlapping generations.
  • Figure 3: (Same pedigree but different genealogies). Two different genealogical histories are shown. We focus on a sample of $n=3$ lineages and trace their history backwards in time. Both histories follow the same pedigree, the one displayed on the right of Figure \ref{F: several_realizations}, and yet the histories of the sample lineages backwards in time are distinct.
  • Figure 4: Two realizations of coalescing random walks on a single, fixed realization of an EFC $\Pi$ with $c_k = 1$. These random walks follow each coalescence in the EFC and choose between each of the possible edges ahead of them at a fragmentation with equal probability.
  • Figure 5: A gene genealogy for $n=10$ samples. Here, $\tau^{10,3}$ is the sum of the lengths of the two thick, blue edges. Although we do not consider mutation in this paper, note that any mutations on these branches will be inherited by, and thus present on, exactly $3$ gene copies in the sample.
  • ...and 2 more figures
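
The quantity $\tau^{10,3}$ in the caption of Figure 5 is the total length of the branches that subtend exactly 3 sampled gene copies. As a rough illustration, the following Python sketch computes such totals for every leaf count $k$ on a small hand-made genealogy. The tree encoding (a `parent` map and a `length` map) and the function name are assumptions made for this example and do not come from the paper.

```python
from collections import defaultdict


def branch_length_by_leaf_count(parent, length):
    """Map k -> total length of branches that subtend exactly k sampled leaves.

    parent[node] is the parent of node (None for the root);
    length[node] is the length of the branch joining node to its parent.
    """
    children = defaultdict(list)
    for node, par in parent.items():
        if par is not None:
            children[par].append(node)

    leaf_count = {}

    def count(node):
        kids = children.get(node, [])
        # A node without children is a sampled leaf and counts once.
        leaf_count[node] = 1 if not kids else sum(count(c) for c in kids)
        return leaf_count[node]

    for node, par in parent.items():
        if par is None:
            count(node)

    totals = defaultdict(float)
    for node, par in parent.items():
        if par is not None:  # the root has no branch above it
            totals[leaf_count[node]] += length[node]
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical genealogy of n = 4 genes: (a, b) merge at time 1,
    # then c joins at time 2, then d joins at time 3.
    parent = {"a": "x", "b": "x", "x": "y", "c": "y", "y": "root", "d": "root", "root": None}
    length = {"a": 1.0, "b": 1.0, "x": 1.0, "c": 2.0, "y": 1.0, "d": 3.0}
    print(branch_length_by_leaf_count(parent, length))
```

In this toy tree the entries for $k = 1, 2, 3$ are 7, 1 and 1, which sum to the total tree length 9. Under an infinite-sites mutation model, which the paper does not consider, the entry for $k$ would be proportional to the expected number of mutations carried by exactly $k$ sampled copies on this tree.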

Theorems & Definitions (78)

  • Example 2.1: The diploid Cannings model in birkner2018coalescent
  • Definition 2.3: Pedigree
  • Definition 3.1: Ancestral line
  • Definition 3.2: Ancestral process
  • Remark 3.3
  • Remark 3.6
  • Definition 3.8
  • Remark 3.9
  • Remark 3.10
  • Theorem 3.11: Annealed convergence
  • ...and 68 more