Quenched coalescent for diploid population models with selfing and overlapping generations
Louis Wai-Tong Fan, Maximillian Newman, John Wakeley
TL;DR
The paper develops a quenched coalescent framework for diploid populations with selfing and overlapping generations by conditioning genealogies on a single population pedigree. It introduces a general diploid exchangeable model, constructs a latent pedigree, and proves that, under high selfing, the conditioned ancestral process converges to coalescing random walks on a random Q-λ graph embedded in an exchangeable fragmentation-coalescence process with measure Ξ. The main result shows the quenched law converges to a random object L_Π^n describing coalescing walks on a pedigree-driven graph, enabling analysis of the site-frequency spectrum conditional on pedigree and highlighting robustness of binary-merger limits. The framework yields applications to robustness across Wright-Fisher and Moran-type demographies, convergence of tree statistics and SFS under nearly complete selfing, and extensions to mixed demographies, while also providing a general theory for weak convergence of random measures on Skorokhod spaces.
Abstract
We introduce a general diploid population model with self-fertilization and possibly overlapping generations, and study the genealogy of a sample of $n$ genes as the population size $N$ tends to infinity. Unlike the traditional approach in coalescent theory, which considers the unconditional (annealed) law of the gene genealogies averaged over the population pedigree, we study the conditional (quenched) law of the gene genealogies given the pedigree. We focus on the case of high selfing probability and show that this conditional law converges to a random probability measure, given by the random law of a system of coalescing random walks on an exchangeable fragmentation-coalescence process of \cite{berestycki04}. This system contains the system of coalescing random walks on the ancestral recombination graph as a special case, and it sheds new light on the site-frequency spectrum (SFS) of genetic data by specifying how the SFS depends on the pedigree. The convergence result is proved by means of a general characterization of weak convergence for random measures on the Skorokhod space with paths taking values in a locally compact Polish space.
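The distinction between the quenched and annealed laws can be written out schematically as follows. The symbols $\mathcal{A}^{(N)}$ (the pedigree of a population of size $N$) and $\mathcal{G}^{(N),n}$ (the genealogy of the $n$ sampled genes) are notation assumed here for illustration only and need not match the notation of the paper.

\[
\underbrace{\mathbb{P}\bigl(\mathcal{G}^{(N),n} \in \cdot \,\big|\, \mathcal{A}^{(N)}\bigr)}_{\text{quenched law (a random probability measure)}}
\qquad \text{versus} \qquad
\underbrace{\mathbb{P}\bigl(\mathcal{G}^{(N),n} \in \cdot\bigr)
= \mathbb{E}\Bigl[\mathbb{P}\bigl(\mathcal{G}^{(N),n} \in \cdot \,\big|\, \mathcal{A}^{(N)}\bigr)\Bigr]}_{\text{annealed law (a deterministic probability measure)}}
\]

The quenched law is a function of the random pedigree and is therefore itself random, so its limit as $N \to \infty$ can be a random probability measure. The annealed law is recovered from it by averaging over the pedigree, which follows from the tower property of conditional expectation.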
