A conditional coalescent for diploid exchangeable population models given the pedigree
Frederic Alberti, Matthias Birkner, Wai-Tong Louis Fan, John Wakeley
TL;DR
This work analyzes gene genealogies conditional on a fixed population pedigree under the diploid Cannings model and shows that, in the large-$N$ limit, the quenched limiting process can differ dramatically from the marginal coalescent when multiple mergers are possible. The authors introduce an inhomogeneous $(\Psi,c)$-coalescent in which a Poisson point process $\Psi$ with intensity $dt\,\Xi(dx)/\langle x,x\rangle$ records the timing and size of the multiple mergers caused by generations with large individual progeny (GLIPs), while pairs of lineages also merge at the constant rate $c_{\text{pair}}=1-\Xi(\Delta\setminus\{0\})$; time is rescaled by $c_N$. The key methodological contributions are a coupling of two coalescents on the same pedigree and a coarse-graining argument that reduces the pedigree to a paintbox-driven mechanism for GLIPs, which together yield rigorous convergence to the inhomogeneous coalescent. The results show fundamental differences between quenched and annealed genealogies, with concrete implications for the site-frequency spectrum and multi-locus statistics. They are accompanied by a suite of examples (Wright–Fisher, random fitness, occasional large families) and by simulations illustrating pedigree-driven variation in genetic data. The findings bear on inference and simulation in populations with highly skewed reproductive success, and they extend prior work by treating arbitrary sample sizes and a full pedigree-conditioned limiting process. The framework also prepares the ground for extensions to recombination, sex structure, and forward-time duals.
Abstract
We study coalescent processes conditional on the population pedigree under the exchangeable diploid bi-parental population model of \citet{BirknerEtAl2018}. While classical coalescent models average over all reproductive histories, thereby marginalizing the pedigree, our work analyzes the genealogical structure embedded within a fixed pedigree generated by the diploid Cannings model. In the large-population limit, we show that these conditional coalescent processes differ significantly from their marginal counterparts when the marginal coalescent process includes multiple mergers. We characterize the limiting process as an inhomogeneous $(\Psi,c)$-coalescent, where $\Psi$ encodes the timing and scale of multiple mergers caused by generations with large individual progeny (GLIPs), and $c$ is a constant rate governing binary mergers. Our results reveal fundamental distinctions between quenched (conditional) and annealed (classical) genealogical models, demonstrate how the fixed pedigree structure impacts multi-locus statistics such as the site-frequency spectrum, and have implications for interpreting patterns of genetic variation among unlinked loci in the genomes of sampled individuals. They significantly extend the results of \citet{DiamantidisEtAl2024}, which considered a sample of size two under a specific Wright–Fisher model with a highly reproductive couple, and those of \citet{TyukinThesis2015}, where the Kingman coalescent was the limiting process. Our proofs adapt coupling techniques from the theory of random walks in random environments.
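
The quenched/annealed distinction for a sample of size two can be illustrated with a small simulation. The following is a toy sketch under simplifying assumptions, not the authors' construction: $\Xi$ is assumed to put mass `weight` on a single family size `x0`, so that events of $\Psi$ arrive at rate `weight / x0**2`; the per-event merging probability `x0**2` is a hypothetical stand-in for the diploid paintbox; and the function names, parameters and finite time `horizon` are all illustrative choices.

```python
import random


def sample_event_times(horizon, weight, x0, rng):
    """Event times of a toy Psi on [0, horizon].

    Toy assumption: Xi puts mass `weight` on a single family size x0, so
    events arrive at Poisson rate weight / x0**2 and all have size x0.
    """
    rate = weight / x0 ** 2
    times, t = [], rng.expovariate(rate)
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def pair_coalescence_time(event_times, c, merge_prob, rng):
    """Coalescence time of two lineages at one locus, given the event times.

    Between events the pair merges at constant rate c > 0; at each event it
    merges with probability merge_prob (a hypothetical stand-in for the
    paintbox). After the last sampled event only binary mergers remain.
    """
    t = 0.0
    for s in event_times:
        wait = rng.expovariate(c)  # memoryless, so redrawing at each event is valid
        if t + wait < s:
            return t + wait
        if rng.random() < merge_prob:
            return s
        t = s
    return t + rng.expovariate(c)


def simulate(n_loci, weight=0.5, x0=0.5, horizon=30.0, quenched=True, seed=1):
    """Pair coalescence times at n_loci unlinked loci.

    quenched=True reuses one realisation of the events for every locus;
    quenched=False redraws the events for each locus (annealed).
    """
    rng = random.Random(seed)
    c = 1.0 - weight      # toy analogue of c_pair = 1 - Xi(Delta \ {0})
    merge_prob = x0 ** 2  # assumed: both lineages fall into the large family
    shared = sample_event_times(horizon, weight, x0, rng)
    out = []
    for _ in range(n_loci):
        events = shared if quenched else sample_event_times(horizon, weight, x0, rng)
        out.append(pair_coalescence_time(events, c, merge_prob, rng))
    return out


if __name__ == "__main__":
    for label, q in (("quenched", True), ("annealed", False)):
        times = simulate(2000, quenched=q)
        print(label, "distinct coalescence times:", len(set(times)), "of", len(times))
```

In this toy setting the annealed pair coalescence time is exponential with rate `weight / x0**2 * x0**2 + c = 1`, whereas in the quenched run many loci share exactly the same coalescence times, namely the event times of the single realisation of $\Psi$. This is one way to see why unlinked loci are not independent once the pedigree is fixed.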
