Conditional gene genealogies given the population pedigree for a diploid Moran model with selfing
Maximillian Newman, John Wakeley, Wai-Tong Louis Fan
TL;DR
This work develops a conditional coalescent framework for a diploid Moran population with selfing, demonstrating that conditioning on the population pedigree yields three distinct limiting regimes as population size grows: negligible outcrossing, limited outcrossing described by an ancestral graph and random-walk meeting times, and partial selfing where familiar Kingman-like dynamics re-emerge. The authors introduce a detailed Moran-based pedigree model, derive the unconditional distribution of pairwise coalescence times, and prove conditional limit theorems for two-gene samples under each regime, including both samples from different individuals and from the same individual. They further extend the analysis to larger samples via conjectured limits and quantify how pedigree structure induces variance components and covariances in coalescence times, linking these to identity disequilibrium and multi-locus variation. The results emphasize that pedigree-informed coalescent models can capture genome-wide heterogeneity in genealogies and offer a principled alternative to pedigree-averaged coalescents for interpreting multi-locus data in populations with substantial selfing.
Abstract
We introduce a stochastic model of a population with overlapping generations and arbitrary levels of self-fertilization versus outcrossing. We study how the global graph of reproductive relationships, or population pedigree, influences the genealogical relationships of a sample of two gene copies at a genetic locus. Specifically, we consider a diploid Moran model with constant population size $N$ over time, in which a proportion of offspring are produced by selfing. We show that the conditional distribution of the pairwise coalescence time at a single locus given the random pedigree converges to a limit law as $N$ tends to infinity. The distribution of coalescence times obtained in this way predicts variation among unlinked loci in a sample of individuals. Traditional coalescent analyses implicitly average over pedigrees and generally make different predictions. We describe three different behaviors in the limit depending on the relative strengths, from large to small, of selfing versus outcrossing: partial selfing, limited outcrossing, and negligible outcrossing. In the case of partial selfing, coalescence times are related to the Kingman coalescent, similar to what is found in traditional analyses. In the case of limited outcrossing, the retained pedigree information forms a random graph, with coalescence times given by the meeting times of random walks on this graph. In the case of negligible outcrossing, which represents complete or nearly complete selfing, coalescence times are determined entirely by the fixed times to common ancestry of diploid individuals in the pedigree.
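The distinction the abstract draws between conditioning on the pedigree and averaging over pedigrees can be illustrated with a small simulation. The following is a minimal sketch, not the authors' construction: it assumes one individual is replaced per time step, that a selfing probability `selfing_prob` applies independently to each birth, and that parents are drawn uniformly from the population alive just before the birth. All function and parameter names are hypothetical. The pedigree is drawn once and then held fixed, while Mendelian inheritance is re-sampled for each "locus", mimicking unlinked loci that share one pedigree.

```python
import random
from statistics import mean


def extend_pedigree(pedigree, n, selfing_prob, rng):
    """Append one birth event, one step further back in time.

    ASSUMPTION (not from the paper): one individual is replaced per step,
    and parents are drawn uniformly from the n individuals alive just
    before the birth. Requires n >= 2.
    """
    child = rng.randrange(n)
    parent_a = rng.randrange(n)
    if rng.random() < selfing_prob:
        parent_b = parent_a  # selfing: one parent supplies both gene copies
    else:
        # outcrossing: the second parent is a different individual
        parent_b = rng.choice([i for i in range(n) if i != parent_a])
    pedigree.append((child, parent_a, parent_b))


def coalescence_time(pedigree, start, n, selfing_prob, ped_rng, gene_rng):
    """Number of steps back until two gene copies coalesce, given the pedigree.

    `start` holds two (individual, copy) pairs with copy in {0, 1}.
    The pedigree list is extended lazily with `ped_rng`, so repeated calls
    on the same list reuse every event that was already drawn.
    """
    lineages = list(start)
    t = 0
    while lineages[0] != lineages[1]:
        if t == len(pedigree):
            extend_pedigree(pedigree, n, selfing_prob, ped_rng)
        child, parent_a, parent_b = pedigree[t]
        t += 1
        for k, (ind, copy) in enumerate(lineages):
            if ind == child:
                # Copy 0 descends from parent_a and copy 1 from parent_b;
                # Mendelian inheritance picks one of that parent's two copies.
                parent = parent_a if copy == 0 else parent_b
                lineages[k] = (parent, gene_rng.randrange(2))
    return t


def conditional_times(n, selfing_prob, n_loci, ped_seed, gene_seed):
    """Coalescence times at n_loci unlinked loci sharing one fixed pedigree."""
    pedigree = []
    ped_rng = random.Random(ped_seed)
    gene_rng = random.Random(gene_seed)
    start = [(0, 0), (1, 0)]  # one gene copy from each of two individuals
    return [
        coalescence_time(pedigree, start, n, selfing_prob, ped_rng, gene_rng)
        for _ in range(n_loci)
    ]


if __name__ == "__main__":
    # Same model parameters, three different pedigrees: the per-pedigree
    # means generally differ from one another.
    for ped_seed in (1, 2, 3):
        times = conditional_times(10, 0.9, 200, ped_seed, gene_seed=0)
        print(f"pedigree {ped_seed}: mean coalescence time {mean(times):.1f} steps")
```

Pooling the times from many independently drawn pedigrees would approximate the pedigree-averaged distribution that traditional coalescent analyses describe, whereas the times from a single call to `conditional_times` approximate the conditional distribution for one pedigree. At small `n` the two can differ noticeably. The sketch makes no attempt to reproduce the scaling limits stated in the abstract.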
