Empirical distribution of ancestral lineages in populations with density-dependent interactions
Madeleine Kubasch
TL;DR
This work develops a time-inhomogeneous spine process for density-dependent, multi-type population processes, yielding a many-to-one formula for the empirical distribution of ancestral lineages under uniform sampling. The proposed spine avoids stochastic exponential weighting and provides a direct interpretation of survivorship bias via the auxiliary process, with exact results in the finite-population setting and a rigorous large-population limit. In the large-population regime, the population composition converges to a deterministic limit, and the corresponding spine comes with a quantified $O(1/\sqrt{K})$ approximation error for sampling in the limit rather than in the finite system. The framework enables practical computation of lineage statistics and paves the way for extensions to whole-tree genealogies and multi-sample spine constructions, as well as to longer time horizons through controlled coupling arguments.
Abstract
We study a density-dependent Markov jump process describing a population in which each individual is characterized by a type and reproduces at a rate depending both on its type and on the population type distribution. We are interested in the empirical distribution of ancestral lineages in the population process. First, we exhibit a time-inhomogeneous Markov process which captures the behavior of a sampled lineage in the population process. This is achieved through a many-to-one formula, which relates the expected value of a functional evaluated over the lineages in the population process to the expectation of the same functional evaluated along this time-inhomogeneous process. This provides a direct interpretation of the underlying survivorship bias, as illustrated on a minimalistic population process. Second, we consider the large population regime, in which the population size grows to infinity. Under classical assumptions, the population type distribution converges to a deterministic limit. Here, we focus on the empirical distribution of ancestral lineages in this large population limit, for which we establish a many-to-one formula. Using coupling arguments, we further quantify the approximation error which arises when sampling in this large population approximation instead of the finite-size population process.
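To make the survivorship bias concrete, the following is a minimal simulation sketch and not the construction used in this work. It assumes a hypothetical two-type birth-death process with logistic competition: the rates `birth`, `death`, `competition`, the mutation probability `mutation`, and the function names are all illustrative choices. Each individual carries its current type together with the type of its ancestor at time 0, so the empirical distribution of ancestral types at time `T` can be read off directly and compared with the initial type distribution.

```python
import random


def simulate_lineages(K=100, T=5.0, birth=(1.0, 2.0), death=(0.5, 0.5),
                      competition=1.0, mutation=0.02, seed=0):
    """Gillespie simulation of a hypothetical two-type population.

    Returns the population at time T as a list of pairs
    (current_type, ancestral_type_at_time_0).
    """
    rng = random.Random(seed)
    # Start with K individuals, half of each type; each is its own ancestor.
    pop = [(x, x) for x in (0, 1) for _ in range(K // 2)]
    t = 0.0
    while pop:
        n = len(pop)
        # Per-capita event rate: birth plus density-dependent death.
        rates = [birth[x] + death[x] + competition * n / K for x, _ in pop]
        total = sum(rates)
        t += rng.expovariate(total)  # waiting time until the next event
        if t > T:
            break
        # Pick the individual concerned, proportionally to its event rate.
        i = rng.choices(range(n), weights=rates)[0]
        x, ancestor = pop[i]
        if rng.random() < birth[x] / rates[i]:
            # Birth: the child may mutate but keeps the time-0 ancestor.
            child_type = 1 - x if rng.random() < mutation else x
            pop.append((child_type, ancestor))
        else:
            # Death: swap with the last individual and remove it.
            pop[i] = pop[-1]
            pop.pop()
    return pop


def ancestral_type_distribution(pop):
    """Empirical distribution of time-0 ancestral types under uniform sampling."""
    n = len(pop)
    if n == 0:
        return None  # the population went extinct before time T
    frac1 = sum(ancestor for _, ancestor in pop) / n
    return (1.0 - frac1, frac1)


if __name__ == "__main__":
    final_pop = simulate_lineages(seed=1)
    print("initial type distribution:   (0.5, 0.5)")
    print("ancestral type distribution:", ancestral_type_distribution(final_pop))
```

With these assumed rates, type 1 reproduces faster, so individuals sampled at time `T` are expected to descend disproportionately from type-1 ancestors even though both types start in equal proportion. This gap between the initial distribution and the ancestral one is the survivorship bias that a many-to-one formula accounts for.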
