Weak convergence of the scaled jump chain and number of mutations of the Kingman coalescent

Martina Favero, Henrik Hult

TL;DR

This work analyzes the large-sample asymptotics of the Kingman coalescent with a finite-allele mutation scheme under neutrality by studying the joint block-counting and mutation-counting process $\mathbf{Z}^{(n)}=(\mathbf{Y}^{(n)},\mathbf{M}^{(n)})$. The authors prove that, after appropriate scaling, $\mathbf{Z}^{(n)}$ converges to a deterministic path $\mathbf{Y}(s)$ together with independent time-inhomogeneous Poisson mutation-counting processes $\mathbf{M}$ with intensities $\lambda_{ij}(\mathbf{Y}(s))= \frac{\theta P_{ij} Y_i(s)}{\|\mathbf{Y}(s)\|_1^2}$, first under parent-independent mutation (PIM) and then for general mutation via a novel change-of-measure technique. The change-of-measure framework uses Radon–Nikodym factors $r^{(n)}_{P,Q}$ and $c_{P,Q}$ to transfer convergence results from the PIM setting to the general mutation setting, supported by a technical Ethier–Kurtz–type framework to handle explosion near the boundary. The results provide a rigorous basis for large-sample inference in population genetics with general mutation schemes and illuminate the joint behavior of lineage counts and mutation counts in the Kingman coalescent limit.
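The intensity formula above can be made concrete with a short numerical sketch. The code below is only an illustration: the function names, the toy path `toy_path`, and the rate bound are assumptions introduced here, not objects from the paper. The thinning sampler is the standard Lewis–Shedler method, used as a generic way to draw a time-inhomogeneous Poisson process, and is not claimed to be the authors' construction.

```python
import random


def mutation_intensities(y, theta, P):
    """Return the matrix lambda_ij(y) = theta * P[i][j] * y[i] / ||y||_1^2."""
    norm1 = sum(abs(v) for v in y)
    if norm1 == 0:
        raise ValueError("intensity is undefined at y = 0")
    d = len(y)
    return [[theta * P[i][j] * y[i] / norm1 ** 2 for j in range(d)]
            for i in range(d)]


def sample_inhomogeneous_poisson(rate, t_end, rate_bound, rng):
    """Event times on [0, t_end] of a Poisson process with intensity rate(s).

    Uses thinning: propose points at constant rate `rate_bound` (which must
    dominate rate(s) on [0, t_end]) and keep each one with probability
    rate(s) / rate_bound.
    """
    times = []
    s = 0.0
    while True:
        s += rng.expovariate(rate_bound)
        if s > t_end:
            return times
        if rng.random() * rate_bound < rate(s):
            times.append(s)


if __name__ == "__main__":
    theta = 4.0
    P = [[0.5, 0.5], [0.5, 0.5]]

    # Hypothetical placeholder path, NOT the limiting path Y from the paper:
    # it only serves to make the intensity time-dependent.
    def toy_path(s):
        return [0.4 * (1.0 - s), 0.6 * (1.0 - s)]

    # Intensity of mutations from type 1 to type 2 along the toy path.
    def rate_12(s):
        return mutation_intensities(toy_path(s), theta, P)[0][1]

    # On [0, 0.5] this rate equals 0.8 / (1 - s), so 1.6 is a valid bound.
    rng = random.Random(0)
    print(sample_inhomogeneous_poisson(rate_12, 0.5, 1.6, rng))
```

Because $\lambda_{ij}$ is divided by $\|\mathbf{Y}(s)\|_1^2$, the rate grows without bound as the path approaches the origin, so any such sampler needs a time horizon that stays away from that point for the bound `rate_bound` to exist.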

Abstract

The Kingman coalescent is a fundamental process in population genetics modelling the ancestry of a sample of individuals backwards in time. In this paper, in a large-sample-size regime, we study asymptotic properties of the coalescent under neutrality and a general finite-alleles mutation scheme, i.e. including both parent independent and parent dependent mutation. In particular, we consider a sequence of Markov chains that is related to the coalescent and consists of block-counting and mutation-counting components. We show that these components, suitably scaled, converge weakly to deterministic components and Poisson processes with varying intensities, respectively. Along the way, we develop a novel approach, based on a change of measure, to generalise the convergence result from the parent independent to the parent dependent mutation setting, in which several crucial quantities are not known explicitly.

Paper Structure

This paper contains 13 sections, 4 theorems, 68 equations, 1 figure.

Key Result

theorem 1

Let $\textbf{y}_0\in \mathbb{R}_+^d$ and $\textbf{y}_0^{(n)}\in\frac{1}{n}\mathbb{N}^d\setminus\{\boldsymbol{0}\}$, $n\in \mathbb{N}$. Assume $\textbf{Y}^{(n)}(0)= \textbf{y}_0^{(n)}$, and $\textbf{y}_0^{(n)} \to \textbf{y}_0$ as $n\to\infty$. Then, for all $t\in [0,\left\lVert\textbf{y}_0\right\rVert$ … and $\textbf{M}=(M_{ij})_{i,j=1}^d$, with $M_{ij}=\{M_{ij}(s)\}_{s\in[0,t]}$ being independent tim…

Figures (1)

  • Figure 1: A realisation of the Markov chains $\textbf{Y}^{(100)}$ (left) and $\textbf{Y}^{(1000)}$ (right) with starting point $\textbf{y}_0=(0.4,0.6)$. The limiting process $\textbf{Y}$ is represented by the blue line. Parameters: mutation rate $\theta=4$, mutation probabilities $Q_1=Q_2=0.5$, number of types $d=2$.

Theorems & Definitions (12)

  • theorem 1
  • theorem 2
  • proof
  • proposition 1
  • remark 1
  • lemma 1
  • proof
  • remark 2: Closed sets
  • remark 3: Compact sets
  • remark 4: Continuous functions
  • ...and 2 more