Stochastic Models for Replication Origin Spacings in Eukaryotic DNA Replication

Huw Day; N C Snaith

TL;DR

This work analyzes the spatial-temporal distribution of replication origins in eukaryotes by introducing a simple 2D Poisson exclusion model that identifies active origins via empty backward light cones, yielding origin repulsion and a tail decay in spacings faster than exponential. It connects this minimal model to the Kolmogorov–Johnson–Mehl–Avrami framework and to Polynuclear Growth models, deriving a closed-form nearest-neighbour spacing density and the mean density of active points, with exact asymptotics for small and large spacings. Comparative data from multiple organisms show qualitative agreement with the predicted repulsion and tail behavior, though inhomogeneous placements may be required for some species. The work also draws links to integrable models, including directed polymers on Poisson points and PNG geometry, highlighting broad mathematical connections and potential for exact results in related growth processes.

Abstract

Replication of genetic material is an important process for all living organisms. Origins of replication initiate the copying of DNA at many points on a chromosome, and it is the distribution of these points that is relevant here, as it presents us with an interesting stochastic process. It was observed by Newman et al. that for various types of yeast cells, there were fewer very small inter-origin spacings, and fewer very large inter-origin spacings in the replication origin data than would be expected if the origins were uncorrelated, random points. We propose a very simple stochastic model for DNA replication and determine that this probabilistic process produces replication origins that display repulsion between origins and relative scarcity of large spacings. We detail some connections between this model and existing polynuclear or polymer growth models.

Paper Structure (15 sections, 8 theorems, 63 equations, 19 figures)


Introduction to Eukaryotic DNA Replication
Kolmogorov-Johnson-Mehl-Avrami (KJMA) Model
2D Poisson Process Exclusion Model
Distinguishing Active and Passive Points
Nearest Neighbour Spacing Calculation for the Borderless Exclusion Model
Deriving the Nearest Neighbour Spacing Density
Mean density of active points
Behaviour of large and small spacings
Comparison with replication data
Links between Exclusion Model and Polynuclear Growth Models
Polynuclear Growth Models
Geometric Perspective of PNG and Links with our Exclusion Model
Flat PNG and PNG Droplet
Links with Related Integrable Models and the Exclusion Model on a Triangle
Directed polymers on Poisson points

Key Result

Theorem 2.2

Consider a homogeneous 2D Poisson point process of unit intensity on $\mathbb{R}\times\mathbb{R}_{+}$. We apply the exclusion model described in Section sect:model, defining active points to be those whose backward light cone (bounded by lines of slope $\pm 1$) is empty of other Poisson points. $\rho$ … The probability density given in (eq:noboundary) is plotted numerically in Figure fig:checkingnns.
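The exclusion rule in the theorem can be explored numerically. The following is a minimal sketch, not the authors' code: it samples a unit-intensity Poisson process on a finite window, marks a point as active when no other point lies strictly inside its backward light cone, and returns the gaps between consecutive active points. The function names, the window size `length`, the time cutoff `t_max`, and the edge-trimming rule are all assumptions made for illustration. Treating "nearest neighbour spacing" as the gap between consecutive active points is also an assumption about the paper's definition.

```python
import numpy as np

def active_mask(x, t):
    """True for points whose backward light cone (slopes +/-1) holds no other point."""
    active = np.ones(len(x), dtype=bool)
    for i in range(len(x)):
        dt = t[i] - t  # positive for points that occur earlier than point i
        inside = (dt > 0) & (np.abs(x - x[i]) < dt)
        active[i] = not inside.any()
    return active

def simulate_spacings(length=600.0, t_max=4.0, seed=0):
    """Gaps between consecutive active points and their empirical density.

    Assumption: truncating time at t_max is harmless because a point at
    height t is active with probability exp(-t**2), which is tiny for t >= 4.
    """
    rng = np.random.default_rng(seed)
    n = rng.poisson(length * t_max)      # unit intensity on the window
    x = rng.uniform(0.0, length, n)
    t = rng.uniform(0.0, t_max, n)
    keep = active_mask(x, t)
    # Drop points within t_max of either edge: their cones can leave the window.
    interior = keep & (x > t_max) & (x < length - t_max)
    xs = np.sort(x[interior])
    density = len(xs) / (length - 2 * t_max)
    return np.diff(xs), density

spacings, density = simulate_spacings()
print(f"active points per unit length: {density:.3f}")
print(f"fraction of spacings below 0.1: {(spacings < 0.1).mean():.3f}")
```

As a sanity check on the sketch, a point at height $t$ has a backward cone of area $t^2$ inside the half-plane, so it is active with probability $e^{-t^2}$, and integrating over $t$ suggests roughly $\sqrt{\pi}/2 \approx 0.886$ active points per unit length. This is a back-of-envelope estimate made here, not a value quoted from the paper. Uncorrelated points at the same density would place about 8% of their spacings below 0.1, so a much smaller simulated fraction is consistent with the repulsion described in the abstract.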

Figures (19)

Figure 1: A schematic representation of the system we will attempt to model. This is inevitably an oversimplification of one part of the entire process of DNA replication. The basis of this description is taken from Newman. Chromosomes in eukaryotes can be well modelled by a line and replication origins by small intervals on the line (in our calculations we model them as single points, but for clarity the diagram shows them as red circles). In the first diagram we see licensed replication origins spaced out along the chromosome. Once replication begins, replication origins trigger at various times, each sending out a bidirectional replication fork which reads outwards along the chromosome. The blue line is replicated DNA and the black line is not. In the second diagram, all but the second origin have triggered. Eventually, the replication forks 'read' the entire chromosome, represented in the third diagram by the whole of the black line being covered by blue arrows. Note also that the second origin never triggered, because replication forks from neighbouring origins reached it first. For our mathematical model, we will focus on the origins that trigger before they are read over, and we will name such points 'active' points.
Figure 2: This Figure is Figure 3A from Newman. Original caption: "Inter-origin spacings in the S. cerevisiae genome. (A) Interorigin spacings in S. cerevisiae were calculated and assigned to different 1 kb bins. The frequency of origins in each bin is shown. Red dots: mean origin separation in a computer simulation where the same number of origins were placed at random on the whole S. cerevisiae genome. Grey dots: mean origin separation in a computer simulation where the same number of origins were placed at random only in the intergenic regions of the S. cerevisiae genome"
Figure 3: A (red) tray of (dark blue) water placed in sub-zero temperature will begin freezing in different places at different times. Freezing (denoted with a lighter blue) will spread outwards over time until all of the water is frozen. We see just water in 1), two points of nucleation/freezing occurring after some length of time in 2), and in 3) we see coalescence, where two ice fronts meet after more time has passed.
Figure 4: Figure taken from DNANuc2. Original caption: "Mapping DNA replication onto the one-dimensional KJMA model." We see a clear correspondence between the nucleation/freezing in KJMA and the initiation/firing of replication origins in DNA replication. On the left side, the "eye", where the line is doubled, indicates where DNA has already replicated.
Figure 5: A basic realisation of the 2D Poisson process with $3$ points (marked by purple dots). Backward light cones are marked in red, forward light cones are marked in blue. In this instance we see that the highest point $(x_{1}, t_{1})$ contains the other $2$ points in its backward cone, which means it would be made passive by $(x_{2}, t_{2})$ even without the presence of $(x_{3}, t_{3})$. Similarly, $(x_{2}, t_{2})$ is made passive by the lowest point. We can say that the lowest point $(x_{3}, t_{3})$ is active and in this case makes the other $2$ points passive. Note that the angle at the top of a red triangle is 90 degrees if we assume that the replication fork proceeds with speed 1.
...and 14 more figures
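
The three-point configuration described in the Figure 5 caption can be checked directly. The sketch below is illustrative only: the coordinates are hypothetical values chosen to mimic the layout in the caption (they are not taken from the paper), and `in_backward_cone` and `classify` are names invented here. Fork speed 1 is assumed, so the cone edges have slope $\pm 1$.

```python
def in_backward_cone(apex, other):
    """True if `other` lies strictly inside the backward light cone of `apex`."""
    (xa, ta), (xo, to) = apex, other
    return to < ta and abs(xo - xa) < ta - to

def classify(points):
    """Label each point 'active' if its backward cone is empty, else 'passive'."""
    status = {}
    for name, p in points.items():
        blocked = any(
            in_backward_cone(p, q) for other, q in points.items() if other != name
        )
        status[name] = "passive" if blocked else "active"
    return status

# Hypothetical (x, t) coordinates: p1 is highest, p3 is lowest.
points = {"p1": (0.0, 3.0), "p2": (0.5, 2.0), "p3": (0.2, 0.5)}
print(classify(points))
```

With these coordinates the highest point has both other points in its backward cone, the middle point has the lowest point in its cone, and only the lowest point is active, matching the situation the caption describes.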

Theorems & Definitions (20)

Definition 2.1: Forward and Backward Light Cones PNG
Theorem 2.2: Nearest Neighbour Spacing for Active Points on a Borderless Exclusion Model
proof
Lemma 2.3
proof
Corollary 2.4: Local Linear Repulsion of Active Points in the Borderless Exclusion Model
proof
Corollary 2.5: Behaviour of Large Spacings Between Active Points in the Borderless Exclusion Model
proof
Definition 3.1: Flat PNG
...and 10 more
