Stochastic Kinetics of mRNA Molecules in a General Transcription Model

Yuntao Lu, Yunxin Zhang

TL;DR

The paper addresses stochastic transcription in a general multi-state gene framework by deriving an exact time-dependent solution to the chemical master equation using a matrix-valued generating function. Taking the long-time limit recovers the steady-state binomial moments, and two inequalities are proved that bound the binomial moments and the mRNA copy-number distribution. These inequalities show that the distribution is bounded above by a constant multiple of a Poisson distribution, which proves the main statement of the Heavy-Tailed Law. The authors validate the theory against the stochastic simulation algorithm (SSA) and finite state projection (FSP), recover the Telegraph model and renewal-condition Markovian models as special cases, and analyze numerical aspects including truncation errors. The result is a unified, computationally efficient approach with explicit error control, providing practical tools for studying transcriptional noise and guiding statistical inference and numerical methods in stochastic gene expression.

Abstract

Stochastic modeling of transcription is a classic yet long-standing problem in theoretical biophysics. The lack of unified results and a computationally efficient approach for a general, fine-grained transcription model has confined relevant research to some over-simplified special cases like the Telegraph model. This article establishes a general, unified and computationally efficient framework for studying stochastic transcription kinetics. We consider a chemical reaction model of transcription and construct the time-dependent solution to the corresponding chemical master equation. A well-known matrix-form expression for steady-state binomial moments is recovered by calculating the temporal limit of the time-dependent dynamics. Two novel inequalities for binomial moments and the probability mass function are derived using techniques from functional analysis. It follows that the distribution of mRNA counts is upper-bounded by a constant multiple of Poisson distribution, thus mathematically proving the main statement of the Heavy-Tailed Law. Additionally, the standard binomial moment method is analyzed from a numerical perspective, where truncation error is estimated using our inequalities. Compared with some widely-used numerical methods, a key advantage of this result is the significantly lower computational complexity.
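
The abstract refers to a matrix-form expression for steady-state binomial moments without displaying it. The following is a minimal sketch of how an expression of this type arises, under notation that is assumed here and not taken from the paper: $D_0$ collects gene-state transitions that produce no mRNA (with negative diagonal), $D_1$ collects transitions that produce one mRNA, $D=D_0+D_1$ is the generator of the gene-state chain with stationary row vector $\pi$, $\delta$ is the degradation rate, and $\mathbf{p}_m(t)$ is the row vector of joint probabilities of $m$ mRNA molecules and each gene state.

```latex
\begin{align}
\frac{d\mathbf{p}_m}{dt} &= \mathbf{p}_m D_0 + \mathbf{p}_{m-1} D_1
   + \delta (m+1)\,\mathbf{p}_{m+1} - \delta m\,\mathbf{p}_m, \\
\mathbf{G}(u,t) &:= \sum_{m\ge 0} \mathbf{p}_m(t)\,(1+u)^m
   = \sum_{k\ge 0} \mathbf{b}_k(t)\,u^k, \\
\partial_t \mathbf{G} &= \mathbf{G}\,(D + u D_1) - \delta u\,\partial_u \mathbf{G}, \\
\frac{d\mathbf{b}_k}{dt} &= \mathbf{b}_k\,(D - k\delta I) + \mathbf{b}_{k-1} D_1, \\
\mathbf{b}_k &= \mathbf{b}_{k-1} D_1 (k\delta I - D)^{-1}
   \quad (t\to\infty), \qquad \mathbf{b}_0=\pi, \qquad B_k=\mathbf{b}_k\mathbf{1}.
\end{align}
```

Under these assumptions the steady-state binomial moments $B_k$ follow from one linear solve per order $k$, each of the size of the gene-state space, which is consistent with the low computational cost claimed in the abstract.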

Paper Structure

This paper contains 32 sections, 1 theorem, 82 equations, 5 figures.

Key Result

THEOREM 1

Let $C=(c_{i,j})_{N\times N}$ be a strictly row diagonally dominant matrix with $\Theta:=\min_{1\leq i\leq N}\left(\lvert c_{i,i}\rvert-\sum_{j\neq i}\lvert c_{i,j}\rvert\right)$. Then $\lVert C^{-1}\rVert_\infty\leq \Theta^{-1}$.
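
The bound in Theorem 1 can be checked numerically. The sketch below is an illustration only: the helper name `varah_bound` and the $3\times 3$ test matrix are made up for this example and do not come from the paper.

```python
import numpy as np

def varah_bound(C):
    """Return (||C^{-1}||_inf, 1/Theta) for a strictly row diagonally dominant C."""
    C = np.asarray(C, dtype=float)
    diag = np.abs(np.diag(C))
    # Sum of absolute off-diagonal entries in each row.
    off = np.sum(np.abs(C), axis=1) - diag
    theta = np.min(diag - off)
    if theta <= 0:
        raise ValueError("matrix is not strictly row diagonally dominant")
    inv_norm = np.linalg.norm(np.linalg.inv(C), ord=np.inf)
    return inv_norm, 1.0 / theta

# Hypothetical test matrix: row-wise dominance margins are 2, 1.5 and 3,
# so Theta = 1.5 and the theorem predicts ||C^{-1}||_inf <= 2/3.
C_EXAMPLE = np.array([[4.0, -1.0, 1.0],
                      [0.5, 3.0, -1.0],
                      [-1.0, 1.0, 5.0]])

INV_NORM, BOUND = varah_bound(C_EXAMPLE)
print("||C^-1||_inf =", INV_NORM, " bound 1/Theta =", BOUND)
```

A matrix of the form $k\delta I-D$, where $D$ is a generator with zero row sums, has dominance margin exactly $k\delta$ in every row, which is one way a theorem of this kind can be applied to binomial-moment recursions.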

Figures (5)

  • Figure 1: Trajectories from Stochastic Simulation: 10 sample paths from stochastic simulation of the reaction system \ref{Reaction} are plotted. The parameters are set as $D_0=$$-2.110.10.00.00.010.1-2.610.50.00.010.10.2-5.40.00.10.10.10.3-3.50.00.10.10.10.1-100.4$, $D_1=$$001011100012110100110000100$ and $\delta=1$. The initial state of the system is $M(0)=0$ and $\mathcal{S}(0)=\mathcal{S}_1$. The bold dashed line is the deterministic description, i.e., the time-dependent solution of the reaction rate equation for \ref{Reaction} under the given initial condition (assuming the underlying Markov chain characterized by $D$ is in equilibrium). We sample $4000$ equally spaced points in the time interval $[0,40]$. The Python package GillesPy2 (Lett. Biomath., 2023) is used.
  • Figure 2: Upper Bound for Binomial Moments and Probability Distribution of mRNA Molecules: In the left panel, the binomial moments and their upper bound are computed according to \ref{binomial} and \ref{converge}, respectively. In the right panel, the upper bound of the probability mass function is given by \ref{Bound1}. Without loss of generality, we normalize all parameters by dividing by $\delta$. In both panels, the parameters are set as $D_0=\begin{pmatrix}-7&2&3&0\\3&-12&1&5\\1&2&-12&6\\8&8&0&-19\end{pmatrix}$ and $D_1=\begin{pmatrix}1&1&0&0\\1&0&1&1\\0&1&1&1\\1&1&0&1\end{pmatrix}$.
  • Figure 3: Probability Distribution of mRNA Counts through SSA, FSP, and Our Theoretical Results: Steady-state distributions of mRNA counts in four examples are computed using three different approaches. Each histogram is generated from $1\times10^5$ SSA trajectories, truncated at dimensionless time $t=40$. GillesPy2 is run with the C++ solver. The line plots are generated using FSP, where the truncation is chosen such that the truncation error is below $1\times10^{-5}$. The stem plots are generated according to the analytical results \ref{binomial} and \ref{distribution}. For the parameters of these four models, refer to the Jupyter Notebook ParametersFig3.ipynb. The four models are designed to represent progressively increasing complexity, with orders $3$, $5$, $10$, and $20$.
  • Figure 4: Computation Time Using Different Methods: Computation times are evaluated for a sequence of increasingly complex models using four different approaches. Parameters are given by $a_{i,j}=0$ $(i\neq j)$ and $b_{i,j}=i$. As the dimension of $D_0$ or $D_1$ increases, the number of reaction pathways grows quadratically, which increases the model scale. SSA is run for $1000$ trajectories truncated at dimensionless time $t=10$ (GillesPy2 with the Python solver); FSP is implemented such that the approximation error caused by truncating the CME is below $1\times10^{-5}$; the curves labeled Renewal Theory and Our Result show the computation times of the first $200$ binomial moments (starting from $B_1$) based on \ref{GIqueue} and \ref{binomial}, respectively. In the top left panel, all four approaches (SSA; FSP; Renewal Theory; Our Result) are applied to models with dimension $k$ $(1\leq k\leq 10)$; in the top right panel, three approaches (FSP; Renewal Theory; Our Result) are applied to models with dimension $k$ $(1\leq k\leq 20)$; in the bottom left panel, two approaches (Renewal Theory; Our Result) are applied to models with dimension $k$ $(1\leq k\leq 50)$; in the bottom right panel, only Our Result is applied to models with dimension $k$ $(1\leq k\leq 200)$. To mitigate randomness, each running time is averaged over three replicate computations.
  • Figure 5: Intrinsic Noise and Fano Factor: Intrinsic noise and the Fano factor are numerically evaluated for models of order $n$ $(1\leq n\leq 100)$, with the same parameters as in \ref{Time}. In the left panel, the intrinsic noise $\eta^2$ is computed based on \ref{cv} and plotted as $-\ln(\eta^2)$; in the right panel, the Fano factor is computed based on \ref{fano}.
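
The quantities in the Figure 2 caption can be approximated with a short script. This is a minimal sketch under stated assumptions, not the paper's implementation: the flattened entries in the Figure 2 caption are read row by row as $4\times4$ matrices, $D_0+D_1$ is taken to be the gene-state generator, $\delta=1$, and the recursion $\mathbf{b}_k(k\delta I-D)=\mathbf{b}_{k-1}D_1$ is the standard steady-state relation obtained from the chemical master equation rather than a transcription of the paper's equations. The function names and the truncation order `K` are illustrative choices.

```python
import numpy as np
from math import comb, factorial

# Assumed reading of the Figure 2 parameters (row-major 4x4 matrices).
D0 = np.array([[-7, 2, 3, 0], [3, -12, 1, 5],
               [1, 2, -12, 6], [8, 8, 0, -19]], dtype=float)
D1 = np.array([[1, 1, 0, 0], [1, 0, 1, 1],
               [0, 1, 1, 1], [1, 1, 0, 1]], dtype=float)
DELTA = 1.0

def stationary(D):
    """Stationary row vector pi of a generator D: pi D = 0, sum(pi) = 1."""
    n = D.shape[0]
    A = np.vstack([D.T, np.ones(n)])
    rhs = np.zeros(n + 1)
    rhs[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return pi

def binomial_moments(D0, D1, delta, K):
    """Steady-state binomial moments B_0..B_K via b_k (k*delta*I - D) = b_{k-1} D1."""
    D = D0 + D1
    n = D.shape[0]
    b = stationary(D)
    B = [b.sum()]
    for k in range(1, K + 1):
        # Row-vector equation solved in transposed form.
        b = np.linalg.solve((k * delta * np.eye(n) - D).T, D1.T @ b)
        B.append(b.sum())
    return np.array(B)

def pmf_from_moments(B, n_max):
    """P(n) = sum_{k>=n} (-1)^(k-n) C(k,n) B_k, truncated at k = len(B)-1."""
    K = len(B) - 1
    return np.array([sum((-1) ** (k - n) * comb(k, n) * B[k]
                         for k in range(n, K + 1))
                     for n in range(n_max + 1)])

K, N_MAX = 60, 20
B = binomial_moments(D0, D1, DELTA, K)
PMF = pmf_from_moments(B, N_MAX)

# Poisson-type envelope: with LAM = max row sum of D1 / delta, the recursion
# gives B_k <= LAM^k / k!, and P(n) <= B_n, so P(n) <= e^LAM * Poisson(n; LAM).
LAM = D1.sum(axis=1).max() / DELTA
ENVELOPE = np.array([LAM ** k / factorial(k) for k in range(K + 1)])

print("mean mRNA count B_1 =", B[1])
print("P(0..5) =", PMF[:6])
```

The envelope used here follows from $(k\delta I-D)\mathbf{1}=k\delta\mathbf{1}$ and the nonnegativity of $\mathbf{b}_{k-1}$; the constants in the paper's own bounds may differ. Because the moments decay at least as fast as $\lambda^k/k!$, truncating the alternating sum at a moderate order leaves a small, controllable error.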
